Implementing a K-Tree in Python: Step-by-Step Tutorial

A K-Tree is a generalized tree data structure where each node can have up to K children. Depending on K and how you use it, a K-Tree can model many structures: a binary tree (K=2), a ternary tree (K=3), an n-ary tree (K=n), or specialized indexed trees used in search, game trees, or spatial partitioning. This tutorial walks through implementing a small, flexible K-Tree in Python, covering design choices, basic operations (insert, search, delete, traversal), and known limitations, and ending with a short usage example.


Design choices and goals

Before writing code, decide what you need from your K-Tree:

  • Node capacity: a fixed maximum per node (in this guide, up to K keys and therefore up to K+1 children).
  • Storage: store keys (and optionally values) in nodes.
  • Ordering: is this an ordered tree (children in key order) or an unordered tree? This guide implements an ordered K-Tree where each node keeps up to K keys and up to K+1 children, similar to a B-tree-like arrangement but without rebalancing.
  • Simplicity vs. features: this tutorial focuses on clarity and core operations (insert without rebalancing, search, delete simple cases, traversals). For production-grade balanced trees use established B-tree/B+ tree implementations or libraries.
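To make the ordered arrangement concrete, here is a minimal sketch of how a node's sorted keys act as separators that route a lookup to one child. The helper name `child_index` is made up for illustration and is not part of the tutorial's code; it uses the standard `bisect` module rather than a hand-written scan.

```python
from bisect import bisect_left
from typing import Any, List


def child_index(keys: List[Any], key: Any) -> int:
    """Return the index of the child subtree that could contain `key`.

    `keys` must be sorted. The result is the number of keys strictly
    smaller than `key`, so it ranges from 0 to len(keys).
    """
    return bisect_left(keys, key)


separators = [10, 20]                 # one node's keys
print(child_index(separators, 5))     # 0: left of 10
print(child_index(separators, 15))    # 1: between 10 and 20
print(child_index(separators, 25))    # 2: right of 20
```

If `key` equals one of the separators, the returned index points at that key's position, so a caller would check `keys[i] == key` before descending.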

API overview

We’ll implement:

  • Node class: stores keys, values, children.
  • KTree class: root node, K parameter, methods:
    • insert(key, value=None)
    • search(key) -> value or None
    • delete(key)
    • traverse(order='inorder'|'preorder'|'postorder') -> generator
    • pretty_print() for visualization

This K-Tree stores multiple keys per node (up to K). It keeps keys in sorted order inside a node and maintains child pointers between keys, so each node can have up to K+1 children. Unlike a B-tree, this tutorial does not implement splitting or merging: an insertion descends into the appropriate child of an internal node, and an insertion that would overflow the target node raises a RuntimeError. This simplifies the implementation while still demonstrating K-Tree concepts.
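The overflow rule described above can be sketched on a single node's key and value lists. The function `bounded_insert` is a hypothetical stand-alone helper written for this illustration, not the tutorial's `insert` method; it assumes keys are mutually comparable.

```python
from typing import Any, List


def bounded_insert(keys: List[Any], values: List[Any], k: int,
                   key: Any, value: Any) -> None:
    """Insert into one node's sorted keys, refusing to exceed k keys."""
    pos = 0
    while pos < len(keys) and keys[pos] < key:
        pos += 1
    if pos < len(keys) and keys[pos] == key:
        values[pos] = value          # existing key: update in place
        return
    if len(keys) >= k:
        raise RuntimeError("node is full and splitting is not implemented")
    keys.insert(pos, key)
    values.insert(pos, value)


keys, values = [], []
for key, value in [(10, "ten"), (5, "five"), (20, "twenty")]:
    bounded_insert(keys, values, 3, key, value)
print(keys)                          # [5, 10, 20]

try:
    bounded_insert(keys, values, 3, 15, "fifteen")
except RuntimeError as exc:
    print("overflow:", exc)
```

Checking for an existing key before checking capacity means that updating a value never fails, even when the node is full.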


Node structure

Each node holds:

  • keys: list of keys (sorted)
  • values: list of values aligned with keys (optional)
  • children: list of child Node references (len(keys) + 1 entries for internal nodes, empty for leaves)
  • is_leaf: boolean flag
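The fields listed above imply a few structural rules that are easy to check mechanically. The following is a sketch only: `NodeSketch` and `check_node` are illustrative names rather than the tutorial's classes, and treating keys as strictly increasing (no duplicates inside a node) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class NodeSketch:
    keys: List[Any] = field(default_factory=list)
    values: List[Any] = field(default_factory=list)
    children: List["NodeSketch"] = field(default_factory=list)
    is_leaf: bool = True


def check_node(node: NodeSketch, k: int) -> None:
    """Raise ValueError if `node` breaks one of the structural rules."""
    if len(node.keys) > k:
        raise ValueError("too many keys")
    if len(node.values) != len(node.keys):
        raise ValueError("values are not aligned with keys")
    # Assumption: keys within a node are strictly increasing.
    if any(a >= b for a, b in zip(node.keys, node.keys[1:])):
        raise ValueError("keys are not sorted")
    expected_children = 0 if node.is_leaf else len(node.keys) + 1
    if len(node.children) != expected_children:
        raise ValueError("wrong number of children")


leaf = NodeSketch(keys=[5, 10], values=["five", "ten"])
check_node(leaf, k=3)   # passes silently
```

A checker like this can be called from tests after each insert or delete to catch a broken node early.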

Implementation

# ktree.py
from typing import Any, Generator, List, Optional


class KTreeNode:
    """A node holding up to k sorted keys, their values, and child links."""

    def __init__(self, k: int, is_leaf: bool = True):
        if k < 1:
            raise ValueError("k must be >= 1")
        self.k = k
        self.keys: List[Any] = []
        self.values: List[Any] = []
        self.children: List['KTreeNode'] = []
        self.is_leaf = is_leaf

    def __repr__(self):
        return f"KTreeNode(keys={self.keys}, leaf={self.is_leaf})"


class KTree:
    def __init__(self, k: int):
        if k < 1:
            raise ValueError("k must be >= 1")
        self.k = k
        self.root = KTreeNode(k, is_leaf=True)

    def search(self, key: Any) -> Optional[Any]:
        return self._search_node(self.root, key)

    def _search_node(self, node: KTreeNode, key: Any) -> Optional[Any]:
        # linear search within node keys (could be binary)
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and key == node.keys[i]:
            return node.values[i]
        if node.is_leaf:
            return None
        return self._search_node(node.children[i], key)

    def insert(self, key: Any, value: Any = None):
        root = self.root
        # Updating an existing key never overflows, so allow it even when full.
        if len(root.keys) < self.k or key in root.keys:
            self._insert_non_full(root, key, value)
        else:
            # for simplicity: do not implement splitting, raise an error
            raise RuntimeError("Root is full. This simple K-Tree implementation "
                               "does not support splits. Use larger k or implement splits.")

    def _insert_non_full(self, node: KTreeNode, key: Any, value: Any):
        # index of the first key that is >= key
        pos = 0
        while pos < len(node.keys) and node.keys[pos] < key:
            pos += 1
        if pos < len(node.keys) and node.keys[pos] == key:
            node.values[pos] = value  # replace existing
            return
        if node.is_leaf:
            node.keys.insert(pos, key)
            node.values.insert(pos, value)
            # no children to adjust for leaf
            return
        # internal node: descend into the child that covers this key
        child = node.children[pos]
        if len(child.keys) == self.k and key not in child.keys:
            # child full: a full implementation would split; here we raise
            raise RuntimeError("Child is full. This simple implementation does not support splits.")
        self._insert_non_full(child, key, value)

    def traverse(self, order: str = 'inorder') -> Generator[Any, None, None]:
        if order not in ('inorder', 'preorder', 'postorder'):
            raise ValueError("order must be 'inorder', 'preorder', or 'postorder'")
        yield from self._traverse_node(self.root, order)

    def _traverse_node(self, node: KTreeNode, order: str):
        if node.is_leaf:
            # a leaf has no children, so every order yields its values as stored
            yield from node.values
            return
        if order == 'preorder':
            # yield node values first
            yield from node.values
            for child in node.children:
                yield from self._traverse_node(child, order)
        elif order == 'inorder':
            # children and keys interleaved
            for i in range(len(node.keys)):
                yield from self._traverse_node(node.children[i], order)
                yield node.values[i]
            # last child
            if len(node.children) > len(node.keys):
                yield from self._traverse_node(node.children[-1], order)
        else:  # postorder
            for child in node.children:
                yield from self._traverse_node(child, order)
            yield from node.values

    def pretty_print(self):
        lines: List[str] = []
        self._pretty(self.root, prefix="", is_tail=True, lines=lines)
        for line in lines:
            print(line)

    def _pretty(self, node: KTreeNode, prefix: str, is_tail: bool, lines: List[str]):
        lines.append(f"{prefix}{'└── ' if is_tail else '├── '}{node.keys}")
        if not node.is_leaf:
            for i, child in enumerate(node.children):
                self._pretty(child, prefix + ("    " if is_tail else "│   "),
                             i == len(node.children) - 1, lines)

    # Simple delete: only supports removing from leaves (no rebalancing)
    def delete(self, key: Any) -> bool:
        return self._delete_from_node(self.root, key)

    def _delete_from_node(self, node: KTreeNode, key: Any) -> bool:
        # find key in this node
        for i, existing in enumerate(node.keys):
            if existing == key:
                if node.is_leaf:
                    node.keys.pop(i)
                    node.values.pop(i)
                    return True
                # complex case not implemented
                raise RuntimeError("Delete from internal nodes not implemented in this simple K-Tree.")
        if node.is_leaf:
            return False
        # descend to correct child
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        return self._delete_from_node(node.children[i], key)

Example usage

from ktree import KTree

kt = KTree(k=3)  # each node can hold up to 3 keys
kt.insert(10, "ten")
kt.insert(5, "five")
kt.insert(20, "twenty")

print(kt.search(10))   # -> "ten"
print(list(kt.traverse('inorder')))  # -> ["five", "ten", "twenty"]
kt.pretty_print()

Notes on limitations and extensions

  • This implementation intentionally omits node splitting and merging, so it is not a balanced tree and raises errors when nodes fill. Because insert never creates child nodes, the tree as written holds at most K keys, all in the root. For a production-ready K-Tree (or B-tree), implement splitting on insert and merging/borrowing on delete.
  • Searching within nodes uses a linear scan; for nodes with many keys, use binary search (the bisect module).
  • Consider generic comparisons, duplicate-key handling, concurrency control, persistence, and disk-backed storage for large datasets.
  • A natural next step is a full B-tree-style implementation with split/merge and a detailed complexity analysis.
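As a rough idea of the splitting step that the notes above leave out, the sketch below splits a full leaf's sorted items around the median, in the style of a B-tree. The names `Item` and `split_items` are hypothetical, the minimum of three items is an assumption chosen so that both halves are non-empty, and a real implementation would also have to redistribute children when splitting internal nodes.

```python
from typing import Any, List, Tuple

Item = Tuple[Any, Any]  # (key, value)


def split_items(items: List[Item]) -> Tuple[List[Item], Item, List[Item]]:
    """Split a full leaf's sorted items around the median.

    Returns (left, median, right): the median would move up into the
    parent, and left/right would become two sibling leaves.
    """
    if len(items) < 3:
        # Assumption for this sketch: both siblings must end up non-empty.
        raise ValueError("need at least 3 items to split around a median")
    mid = len(items) // 2
    return items[:mid], items[mid], items[mid + 1:]


left, median, right = split_items([(5, "five"), (10, "ten"), (20, "twenty")])
print(left)     # [(5, 'five')]
print(median)   # (10, 'ten')
print(right)    # [(20, 'twenty')]
```

After such a split, the parent gains one key and one child pointer, which is why a full parent may itself need splitting.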

This tutorial provides a clear, minimal K-Tree implementation for understanding the core concepts and experimenting with small datasets.
