Implementing a K-Tree in Python: Step-by-Step Tutorial
A K-Tree is a generalized tree data structure where each node can have up to K children. Depending on K and how you use it, a K-Tree can model many structures: a binary tree (K=2), a ternary tree (K=3), an n-ary tree (K=n), or specialized indexed trees used in search, game trees, or spatial partitioning. This tutorial walks through implementing a small, flexible K-Tree in Python, covering design choices, basic operations (insert, search, delete, traversal), performance considerations, and a short usage example.
Design choices and goals
Before writing code, decide what you need from your K-Tree:
- Node capacity: fixed maximum K children per node.
- Storage: store keys (and optionally values) in nodes.
- Ordering: the tree can be ordered (children in key order) or unordered. This guide implements an ordered K-Tree in which each node keeps up to K keys and up to K+1 children, so K here bounds the keys per node rather than the children. The layout is B-tree-like but without rebalancing.
- Simplicity vs. features: this tutorial focuses on clarity and core operations (insert without rebalancing, search, delete simple cases, traversals). For production-grade balanced trees use established B-tree/B+ tree implementations or libraries.
API overview
We’ll implement:
- Node class: stores keys, values, children.
- KTree class: root node, K parameter, methods:
  - insert(key, value=None)
  - search(key) -> value or None
  - delete(key)
  - traverse(order='inorder'|'preorder'|'postorder') -> generator
  - pretty_print() for visualization
This K-Tree stores multiple keys per node (up to K). It keeps keys in sorted order inside a node and maintains child pointers between keys, so each node can have up to K+1 children. Unlike a B-tree, this tutorial does not implement splitting or merging: an insertion descends through internal nodes to the matching child, and one that would overflow a node raises an exception instead of splitting it. This simplifies the implementation while still demonstrating K-Tree concepts.
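The "child pointers between keys" arrangement can be illustrated with a small routing helper. This is a minimal sketch rather than part of the tutorial's classes: `child_index` is a hypothetical name, and it assumes `bisect_left` from the standard library instead of the linear scan used in the implementation.

```python
from bisect import bisect_left
from typing import Any, List, Tuple


def child_index(keys: List[Any], key: Any) -> Tuple[bool, int]:
    """Return (found, index) for `key` against a node's sorted `keys`.

    If found, `index` is the position of the key inside the node.
    Otherwise `index` is the child slot to descend into: slot 0 covers keys
    smaller than keys[0], slot i covers keys between keys[i-1] and keys[i],
    and slot len(keys) covers keys larger than keys[-1].
    """
    i = bisect_left(keys, key)
    found = i < len(keys) and keys[i] == key
    return found, i


node_keys = [10, 20, 30]           # a node with 3 keys has 4 child slots
print(child_index(node_keys, 5))   # (False, 0)
print(child_index(node_keys, 20))  # (True, 1)
print(child_index(node_keys, 25))  # (False, 2)
print(child_index(node_keys, 99))  # (False, 3)
```

With K keys there are K+1 possible slots, which is where the K+1 child limit comes from.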
Node structure
Each node holds:
- keys: list of keys (sorted)
- values: list of values aligned with keys (optional)
- children: list of child Node references (len(keys) + 1 entries for internal nodes, empty for leaves)
- is_leaf: boolean flag
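The layout above implies a few invariants that are easy to check mechanically. The following is a rough sketch under stated assumptions: `SketchNode` and `check_node` are hypothetical names, and `is_leaf` is derived from the children list here rather than stored as a flag, which differs from the node class used in the implementation.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class SketchNode:
    """Stand-in for the tutorial's node (hypothetical, for illustration)."""
    keys: List[Any] = field(default_factory=list)
    values: List[Any] = field(default_factory=list)
    children: List["SketchNode"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        # Assumption: a node with no children is treated as a leaf.
        return not self.children


def check_node(node: SketchNode, k: int) -> None:
    """Raise AssertionError if `node` or any descendant breaks the layout."""
    assert len(node.keys) <= k, "too many keys"
    assert len(node.values) == len(node.keys), "values must align with keys"
    assert all(a < b for a, b in zip(node.keys, node.keys[1:])), "keys must be sorted"
    if not node.is_leaf:
        assert len(node.children) == len(node.keys) + 1, "need len(keys) + 1 children"
    for child in node.children:
        check_node(child, k)


leaf_a = SketchNode(keys=[1, 2], values=["a", "b"])
leaf_b = SketchNode(keys=[7], values=["g"])
root = SketchNode(keys=[5], values=["e"], children=[leaf_a, leaf_b])
check_node(root, k=3)  # passes silently for a well-formed tree
```

A check like this is mainly useful in tests, for example after assembling a multi-level tree by hand.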
Implementation
# ktree.py
from typing import Any, Generator, List, Optional


class KTreeNode:
    def __init__(self, k: int, is_leaf: bool = True):
        if k < 1:
            raise ValueError("k must be >= 1")
        self.k = k
        self.keys: List[Any] = []
        self.values: List[Any] = []
        self.children: List['KTreeNode'] = []
        self.is_leaf = is_leaf

    def __repr__(self):
        return f"KTreeNode(keys={self.keys}, leaf={self.is_leaf})"


class KTree:
    def __init__(self, k: int):
        if k < 1:
            raise ValueError("k must be >= 1")
        self.k = k
        self.root = KTreeNode(k, is_leaf=True)

    def search(self, key: Any) -> Optional[Any]:
        return self._search_node(self.root, key)

    def _search_node(self, node: KTreeNode, key: Any) -> Optional[Any]:
        # linear search within node keys (could be binary)
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and key == node.keys[i]:
            return node.values[i]
        if node.is_leaf:
            return None
        return self._search_node(node.children[i], key)

    def insert(self, key: Any, value: Any = None):
        # No splitting: raises RuntimeError if the target leaf is already full.
        self._insert_node(self.root, key, value)

    def _insert_node(self, node: KTreeNode, key: Any, value: Any):
        # find the first position whose key is >= the new key
        pos = 0
        while pos < len(node.keys) and node.keys[pos] < key:
            pos += 1
        if pos < len(node.keys) and node.keys[pos] == key:
            # existing key: replace its value, even if the node is full
            node.values[pos] = value
            return
        if not node.is_leaf:
            # descend into the child that covers this key range
            self._insert_node(node.children[pos], key, value)
            return
        if len(node.keys) >= self.k:
            # in a full implementation we'd split here; this version raises
            raise RuntimeError("Node is full. This simple K-Tree implementation "
                               "does not support splits. Use a larger k or implement splits.")
        node.keys.insert(pos, key)
        node.values.insert(pos, value)

    def traverse(self, order: str = 'inorder') -> Generator[Any, None, None]:
        if order not in ('inorder', 'preorder', 'postorder'):
            raise ValueError("order must be 'inorder', 'preorder', or 'postorder'")
        yield from self._traverse_node(self.root, order)

    def _traverse_node(self, node: KTreeNode, order: str):
        if node.is_leaf:
            # a leaf has no children, so every order yields its values as stored
            yield from node.values
            return
        if order == 'preorder':
            # node values first, then each subtree
            yield from node.values
            for child in node.children:
                yield from self._traverse_node(child, order)
        elif order == 'inorder':
            # children and values interleaved
            for i in range(len(node.keys)):
                yield from self._traverse_node(node.children[i], order)
                yield node.values[i]
            # last child
            if len(node.children) > len(node.keys):
                yield from self._traverse_node(node.children[-1], order)
        else:  # postorder
            for child in node.children:
                yield from self._traverse_node(child, order)
            yield from node.values

    def pretty_print(self):
        lines: List[str] = []
        self._pretty(self.root, prefix="", is_tail=True, lines=lines)
        for line in lines:
            print(line)

    def _pretty(self, node: KTreeNode, prefix: str, is_tail: bool, lines: List[str]):
        lines.append(f"{prefix}{'└── ' if is_tail else '├── '}{node.keys}")
        if not node.is_leaf:
            for i, child in enumerate(node.children):
                self._pretty(child, prefix + ("    " if is_tail else "│   "),
                             i == len(node.children) - 1, lines)

    # Simple delete: only supports removing from leaves (no rebalancing)
    def delete(self, key: Any) -> bool:
        return self._delete_from_node(self.root, key)

    def _delete_from_node(self, node: KTreeNode, key: Any) -> bool:
        # look for the key in this node
        for i, existing in enumerate(node.keys):
            if existing == key:
                if node.is_leaf:
                    node.keys.pop(i)
                    node.values.pop(i)
                    return True
                # complex case not implemented
                raise RuntimeError("Delete from internal nodes not "
                                   "implemented in this simple K-Tree.")
        if node.is_leaf:
            return False
        # descend to the correct child
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        return self._delete_from_node(node.children[i], key)
Example usage
from ktree import KTree

kt = KTree(k=3)  # each node can hold up to 3 keys
kt.insert(10, "ten")
kt.insert(5, "five")
kt.insert(20, "twenty")
print(kt.search(10))                 # -> "ten"
print(list(kt.traverse('inorder')))  # -> ["five", "ten", "twenty"]
kt.pretty_print()
Notes on limitations and extensions
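- This implementation intentionally omits node splitting and merging, so it is not a balanced tree and raises errors when nodes fill. Because insert never creates child nodes, a tree built only through insert stays a single root leaf holding at most K keys; the internal-node paths in search, traversal, and delete only come into play if you attach children by hand or add splitting. For a production-ready K-Tree (or B-tree), implement splitting on insert and merging/borrowing on delete.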
- Searching within a node uses a linear scan; when nodes hold many keys, use binary search (the bisect module) instead.
- Consider generic comparisons, duplicate-key handling, concurrency control, persistence, and disk-backed storage for large datasets.
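The splitting step mentioned in the first note can be sketched on plain lists. This is only an illustration under stated assumptions: `split_sorted_keys` is a hypothetical helper that handles keys alone, whereas a full implementation would also divide the values and children and push the median key into the parent node.

```python
from typing import Any, List, Tuple


def split_sorted_keys(keys: List[Any]) -> Tuple[List[Any], Any, List[Any]]:
    """Split a sorted key list around its median.

    Returns (left, median, right). In a B-tree style split, `left` and
    `right` would become two sibling nodes and `median` would move up
    into the parent.
    """
    if len(keys) < 3:
        raise ValueError("need at least 3 keys to split around a median")
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


left, median, right = split_sorted_keys([5, 10, 20])
print(left, median, right)  # [5] 10 [20]
```

Splitting a full root this way is what lets a B-tree grow in height, which is the step the simple insert above leaves out.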
This tutorial provides a clear, minimal K-Tree implementation for understanding the core concepts and experimenting with small datasets.