Database Internalsmodule 3 of 8
Module 03 ~18 min · indexing · live builder

The B+ Tree
Finding a needle fast

Scanning 10 million rows to find user #47,392,185 would take forever. A B+ tree turns that search into about 23 comparisons — like binary-searching a physical book, one page at a time.

📖 Hold this picture

A phone book with index tabs on the edge. The tabs (internal nodes) don't contain phone numbers — they just tell you which section to flip to. The numbers (values) live only on the alphabetized pages at the end (leaf nodes), which are stapled in order so you can flip to "Smith" and keep reading forward — Thompson, Turner, Vance — without ever going back through the tabs.

◆ The key insight

A B+ tree gives you O(log n) lookups AND fast range scans from a single structure. That's why every major database — Postgres, SQL Server, SQLite, MySQL — uses it for primary-key indexes. Not a binary tree, not a hash table, not a B-tree. A B+ tree.

Why should you care?

When an AI says "just add an index," understand the cost: indexes aren't free. They speed up reads but slow down writes — every insert updates the tree. Knowing the shape of the tree lets you reason about that trade-off instead of cargo-culting indexes everywhere.

01

First: why not just a sorted list?

Before we build the tree, let's see why the obvious ideas fall apart. We want three things at once: fast lookups, cheap inserts, and fast ranges — on data far too big to fit in memory, where every disk trip is precious. Here's the scorecard.

sorted array

Fast to read, deadly to write

Binary search finds a key in O(log n) — great. But inserting in the middle means shifting every later element over by one. On millions of on-disk rows, a single insert rewrites the whole file. Insert = O(n). Fatal.

binary search tree

Cheap inserts, but it tips over

A BST inserts cheaply — until keys arrive in order and it degenerates into a linked list: O(n) again. Worse, it holds one key per node, so a deep tree means one slow disk read per comparison. Far too many trips.

B+ tree

All three, at once

Always balanced, so lookups stay O(log n) no matter the insert order. Each node is wide — dozens of keys — so the tree is shallow and disk reads are few. Inserts are local. And linked leaves make ranges trivial. The winner.

◆ The trick in one sentence

A B+ tree stays perfectly balanced by growing upward from the root instead of downward from the leaves — so every key is always the same short distance from the top.

02

The anatomy of a B+ tree

Here is a small, real order-4 tree. Study its parts — every B+ tree on earth, including the billion-row ones, is built from exactly these pieces.

A small order-4 treeviolet = internal · aqua = leaf
1
Two kinds of node

Internal nodes (violet) hold only routing keys — signposts. Leaf nodes (aqua) hold the real keys and their values.

2
Keys are always sorted

Within every node, keys ascend left to right. A node with k routing keys has exactly k + 1 children — one for each gap between and around the keys.

3
Data lives only in leaves

You will never find a stored value in an internal node. The top of the tree is pure navigation; the bottom is pure data. (A routing key may repeat a leaf key as a signpost.)

4
Every leaf is the same depth

The tree is balanced by construction — no lopsided branches. That is what guarantees the same fast O(log n) cost for every key.

5
Leaves are linked in order

Each leaf points to its next neighbour (the coral arrows). Reach the first match, then walk this chain sideways — that is what makes range scans cheap.

6
"Order" = how wide a node can get

This tree is order-4: max 3 keys, 4 children per node. Exceed that and the node splits. Real databases use fanout in the hundreds, so the tree stays just 3–4 levels deep even with billions of rows.

03

How a search works

Searching is the easy operation — it shows how the signposts route you. The rule at each internal node: find the first key larger than your target and follow the pointer just before it. Step through a real lookup for the key 45.

Guided search · find 45

The same descent, written as plain steps:

Start at the root

Always. A search never moves sideways across a level — only down.

Pick a child by comparing

Compare your target to the node's routing keys, and follow the one pointer that bounds it. One comparison, one level down.

Repeat until you hit a leaf

Each hop halves (or better) the search space. A tree of millions is only a handful of hops deep.

Binary-search the leaf

Inside the final leaf, binary-search its handful of keys. Match → return the value. No match → the key isn't in the tree.

04

How an insert works — and the all-important split

Inserting is just "search for where the key belongs, then put it there." The interesting part is what happens when the target leaf is already full. That's the split — the single mechanism that keeps the whole tree balanced. Here it is in three frames:

1 · target leaf is full
3 keys = the max. Now 40 arrives…
2 · overflow: 4 > 3
Too many keys. The leaf must split.
3 · split & copy up
Two half-leaves; 30 copied up as a signpost.

Copy up vs. move up — the one subtlety. When a leaf splits, its separator key is copied upward (the value must stay in the leaf). When an internal node splits, the middle key moves up (no data to keep). Either way, if that push fills the parent, the parent splits too — and the cascade can reach the root, creating a new one. That is the only way the tree gets taller.

Now watch it accumulate. Step through inserting 10, 20, 30 … 100 into an empty order-4 tree and watch leaves fill, split, and finally push a new root into existence:

Guided insert · build a tree from scratch
05

How a delete works — borrowing and merging

Deleting is the mirror image of inserting. A split happens when a leaf gets too full; the opposite danger is a leaf getting too empty. Every leaf must keep at least ⌈order/2⌉ keys — for our order-4 tree, a minimum of 2. Drop below that and the tree has two ways to heal itself: borrow a key from a neighbour, or merge two leaves into one. Here's the decision in three frames:

1 · underflow: 1 < 2
Deleting 10 would leave this leaf with too few keys.
2a · a sibling can spare? borrow
Take one key from a rich neighbour; fix the separator.
2b · nobody can spare? merge
Fuse two thin leaves into one; the parent loses a key.

Borrow first, merge only if you must. Borrowing is cheaper — it touches three nodes and rewrites one separator. Merging removes a key from the parent, which can underflow it too. So the rule is: try to borrow from a sibling that has keys to spare; merge only when both neighbours are already at the minimum.

Now watch a merge cascade. Step through deleting keys from a full 3-level tree and watch leaves borrow, merge, and finally collapse the root — shrinking the tree from the top, the exact inverse of how a split grew it:

Guided delete · borrow, merge & shrink
06

Now you try: the live playground

You've seen every move — now drive it yourself. This is a fully working order-4 tree (max 3 keys, min 2 per leaf). Insert to watch leaves split and the tree grow; delete to watch leaves borrow, merge, and the tree shrink. Then search for a key, or run a range scan and watch it walk the linked leaves sideways. Try inserting 1, 2, 3, 4, 5… in order to force splits, then delete them back down to trigger merges.

Live B+ Tree
internal node (routing keys) leaf node (values) leaf-link pointer (next →) range-scan path
07

The shape of a node, in real code

BPlusTree/BPlusTreeNode.cs
public abstract class BPlusTreeNode
{
    public bool IsLeaf { get; protected set; }
    public List<object> Keys { get; protected set; }
}

public class BPlusTreeInternalNode : BPlusTreeNode
{
    public List<BPlusTreeNode> Children { get; }
}

public class BPlusTreeLeafNode : BPlusTreeNode
{
    public List<object?> Values { get; }
    public BPlusTreeLeafNode? Next { get; set; }
    public BPlusTreeLeafNode? Previous { get; set; }
}
Two kinds of node

Every node has Keys — the sorted values used to route or to find.

An internal node adds Children — pointers down. No values; it only routes.

A leaf node adds Values (the real data) plus Next and Previous — the staples that link leaves into a doubly-linked chain.

08

Search: walk down, then binary-search the leaf

BPlusTree/BPlusTree.cs — Search
public object? Search(object key)
{
    lock (_lockObject)
    {
        var leaf = FindLeafNode(key);
        int index = FindKeyIndex(leaf.Keys, key);

        if (index < leaf.Keys.Count &&
            _comparer.Compare(leaf.Keys[index], key) == 0)
        {
            return ((BPlusTreeLeafNode)leaf).Values[index];
        }
        return null;
    }
}
Two phases

FindLeafNode — start at the root and follow routing keys down to exactly one leaf. This is the "climb down the tabs" step.

FindKeyIndex — inside that leaf, binary-search the sorted keys.

If the key at that index matches, return its value; otherwise it isn't in the tree. The whole thing is locked so a concurrent split can't move the ground under us (Module 6).

09

Delete: remove, then heal the tree

The deletion you watched above is this exact code. Removing the key is the easy line; the interesting work is the underflow check and the Rebalance call beneath it.

BPlusTree/BPlusTree.cs — Delete
public bool Delete(object key)
{
    lock (_lockObject)
    {
        var leaf = FindLeafNode(key);
        int index = FindKeyIndex(leaf.Keys, key);

        if (index < leaf.Keys.Count &&
            _comparer.Compare(leaf.Keys[index], key) == 0)
        {
            leaf.Keys.RemoveAt(index);
            leaf.Values.RemoveAt(index);

            if (leaf == _root) return true;

            if (leaf.KeyCount < MinLeafKeys())
                RebalanceLeafNode(leaf);

            return true;
        }
        return false;   // key wasn't there
    }
}

private int MinLeafKeys()     => _order / 2;
private int MinInternalKeys() => ((_order + 1) / 2) - 1;
Remove, then check the floor

FindLeafNode + FindKeyIndex — the same descent as a search, to locate the key's leaf.

RemoveAt — drop the key and its value. If this leaf is the root, we're allowed to underflow — stop here.

KeyCount < MinLeafKeys() — the floor check. For order-4, MinLeafKeys = 4/2 = 2. Fall below it and RebalanceLeafNode runs the borrow-or-merge logic you saw — possibly cascading up via RebalanceInternalNode and, at the very top, collapsing a one-child root to shrink the tree.

Why a minimum at all? Without a floor, repeated deletes would leave leaves nearly empty and the tree tall and sparse — wasting disk reads on mostly-blank pages. The minimum-fill rule guarantees every node stays at least half-full, which keeps the tree shallow and read-efficient even after heavy churn.

10

Range: find the start, then walk the chain

BPlusTree/BPlusTree.cs — Range
public IEnumerable<KeyValuePair<object, object?>>
    Range(object? minKey, object? maxKey)
{
    var results = new List<KeyValuePair<object, object?>>();
    var leaf = minKey != null ? FindLeafNode(minKey)
                              : GetFirstLeaf();

    while (leaf != null)
    {
        foreach (var key in leaf.Keys)
        {
            if (key < minKey) continue;
            if (key > maxKey) return results;   // done
            results.Add(/* key, value */);
        }
        leaf = leaf.Next;   // walk sideways
    }
    return results;
}
The leaf-chain trick

Find the leaf where minKey would live — one O(log n) descent.

Then leaf = leaf.Next walks straight along the stapled chain, collecting keys, until one exceeds maxKey.

No going back up through the tabs. This is why B+ trees crush plain B-trees for queries like WHERE age BETWEEN 20 AND 30.

Why B+ and not B? In a plain B-tree, values live at every level. In a B+ tree, values live only in leaves, and leaves are linked. That turns a range scan from a full tree traversal into a straight sideways walk.

Fanout = how wide each node is. The toy above uses order-4 nodes (3 keys, 4 children) so splits happen often and you can see them. Real databases use fanout of 100+. Higher fanout means a shallower tree, which means fewer disk reads per lookup.

11

Check yourself

Scenario
"Find all users with ID between 1000 and 2000." Which operation runs, and what does it cost?
Correct. Range costs one descent plus a walk of k matched leaves — O(log n + k). Neither a full scan nor one-at-a-time lookups.
Tracing
You delete a key and its leaf drops to 1 key (below the minimum of 2). Its left sibling holds 3 keys. What happens?
Correct. Because the sibling has more than the minimum (3 > 2), it can spare one. Borrowing is preferred over merging — it's cheaper and never underflows the parent. Try it in the playground: build a tree, then delete down to an underflow.
Scenario
A leaf underflows and both neighbours are already at the minimum. What now — and what can it trigger?
Exactly. No neighbour can spare a key, so two thin leaves fuse into one. The parent loses a key and may itself underflow, rebalancing up the tree. If that empties the root, its lone child becomes the new root — the tree shrinks from the top, the mirror image of a split.
Debugging
After many inserts the tree has grown very tall and queries feel slower. What changed?
Right. Each split can add height. Higher tree = more hops per lookup. With higher fanout, real databases stay shallow even with billions of rows.
Architecture
Leaves have both Next and Previous pointers, but range scans only use Next. Why keep Previous?
Exactly. A doubly-linked leaf chain lets you walk backward too — handy the day someone needs descending order without re-sorting.
Tracing
Insert 5, 15, 25, 35 into an order-3 tree (max 2 keys per leaf). What happens?
Correct. Once a leaf exceeds its max, it splits and pushes a key up — which can overflow the parent and split again. Set the toy going with these keys and watch the root appear.

Up next: the tree exists — but when does the engine actually use it? You'll meet the query planner that decides, for every query, whether to consult the index or fall back to a full scan.