Introduction

Reminder on Red-black trees (brief…)
A red-black tree is a binary tree, satisfies the red-black properties
1. Every node is either red or black
2. The root is black
3. Every leaf (NIL) is black
4. If a node is red, then both its children are black
5. For each node: all simple paths from node to descendant leaves contain the same number of black nodes
Black height of a node
- The number of black nodes on any simple path from, but not including, a node \(x\) down to a leaf
- \(bh(x)\)
Red black trees make good search trees
- Lemma 13.1, “A red-black tree with \(n\) internal nodes has height at most \(2 \lg(n+1)\)”
Operations
- Rotation, \(O(1)\)
- Insertion, \(O(\lg n)\)
- Deletion, \(O(\lg n)\)

Introduction

When do we use a B-tree?
1. Keeps records in sorted order for sequential traversing
2. Uses a hierarchical index to minimize number of disk reads
3. Uses partially-full blocks to speed insertions and deletions
4. Index is elegantly adjusted with recursive algorithm
Additional constraints, to ensure the tree is always balanced

Introduction

B-trees vs red-black trees
- B-tree nodes may have many children (from a few to thousands)
  - Branching factor can be large
  - Depends on characteristics of the disk unit
- Every \(n\)-node B-tree has height \(O(\lg n)\), as in red-black trees
- Height of B-tree can be much less than that of a red-black tree
  - Because of its branching factor
  - Base of logarithm expressing its height can be much larger
- Can use B-trees to implement many dynamic-set operations in time \(O(\lg n)\)

Introduction

B-trees generalize binary search trees in natural manner
- If internal node \(x\) contains \(x.n\) keys, then \(x\) has \(x.n+1\) children
- Keys in node \(x\) are dividing points separating the range of keys handled by \(x\) into \(x.n +1\) subranges
  - Each subrange handled by a child of \(x\)
- When searching for a key
  - We make an \((x.n + 1)\)-way decision based on comparisons with the \(x.n\) keys stored in \(x\)

Introduction

A b-tree

Data Structures on Secondary Storage

Computer systems use technology for memory capacity
- Main memory
  - Silicon memory chips
  - More expensive per bit stored than magnetic storage technology (tapes or disks)
- Secondary storage
  - Based on magnetic disks
  - Amount of this storage often exceeds the amount of primary memory (by two or more orders of magnitude)

Data Structures on Secondary Storage

Typical disk drive

Data Structures on Secondary Storage

Disks are cheaper than memory but they are much, much slower
- Because they move mechanical parts
  - Platter rotation
  - Arm movement
- Disks rotate at speeds of 5,400 - 15,000 RPM
  - 1 rotation takes 8.33 milliseconds (5 orders of magnitude longer than silicon memory)
  - Silicon memory access takes about 50 nanoseconds
  - In a full rotation we could access main memory more than 100,000 times

Data Structures on Secondary Storage

Amortize the time spent waiting for mechanical movements
- Disk access more than 1 item at a time
- Information divided into equal-sized pages of bits
  - Appear consecutively within tracks
  - A disk read or write is of one or more entire pages
  - Typical disk page is \(2^{11}\) to \(2^{14}\) bytes long
- Once read/write head positioned
  - Reading or writing is entirely electronic (aside from rotation of disk)
  - Disk can read or write large amounts of data quickly
- Often, accessing a page of information and reading it from disk takes longer than examining all information read
- Look separately two principal components of the running time
  - The number of disk accesses
    - In terms of number of pages of information (read/written to disk)
  - The CPU (computing) time

Data Structures on Secondary Storage

A typical B-tree application
- Amount of data too large
- Not all data fit into memory
- B-tree algorithms copy selected pages from disk to memory
  - As needed
  - Write back into disk pages that change
- Keep only constant number of pages in memory (at any time)
- Size of memory doesn’t limit size of B-trees that can be handled

Data Structures on Secondary Storage

Disk operations
- Let \(x\) be a pointer to an object
- If object already in main memory
  - \(x.key\)
- If object in disk
  - DISK-READ(\(x\)), to read object into main memory, if object already in memory, no DISK-READ operation performed
  - DISK-WRITE(\(x\)), to save changes made to attributes of object \(x\)

Data Structures on Secondary Storage

Because system keeps limited number of pages in main memory at any time
- System flushes from memory pages no longer in use
- Our B-tree algorithm ignores this fact
B-tree node is usually as large as a whole disk page
- In order to read/write as much information as possible, as a whole disk page
- Read/write too expensive
- This size limits the number of children a B-tree node can have
Large B-tree stored on a disk
- Branching factors from 50 to 2,000
- Depending on size of key relative to size of a page
- Large branching factor
  - Reduces height of tree and number of disk accesses to find any key

Data Structures on Secondary Storage

Definition of B-trees

Assume that satellite information associated with a key resides in the same node as the key
- \(B^{+}-tree\) are different, they store satellite information in leaves and stores only keys and child pointers in internal nodes
  - Maximize branching factor of internal nodes

Definition of B-trees

A B-tree \(T\) is a rooted tree (with root \(T.root\)) with the following properties
1. Every node \(x\) has the following attributes:
  1. \(x.n\), the number of keys currently stored in node \(x\)
  2. the \(x.n\) keys themselves, \(x.key_1 \leq x.key_2, \dots, x.key_{x.n}\), stored in nondecreasing order, so that \(x.key_1 \leq x.key_2 \dots \leq x.key_{x.n}\),
  3. \(x.leaf\), a boolean value that is TRUE if \(x\) is a leaf and FALSE if \(x\) is an internal node
2. Each internal node \(x\) also contains \(x.n + 1\) pointers \(x.c_1, x.c_2, \dots, x.c_{x.n+1}\) to its children. Leaf nodes have no children (\(c_i\) attributes undefined)
3. The keys \(x.key_i\) separate the ranges of keys stored in each subree: if \(k_i\) is any key stored in the subtree with root \(x.c_i\), then
  - \(k_1 \leq x.key_1 \leq k_2 \leq x.key_2 \leq \dots \leq x.key_{x.n} \leq k_{x.n+1}\).
4. All leaves have the same depth, which is the tree’s height \(h\).
5. Nodes have lower and upper bounds on the number of keys they can contain. We express these bounds in terms of a fixed integer \(t \geq 2\) called the minimum degree of the B-tree
  1. Every node other than the root must have at least \(t-1\) keys.
    - Internal nodes at least \(t\) children.
    - If tree is nonempty, the root must have at least one key.
  2. Every node may contain at most \(2t-1\) keys.
    - An internal node may have at most \(2t\) children
    - A node is full if it contains exactly \(2t-1\) keys.
Simplest B-tree
- When \(t = 2\)
- Internal nodes have 2, 3, or 4 children
- A 2-3-4 tree

The Height of a B-tree

Number of disk accesses in a B-tree is proportional to the height of the B-tree
- Worst-case height of a B-tree!
Theorem 18.1
- If \(n \geq 1\), then for any \(n\)-key B-tree \(T\) of height \(h\) and minimum degree \(t \geq 2\), \(h \leq \log_t \frac{n+1}{2}\).
Proof
- The root of a B-tree \(T\) contains at least one key
- All other nodes contain at least \(t-1\) keys
- \(T\) with height \(h\) has at least 2 nodes at depth 1
  - At least \(2t\) nodes at depth 2
  - At least \(2t^2\) at depth 3, and so on
  - At depth \(h\) it has at least \(2t^{h-1}\) nodes

The number \(n\) of keys satisfies the inequality
- \(n \geq 1 + (t-1) \sum\limits_{i=1}^{h}2t^{i-1}\)
- \(= 1 + 2(t-1) ( \frac{t^h - 1}{t - 1} )\)
- \(=2t^h - 1\)
We get \(t^h \leq (n+1)/2\) (by simple algebra)
- We take base-\(t\) logarithms to both sides and prove the theorem.
B-trees are powerful
- Save factor of about \(\lg t\) over red-black trees
  - In number of nodes examined for most tree operations
- Avoid substantial number of disk accesses

Basic Operations on B-trees

Conventions
- The root of the B-tree is always in main memory
  - No DISK-READ on the root
  - We do DISK-WRITE when root node changes
- Nodes passed as parameters must already have had a DISK-READ
Procedures are all “one-pass” algorithms
- Proceed downward from the root
- Don’t back up

Basic Operations on B-trees

Searching a B-tree
- Similar to searching a binary search tree
- Instead of making a binary branching decision at each node
  - We make a multiway branching decision
  - According to the node’s children
  - An (\(x.n+1\))-way branching decision

B-TREE-SEARCH accesses \(O(h) = O(\log_t n)\) disk pages
- \(h\) is the height of the B-tree
- \(n\) is the number of keys in the B-tree
Because \(x.n < 2t\)
- While loop (lines 2-3) takes \(O(t)\) time in each node
- Total CPU time is \(O(th) = O(t \log_t n)\).

Basic Operations on B-trees

Creating an empty B-tree
- We create an empty root node
- Then we can insert keys into that tree
- Require the ALLOCATE-NODE procedure
  - Allocates a disk page to be used as a new node, \(O(1)\) time
  - Requires no DISK-READ

Basic Operations on B-trees

Inserting a key into a B-tree
- More complicated than inserting a key in a binary tree
- Search for the leaf position to insert the new key
- Can’t simply create new leaf node and insert it, would fail to have a valid B-tree
- We insert the new key into an existing leaf node
  - We cannot insert if node is full
  - Introduce new operation that splits a full node
    - Full node \(y\) has \(2t-1\) keys
    - Split around the median key \(y.key_t\) into two nodes with only \(t-1\) keys each
    - Median key moves up into \(y\)’s parent, but if \(y\)’s parent is full, we must split it… and so on
  - We insert in a single pass down the tree from the root to a leaf
    - Don’t wait to find out if we need to split a full node
    - As we travel down the tree, we split each full node we find (including the leaf)
    - So, when we split a full node \(y\), we know its parent is not full

Basic Operations on B-trees

Splitting a Node in a B-tree
- B-TREE-SPLIT-CHILD
  - Input: nonfull internal node \(x\) and an index \(i\) such that \(x.c_i\) is a full child of \(x\)
  - Splits this child in two and adjusts \(x\) to have an additional child
  - To split the root
    - First make the root a child of a new empty root node
    - Then we can use B-TREE-SPLIT-CHILD
  - The tree only grows through splitting

B-TREE-SPLIT-CHILD works as follows
- \(x\) is the parent of the node being split
- \(y\) is \(x\)’s \(i^{th}\) child (the node being split)
- Node \(y\) had originally \(2t\) children and \(2t-1\) keys
  - Was reduced to \(t\) children and \(t-1\) keys
- Node \(z\) takes the \(t\) largest children, \(t-1\) keys, from \(y\)
- \(z\) becomes a new child of \(x\)
  - Positioned after \(y\) in \(x\)’s table of children
- Median key of \(y\) moves up and becomes a key in \(x\), it separates \(y\) and \(z\)

Basic Operations on B-trees

Inserting a Key into a B-tree in a Single Pass Down the Tree
- A single pass down the tree
- \(O(h)\) disk accesses
- CPU required is \(O(th) = O(t \log_t n)\)
- Uses B-TREE-SPLIT-CHILD
  - Guarantee recursion never descends to a full node

Lines 3-9, case in which root node \(r\) is full
- Root splits and new node \(s\) becomes the root
- Splitting the root is the only way to increase the height of a B-tree
  - Increases height at the top, instead of at the bottom as happens in a binary search tree

B-TREE-INSERT-NONFULL
- Lines 3-8: case in which \(x\) is a leaf node, insert \(k\) into \(x\)
- If \(x\) isn’t a leaf, we insert \(k\) into appropriate leaf node in subtree rooted at internal node \(x\) (lines 9-11)
- Line 13, recursion descended into a full child, line 14 uses B-TREE-SPLIT-CHILD
- Lines 15-16 finds to which child the alg. should descend to
- Line 17 recurses to insert \(k\) into appropriate subtree
Complexity
- B-tree of height \(h\), B-TREE-INSERT performs \(O(h)\) disk accesses
- CPU: \(O(th) = O(t \log_t n)\)

Deleting a Key from a B-tree

Deletion analogous to insertion
- But more complicated
- We can delete from any node, not just a leaf
- When we delete a key from an internal node
  - We have to rearrange the node’s children
  - The resulting tree should preserve the B-tree properties
    - A node shouldn’t get too small during deletion
    - Except the root, which is allowed to have fewer than the minimum of \(t-1\) keys

Deletion cases:
1. If key \(k\) is in node \(x\) and \(x\) is a leaf, delete the key \(k\) from \(x\)
2. If key \(k\) is in node \(x\) and \(x\) is an internal node:
  - If the child \(y\) that precedes \(k\) in node \(x\) has at least \(t\) keys
    - then find the predecessor \(k'\) of \(k\) in the subtree rooted at \(y\).
    - Recursively delete \(k'\), and replace \(k\) by \(k'\) in \(x\).
    - We can find \(k'\) and delete it in a singuel downward pass.
  - If \(y\) has fewer than \(t\) keys
    - then, symmetrically, examine the child \(z\) that follows \(k\) in node \(x\). If \(z\) has at least \(t\) keys, then find the successor \(k'\) of \(k\) in the subtree rooted at \(z\).
    - Recursively delete \(k'\), and replace \(k\) by \(k'\) in \(x\).
    - We can find \(k'\) and delete it in a single downward pass.
  - Otherwise
    - if both \(y\) and \(z\) have only \(t-1\) keys, merge \(k\) and all of \(z\) into \(y\), so that \(x\) loses both \(k\) and the pointer to \(z\), and \(y\) now contains \(2t-1\) keys
    - Then free \(z\) and recursively delete \(k\) from \(y\).
3. If key \(k\) is not present in internal node \(x\). Determine the root \(x.c_i\) of the appropriate subtree that must contain \(k\), if \(k\) is in the tree at all. If \(x.c_i\) has only \(t-1\) keys execute step 3a or 3b as necessary to guarantee that we descend to a node containing at least \(t\) keys. Then finish by recursing on the appropriate child of \(x\).
  - 3a) If \(x.c_i\) has only \(t-1\) keys but has an immediate sibling with at least \(t\) keys, give \(x.c_i\) an extra key by movind a key from \(x\) down into \(x.c_i\), moving key from \(x.c_i\)’s immediate left or right sibling up into \(x\), and moving the appropriate child pointer from the sibling into \(x.c_i\).
  - 3b) If \(x.c_i\) and both of \(x.c_i\)’s immediate siblings have \(t-1\) keys, merge \(x.c_i\) with one sibling, which involves moving a key from \(x\) down into the new merged node to become the median key for that node.
This procedure involves only \(O(h\)) disk operations for a B-tree of height \(h\)
- Only \(O(1)\) calls to DISK-READ and DISK-WRITE are made
CPU time: \(O(th) = O(t \log_t n)\)

B-Trees

Introduction

Introduction

Introduction

Introduction

Introduction

Introduction

Data Structures on Secondary Storage

Data Structures on Secondary Storage

Data Structures on Secondary Storage

Data Structures on Secondary Storage

Data Structures on Secondary Storage

Data Structures on Secondary Storage

Data Structures on Secondary Storage

Data Structures on Secondary Storage

Definition of B-trees

Definition of B-trees

The Height of a B-tree

Basic Operations on B-trees

Basic Operations on B-trees

Basic Operations on B-trees

Basic Operations on B-trees

Basic Operations on B-trees

Basic Operations on B-trees

Deleting a Key from a B-tree