1.
Introduction
Btrees
were originally designed to work on direct access secondary storage (Magnetic
disk). In a computer system the memory capacity is consists of two parts: primary
memory (uses memory chips) and secondary storage (based on magnetic disk). When
dealing with large datasets, it is often impossible to maintain a whole dataset
in the primary memory. Instead, a small part of the dataset is maintained in
the primary memory and the remaining data is read from a secondary storage 1.
Btree
is a data structure used for sorting huge datasets for fast retrieval from a
disk. A typical Btree has a singledigit height, even with millions of entries
2. Which means
that few disk accesses are needed to find the place where data is stored in the
tree. At most two accesses are required to search for any node. “From a
practical point of view, Btrees, therefore, guarantee an access time of less
than 10 ms even for extremely large datasets.” —Dr. Rudolf Bayer,
inventor of the Btree
2. Description
Btrees
save time by using nodes with many branches (children). Unlike binary tree, in
which each node has only two children. When a node has many children a record
can be found easily by passing through fewer nodes than if there are two children
per node. See the example in Figure 1.
The example
in Figure 1 shows that the red node in a binary tree has depth of three while
the corresponding node in the Btree has depth of one. Clearly, Btrees locates
desired records faster, assuming that all system parameters are identical. The
difference in depth between binary tree and Btree is greater in a practical
database than in the example in Figure 1. Real world Btrees height depends on
the number of records in the database.
The
tradeoff is that the decision process at each node in Btree is more
complicated compared with a binary tree. A complex program is required to
execute Btree’s operations, but this program runs fast because it is stored in
RAM.
3. Definition
A
Btree of order m (the maximum number of children for each node) is a
selfbalancing search tree.
3.1 Node Structure
See
a Btree node’s structure in Figure 2.
Figure 2: Node structure
–
K : is a search key. Search operation depends on the key value.
According to its value we can find the record that we are searching for. All
keys in a node are in ascending order: k1 < k2 ....... < kn.

DP : is a data pointer, which can take us to the whole record depending
on the search key value that is associated with.

P : is a pointer, which might be connected to a subtree. p1 is
connected to a subtree that contains values which are all less that k1 value.
p2 is connected to another subtree that contains values which are all bigger
than k1 value and less than k2 value.
3.2 Btree Properties
A
Btree satisfies the following properties:
· Every nonleaf node has at most m children.
· Every nonleaf node (except root) has at least m?2 children.
· A nonleaf node with m children contains m1 keys.
· The root has at least two children if it is not a leaf node.
· All leaves appear in the same level.
Figure
3 shows a Btree of order four. Each node contains up to three keys, and
internal nodes have up to four children 3.
Figure 3: A Btree of order
four.
The height
of an nkey btree T of height h with a minimum degree t greater than or equal
to 2,
, For n
The
worst case height is O(log n). Since the branches of a Btree can be large
compared to other balanced trees, the base of the algorithm tends to be large. But
the number of visited nodes during a search tends to be smaller than required
by other trees. Although this is does not affect the worst case height, Btree
tends to has smaller heights than other trees 4.
4.
Basic Operations
Conventions:
·
The root of a Btree is in the main memory (it is
not required to perform DiskRead operation on the root).
·
Any node passed as a parameter must have had a DiskRead
operation performed on them.
·
Top down algorithm, starts at the root of the tree.
4.1 Searching on a Btree
Searching
a Btree is similar to searching a binary tree but instead of making two way
branching decision at a node, must make multiway branching decision depending
on the number of a node's children. To search for a key in a Btree, a linear
search must be performed on the values of a node. After finding the value which is greater than or equal to the desired
value, the child pointer to the immediate left of that value is followed. If
all values are less than the desired value, the rightmost child pointer is
followed. The search is terminated as soon as the desired node is found 5.
4.1.1
BTREESEARCH
pseudocode
Inputs: is a pointer
to the root node of a subtree.
is a key to be searched in the subtree.
BTREESEARCH(x, k)
1 i 1
2 while i nx and k keyix
3 do i i + 1
4 if i nx and k = keyix
5 then return (x,
i)
6 if leaf x
7 then return
NIL
8 else DISKREAD(cix)
9 return
BTREESEARCH(cix, k)
Lines 13 : Using a linear
search procedure, find the smallest i
such
that k ? keyix, or else they set i to nx
+ 1.
Lines 45 : Check if the
key is discovered, then return it.
Lines 69 : Either end the
search unsuccessfully if x is
a leaf or recurs to search the appropriate subtree of x, after performing necessary DISKREAD operation on
that child 6.
4.1.2
BTREESEARCH
Running Time
The number of disk pages accessed by BTREESEARCH is ?(h) = (logt
n), where h is the height of the Btree and n is the number of
keys in the Btree. The while loop time in lines
23 for each node is O(t). The running time is O(th)
= O(t logt n) 6.
4.1.3
BTREESEARCH
Examples
Consider the Btree in Figure 2.
·
Example 1 : Search for key = 7.
By
applying BTREESEARCH:
Starts at
the root (17), is key = 17? Is key > 17? No, go to left subtree.
Next node (4,9), is key = 4? Is key > 4? Yes, move to the next key.
Is key = 9? Is key > 9? No, go to subtree left of 9.
Next node (7,8), is key = 7? Yes, found.
·
Example 2 : Search for key = 11.
By applying BTREESEARCH:
Starts at the root (17), is key = 17? Is
key > 17? No, go to left subtree.
Next node (4,9), is key = 4? Is key > 4? Yes, move to the next key.
Is key = 9? Is key > 9? Yes, go to the right subtree (there is no
more keys in the node to check).
Next node (12), is key = 12? Is key >12? No, the left subtree is a
leaf. Which means that 11 is not in the tree.
4.2 Creating an empty Btree
BTREECREATE operation builds an empty Btree by first creating a new empty root
node and then call BTREEINSERT to
add new keys. BTREECREATE and
BTREEINSERT operations are
both using ALLOCATENODE which
allocates one disk page to be used as a new node in O(1) time. When a node is
created by ALLOCATENODE it
does not require DISKREAD,
because there is no useful information stored on the disk for that node yet.
4.2.1
BTREECREATE pseudocode
BTREECREATE(T)
1 x ALLOCATENODE()
2 leafx TRUE
3 nx 0
4 DISKWRITE(x)
5
rootT x
4.2.2
BTREECREATE Running Time
The running
time is O(1).
4.3 Splitting a node in a Btree
When
a node becomes full it is necessary to perform split operation. BTREESPLITCHILD moves the
median key of node y up into its parent x (it has to be nonfull node), where y is
the ith child of x. All keys in y right
of the median key are moved to the new node. The keys left of the median key
remain in the original node y. The
new node z becomes the right child of the median key that was moved to
its parent x. The original
node y becomes the left child of the median
key that was moved to its parent x 5.
4.3.1
BTREESPLITCHILD pseudocode
Inputs: is a nonfull
internal node.
is an index.
is
the ith child of x and is the node being split.
Node y originally has 2t – 1 children but is
reduced to t – 1 children by this operation.
BTREESPLITCHILD(x,i,y)
1 z ALLOCATENODE()
2 leafz leafy
3 nz t – 1
4 for j 1 to t – 1
5 do keyjz
keyj+ty
6 if not leaf y
7 then for j 1 to t
8 do cjz
cj+ty
9 ny t – 1
10 for j nx + 1 downto i + 1
11 do cj+1x
cjx
12 ci+1x
z
13 for j nx downto i
14 do keyj+1x
keyjx
15 keyix
keyty
16 nx nx + 1
17 DISKWRITE(y)
18 DISKWRITE(z)
19 DISKWRITE(x)
Lines 18 : Create a new node z and give it the larger t –
1 keys and corresponding t children of y.
Line 9 : Adjusts the key count for y.
Lines 1016 : Insert z as a child of x. To separate y from z,
move the median key from y up to x, and adjust the
key count for x.
Lines 1719 : write out all modified disk page 6.
4.3.2
BTREESPLITCHILD Running Time
The running time is ?(t).
4.3.3
BTREESPLITCHILD Example
Insert key (j) to the following Btree of order 5 in Figure 4.
j > F,
Insert key (j) to the right subtree. See Figure 5.
Figure 5: The Btree after
inserting ‘j’ ( BTreeSplitChild Example )
It is a full node, By applying BTREESPLITCHILD:
The median key which is (j) moves up to the parent node because the
parent not is nonfull. All keys on the left of the median key (G,H) remain on
the original node. All keys on the right of the median key (K,M) move to the
new node. See Figure 6.
Figure 6: The Btree after
Splitting a node ( BTreeSplitChild Example )
4.4
Inserting
into a Btree
Inserting a key into a Btree T of height h is done
in a single pass down the tree. The BTREEINSERT procedure uses
BTREESPLITCHILD to guarantee
that the recursion never occurs to a full node 6.
4.4.1
BTREEINSERT pseudocode
Input: is a key to
be inserted.
BTREEINSERT(T,k)
1 r rootT
2 if nr
= 2t – 1
3 then s ALLOCATENODE()
4 rootT
s
5 leafs
FALSE
6 ns
0
7 c1s
r
8
BTREESPLITCHILD(s,1,r)
9 BTREEINSERTNONFULL(s,k)
10 else
BTREEINSERTNONFULL(r,k)
Lines 1: Root node r is created.
Lines 29: Handle the case where the root node r is full, the root is split and a new node s (having two children)
becomes the root. The height of the Btree is increased at the top. The
procedure finishes by calling BTREEINSERTNONFULL to
perform the insertion of key k in the tree rooted at the nonfull root
node 6.
Lines 10: Recurses and
guarantees that the node to which it recurs is not full by calling BTREESPLITCHILD.
The recursive procedure BTREEINSERTNONFULL
inserts key k into node x, assumed to be nonfull. The
operation of BTREEINSERT and the recursive operation of BTREEINSERTNONFULL guarantee
that this assumption is true 6.
4.4.2
BTREEINSERTNONFULL pseudocode
BTREEINSERTNONFULL(x,k)
1 i nx
2 if leafx
3 then while i
1 and k < keyix
4 do keyi+1
x keyix
5 i i  1
6 keyi+1x
k
7 nx
nx + 1
8 DISKWRITE(x)
9 else while i 1 and k < keyix
10 do i
i  1
11 i i + 1
12 DISKREAD(cix)
13 if ncix
= 2t  1
14 then
BTREESPLITCHILD(x,i,cix)
15 if k
> keyix
16 then
i i + 1
17
BTREEINSERTNONFULL(cix,k)
Lines 38: Handle the case where x is a leaf node by inserting key k
into node x. If x is not a leaf node, then insert k into
the appropriate leaf node in the subtree rooted at internal node x,
in this case go to lines 911.
Lines 911: Determine the child of node x to which the
recursion descends.
Lines 13: Detects whether the recursion would descend to a full child, in
this case go to line 14.
Lines 14: Uses BTREESPLITCHILD to
split that child into two nonfull children.
Lines 1516: Determines which one of the two children is the correct is one to
descend to.
Line 17: Then recurses
to insert k into the appropriate subtree 6.
4.4.3
BTREEINSERT Running Time
The running time is O(t1ogt n).
4.4.4
BTREEINSERT Example
Insert the following keys (70, 90, 25) to the Btree
of order 6 in Figure 7.
Figure 7: a Btree of order
6 ( BTreeInsert Example )
By applying BTREEINSERT :
First compare the key (70) with all key
values, it is bigger so it is inserted after key (50).
Then compare the key (90) with all key values, it is
bigger so it is inserted after key (70). See Figure 8.
Figure 8: The Btree after
inserting ‘70,90’ ( BTreeInsert Example )
The node is full, to insert the key (25)
a split procedure must be applied first. Take the median key (50) and promote
it as the new root node and perform split operation on the original node. All
keys on the left of the median key which are (10) and (30) stay on the original
node and this node become the left child of the median key (50) which is the
new root node. All keys on the right of the median key which are (70) and (90)
move to new node and become the right child of the median key (50) which is in the
new root node.
Now insert the key (25) by first
comparing it with the root node key (50) which is bigger, so go to the left
subtree. The left subtree has two keys (10,30), first compare the key (25)
with the first key (30) which is bigger so the key (25) is inserted before the
key (30). See Figure 9.
Figure 9: The Btree after
inserting ’25’ ( BTreeInsert Example )
4.5 Deleting from a Btree
There are three possible cases for
deletion in a Btree. Let k be the key to be deleted and x is the
node containing the key k 7.
The three cases are:
4.5.1
Case?
If the key
is a leaf node then simply remove the key to be deleted.
Key k is in node x and x is a
leaf, simply delete k from x.
·
Case? Example
Figure 10: a Btree of order
4 ( BTreeDeletekey Case? Example )
Delete the key (6) from the Btree of
order 4 in Figure 10.
Figure 11: The Btree after deleting ‘6’ (
BTreeDeletekey Case? Example )
See the Btree after deleting key (6) in Figure 11.
4.5.2
Case??
If key k is in node x and x is
an internal node, there are three cases to consider:
4.5.2.1 Case??a
If the child y that precedes k
in node x has at least t keys (more than the minimum), then find
the predecessor key k’ in the subtree rooted at y. Recursively
delete k’ and replace k with k’ in x 7.
4.5.2.2 Case??b
If the child node z that follows k
in node x has at least t keys, find the successor k’ and
delete then replace as in Case??a. Note that finding k’ and deleting it can be
performed in a single downward pass 7.
· Case??b Example
Figure 12: a Btree of order
4 ( BTreeDeletekey Case?b Example )
Delete the
key (13) from the Btree of order 4 in Figure 12.
See the
Btree after deleting key (13) in Figure 13.
Figure 13: The Btree after deleting ’13’ (
BTreeDeletekey Case?b Example )
4.5.2.3 Case??c
If both y and z have only t?1
(minimum number) keys, merge k and all of z into y, so
that both k and the pointer to z are removed from x. y
now contains 2t ? 1 keys, and subsequently k is deleted 7.
· Case??c Example
Figure 14: a Btree of order
5 ( BTreeDeletekey Case?c Example )
Delete the
key (7) from the Btree of order 5 in Figure 14.
See the
Btree after deleting key (7) in Figure 15.
Figure 15: The Btree after
deleting ‘7’ ( BTreeDeletekey Case?c Example )
4.5.3
Case???
If key k is not in an internal
node x, determine the root of the appropriate subtree that must contain
k. If the root has only t ? 1 keys, execute either of the
following two cases to descend to a node containing at least t keys. Then
recurs to the appropriate child of x 7.
4.5.3.1 Case???a
If the root has only t?1 keys and has a
sibling with t keys, give the root an extra key by moving a key from node x
to the root. Moving a key from the roots immediate left or right sibling up
into node x, and moving the appropriate child from the sibling to node x
7.
· Case???a Example
Figure 16: a Btree of order 6 ( BTreeDeletekey
Case??a Example )
Delete the
key (2) from the Btree of order 6 in Figure 16.
Figure 17: The Btree after
deleting ‘2’ ( BTreeDeletekey Case??a Example )
See the
Btree after deleting key (2) in Figure 17.
4.5.3.2 Case???b
If the root and all of its siblings have t?1
keys, merge the root with one sibling. Then move a key down from node x
into the new merged node to become the median key for that node.
· Case???b Example
Figure 18: a Btree of order
6 ( BTreeDeletekey Case??b Example )
Delete the key (4) from the Btree of
order 6 in Figure 18.
Figure 19: The Btree after
deleting ‘4’ ( BTreeDeletekey Case??b Example )
See the
Btree after deleting key (2) in Figure 19 and Figure 20.
Figure 20: The Btree after
deleting ‘4’ ( BTreeDeletekey Case??b Example ) – 2
4.5.4
BTREEDELETEKEY
pseudocode
BTreeDeleteKey(x, k)
1 if not leafx then
2 y ? PrecedingChild(x)
3 z ? SuccessorChild(x)
4 if ny > t
? 1 then
5 k’ ?
FindPredecessorKey(k, x)
6 MoveKey(k’, y, x)
7 MoveKey(k, x, z)
8 BTreeDeleteKey(k,
z)
9 else if nz
> t ? 1 then
10 k’ ?
FindSuccessorKey(k, x)
11 MoveKey(k’, z, x)
12 MoveKey(k, x, y)
13 BTreeDeleteKey(k,
y)
14 else
15 MoveKey(k, x, y)
16 MergeNodes(y, z)
17 BTreeDeleteKey(k,
y)
18 else (leaf node)
19 y ? PrecedingChild(x)
20 z ? SuccessorChild(x)
21 w ? root(x)
22 v ? RootKey(x)
23 if nx > t
? 1 then RemoveKey(k, x)
24 else if ny
> t ? 1 then
25 k’ ?
FindPredecessorKey(w, v)
26 MoveKey(k’, y,w)
27 k’ ?
FindSuccessorKey(w, v)
28 MoveKey(k’,w, x)
29 BTreeDeleteKey(k,
x)
30 else if nw > t ? 1 then
31 k’ ? FindSuccessorKey(w,
v)
32 MoveKey(k’, z,w)
33 k’ ?
FindPredecessorKey(w, v)
34 MoveKey(k’,w, x)
35 BTreeDeleteKey(k,
x)
36 else
37 s ? FindSibling(w)
38 w’ ? root(w)
39 if nw’ = t
? 1 then
40 MergeNodes(w’,w)
41 MergeNodes(w,
s)
42 BTreeDeleteKey(k,
x)
43 else
44 MoveKey(v,w, x)
45 BTreeDeleteKey(k,
x)
PrecedingChild(x): Returns the left child of key x.
MoveKey(k, x, z): Moves key k from node x to
node z.
MergeNodes(y, z): Merges the keys of nodes y and z
into a new node.
FindPredecessorKey(k, x): Returns the key preceding key k
in the child of node x.
RemoveKey(k, x): Deletes key k from node x.
x must be a leaf node.
5.
Branching Factor
Given enough
memory, increasing the branching factor will reduce the height of the Btree and
will make each level of the Btree hold more nodes. Fewer levels means fewer
disk access (if any) and better performance 8.
But if the branching
factor in a Btree is so big, it means that the first view levels cannot be
held in the memory. Which means accessing each level can cause disk swapping and
poor performance 8.
There are two major concerns:
1.
Limited
memory: a Btree parameters are limited by the available memory 8. All search starts
from the top of a Btree, so the first level of nodes need to be in memory. If the
number of nodes fits in main memory, it means that there is no need for disk
storage. But scaling physical memory is an expensive proposition and infeasible
beyond a point 8.
The
solution to this problem is sharding. Sharding a Btree storage enables a
Btree to hold a limited number of keys. Sharding the database along with its
btree indices into different physical machine is one of the most effective
scaling solutions for growing data volumes