DS C1 Trees01
Preliminaries
Definition
Recursive def of a tree:
A tree is a collection of nodes and edges. The collection could be empty; otherwise the tree consists of a distinguished node called root, and more than nonempty (sub)trees , each of whose roots are connected from by a directed edge.
- parents/children: is the parent of and is a
child
of if there’s an edge from to . - siblings: and are
siblings
if their roots are same. - leaves: Node that has no children.
- path (from to ) and length:
path
is defined as a sequence of such that is parent of for . Thelength
of thepath
is . - depth/height (of ):
- depth/height (of tree ):
- ancestors/descendant : is a ancestor of and is a descendant of if there’s a path from to .
NOTE
The height of an empty tree is defined -1 usually.
Propositions
- A tree of nodes has edges.
- There’s exactly path from root to each node.
Implementations
1 |
|
Traversals
All traversals’ time complexity is .
Preorder
1 |
|
Postorder
1 |
|
Binary Trees
1 |
|
Propositions
Inorder
1 |
|
Examples :Expression Tree
The leaves of an expression tree are operands, and the other nodes contain operators.
- infix: traverse the tree in inoder.
- prefix: traverse the tree in preorder.
- post: traverse the tree in postorder.
Binary Search Tree
Properties
The inorder traversal of BST is ascending.
Operations
-
Make empty
1
2
3
4
5
6
7
8SearchTree MakeEmpty(SearchTree T){
if(T != null){
MakeEmpty(T -> Left);
MakeEmpty(T -> Right);
free(T);
}
return Null;
} -
Find/FindMin/FindMax
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26Position Find(ElementType X, SearchTree T){
if(T == null)
return null;
if(X < T -> Element)
return Find(X,T -> Left);
else if(X > T -> Element)
return Find(X,T -> Right);
else
return T;
}
Position FindMin(SearchTree T){ //leftmost
if(T == null)
return null;
if(T -> Left)
return FindMin(T -> Left);
else
return T;
}
Position FindMax(SearchTree T){//rightmost
if(T == null)
return null;
if(T -> Right)
return FindMax(T -> Right);
else
return T;
} -
Insert
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15SearchTree Insert(ElementType X,SearchTree T){
if(T == null){
T = malloc(sizeof(struct TreeNode));
if(T == null) FatalError("OOS!")
else{
T -> Element = X;
T -> Left = T -> Right = null;
}
}
else if(X < T -> Element)
T -> Left = Insert(X,T -> Left);
else if(X > T -> Element)
T -> Right = Insert(X,T -> Right);
return T;
} -
Delete
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24SearchTree Delete(ElementType X,SearchTree T){
Position TempCell;
if(T == null){
Error("Element not found");
}
else if(X < T -> Element)
T -> Left = Delete(X,T -> Left);
else if(X > T -> Element)
T -> Right = Delete(X,T -> Right);
else if(T -> Left && T -> Right){
TempCell = FindMin(T -> Right);
T -> Element = TempCell -> Element;
T -> Right = Delete(TempCell -> Element,T -> Right);
}
else{
TempCell = T;
if(T -> Left == null)
T = T -> Right;
else
T = T -> Left;
free(TempCell);
}
return T;
}
Average-Case Analysis
AVL Tree
AVL tree is a BST with a balance condition.
height balanced
- An empty binary tree is
height balanced
. - If T is a nonempty binary tree with and as its left and right subtrees(could be empty), then T is height balanced iff
- and are
height balanced
, and - , where and are the heights of and .
- and are
The balance factor (BF)
In an AVL tree, .
Insertion and rotation
After an insertion, the nodes on the path from the insertion to the root may have their balance altered.
Hence after an insertion,
-
follow the path up to the root and keep updating the balancing info until we met the first unbalanced node ,
i.e
. -
Address and fix the violation for different cases.
BF Child Child’s Subtree Solution Note 2 Left Left single rotation LL 2 Left Right double rotation LR -2 Right Left double rotation RL -2 Right Right single rotation RR -
Repeat 1 - 2 until we gets to root.
Tree rotation: an operation on a binary tree that changes the structure without interfering with the order of the elements whose time complexity is .
NOTE
After a rotation, the side of the rotation increases its height by 1 while the side opposite the rotation decreases its height by 1 similarly.
-
LL Rotation: single right rotation is enough and no more need to rotate.
-
RR Rotation: single left rotation is enough and no more need to rotate.
-
LR Rotation: take as the new root of the whole subtree by a left rotation and then a right rotation. This may cause new unbanlance.
-
RL Rotation: take as the new root of the whole subtree by a right rotation and then a left rotation. This may cause new unbanlance.
height estimation
Let be the minimum number of nodes in a height balanced tree of height h
. Then the tree must look like
with initial conditions .
Hence
Splay Tree
A splay tree is to control the time Any M consecutive tree operations starting from an empty tree taken at most time.
Although this guarantee does not preclude the possibility that any single operation might take time.
But there are no bad input sequences.
Generally, when a sequence of operations has total worst-case running time of , we say that the amortized running time is .
Rotations
We will still rotate bottom up along the access path.
Let be a nonroot node on the access path at which we are rotating.
- The parent of is the root of the tree: merely rotate and the root .
- has both a parent () and a grandparent ():
- zig-zag case: is a right child and is a left child (or vice versa). Exactly same as
LR rotation
orRL rotation
of AVL tree.
- zig-zig case: and are either both left children or both right children. take as the new root of the whole subtree by a rotation between and and then a rotation between and .
- zig-zag case: is a right child and is a left child (or vice versa). Exactly same as
Splaying not only moves the accessed node to the root, but also roughly halves the depth of most nodes on the path.
Amortized Analysis
ref
Introduction to Algorithms, P451-462.
In an amortized analysis, we average the time required to perform a sequence of data-structure operations over all the operations performed.
Amortized analysis
differs fromaverage-case
analysis in that probability is not involved;amortized analysis
guarantees the average performance of each operation in the worst case.
NOTE
The amortized cost
applies to each operation, even when there are several types of operations in the sequence.
Aggregate analysis
For all , a sequence of operations takes worst-case time in total. The amortized cost
per operation is .
Exp: MultiStack
- PUSH(S, x) pushes object x onto stack S.
- POP(S) pops the top of stack S and returns the popped object.
err
Calling POP on an empty stack.- MULTIPOP(S, k) removes the k top objects of stack S.
note
popping the entire stack if the stack contains fewer than k objects; leaving the stack unchanged if k is not positive.
Then analyze a sequence of n
PUSH,POP,and MULTIPOP
operations on an initially empty stack. It’s easy to know that the worst case of MULTIPOP
won’t exist continuously.
We can pop each object from the stack at most once for each time we have pushed it onto the stack. Therefore, the number of times that POP
(including calls within MULTIPOP
) can be called on a nonempty stack is at most the number of PUSH
operations. Hence
Exp: Incrementing a binary counter
A binary number x
that is stored in the counter has its lowest-order bit in A[0] and its highest-order bit in A[k - 1].
INCREMENT(A) adds 1 (modulo 2k) to the value in the counter.
1
2
3
4
5
6i = 0;
while i < A.length and A[i] == 1
A[i] = 0;
i = i + 1;
if i < A.length
A[i] = 1;
In the worst case in which array A contains all 1s, a single execution of INCREMENT takes time . On an initially zero counter, flips times in a sequence of n INCREMENT operations.
Accounting method
Assign differing charges to different operations with some operations charged more or less than they actually cost.
amortized cost : the amount we charge the operation.
credit : the difference to specific objects in the data structure.
NOTE
Credit can help pay for later operations whose amortized cost
is less than their actual cost
.
NOTE
All operations have the same amortized cost
in aggregate analysis, while they may differ from each other in accounting method.
And we require
for all sequences of n operations. i.e
The total credit associated with the data structure must be nonnegative at all times. Or the total amortized cost would not be an upper bound on the total actual cost!
Exp: MultiStack
Op (designed) PUSH 1 2 1 POP 1 0 -1 MULTIPOP k 0 -k
The sum of the credit equals the number of objects in the stack, which is nonnegative. Hence
Exp: Incrementing a binary counter
Charge an amortized cost
of 2 to set a bit to 1, 1 to pay for the actual setting of the bit, and another to be used later when the bit is flipped back to 0.
And charge nothing to reset a bit to 0.
The sum of the credit equals the number of 1s in the counter, which is nonnegative.
Potential method
Perform n operations starting with an initial data structure . For each :
-
: actual cost of the operation
-
: the data structure that results after applying the operation to data structure
-
: a potential function maps each to a real number , which is the potential associated with .
-
: the amortized cost of the operation.
The amortized cost is
Hence the only requirements for potential method is
for all sequences of n operations.
A good potential function should always assume its minimum at the start of the sequence.
Exp: MultiStack
Define the potential function on a stack to be the number of objects in the stack.
Hence,
Op | |||
---|---|---|---|
PUSH | 1 | 1 | 2 |
POP | 1 | -1 | 0 |
MULTIPOP | k | -k | 0 |
Exp: Incrementing a binary counter
Define the potential function on a stack to be the number of 1s in the counter.
Hence, assume 1s are set to 0 in a single execution,
Op | |||
---|---|---|---|
INCREMENT | 1 + | 1 - | 2 |
Exp: Splay Tree
The potential function:
functions | NOTEs |
---|---|
Almost every node’s height changes after a rotation! | |
The difference is too large, causing a loose bound! | |
Great ! |
lemma
If , then
PF:
For Zig:
For Zig-zag:
For Zig-zig:
Hence the amortized time of the insertion of k rotations is
The rotation of zig
happens once at most .