DB C6 Relational Database Design

First Normal Form, 1NF

RECALL

Domain is atomic if its elements are considered to be indivisible units.

  • dealing with non-atomic:

    • Composite attributes : a number of attributes
    • Multi-value attribute : multi-fields / separate table / single field
    • Complex data type : a number of attributes
  • Drawbacks of non-atomic strategy:

    • Complicate storage

    • Encourage redundant storage of data

    • Complicated to query

A relational schema RR is in first normal form,1NF if the domains of all attributes of RR are atomic.

  • For the relational database, it’s required that all relations are in 1NF.
  • Atomicity is actually a property of how the elements of the domain are used.

Pitfalls in Relational Database Design

Relational database design requires that we find a “good” collection of relation schemas.

A bad design may lead to:

  • Redundant storage: Wastes space / Easy to result in inconsistency.
  • insert / delete / update anomalies : complicated / many null which is difficult to handle.
  • inability to represent certain information.

Two design methods

  • Top-down
  • Button-up: Universal relation (泛关系) \Rightarrow decomposition \Rightarrow good database schema

Functional Dependencies

Let RR be a relation schema, α\alpha and β\beta be attributes, i.e α,βR\alpha,\beta \sube R

The functional dependency αβ\alpha \rightarrow \beta holds on RR if and only if for any legal relations r(R)r(R), whenever any two tuples t1t_1 and t2t_2 of rr agree on the attributes α\alpha, they also agree on the attributes β\beta, i.e

t1,t2r,t1[α]=t2[α]t1[β]=t2[β]\forall t_1,t_2 \in r, t_1[\alpha] = t_2[\alpha] \rightarrow t_1[\beta] = t_2[\beta]

We say β\beta is functionally dependent on α\alpha, α\alpha functionally determines β\beta.

  • Functional dependency is a kind of integrity constraints, expressing the relationship of values on specific attributes.

  • Functional dependency can be used to judge schema normalization and to suggest refinements.

Functional dependency vs. key

  • A functional dependency is a generalization of the notion of a key.

  • KK is a superkey for the relation schema RR if and only if KRK \rightarrow R .

  • KK is a candidate key for RR if and only if KR,¬αK,αRK\rightarrow R ,\lnot\exist\alpha \sube K,\alpha \rightarrow R.

  • Functional dependencies allow us to express constraints that cannot be expressed using keys.

Usage

  • Test relations to see if they are legal under a given set of functional dependencies FF.

    If a relation rr is legal under a set FF of functional dependencies, we say that rr satisfies FF.

  • Specify constraints (FF) on the set of legal relations — schema.

    We say that FF holds on RR if all legal relations rr on RR satisfy the set of functional dependencies FF.

Trivial and Non-Trivial

  • $ a\rightarrow b$ is trvial if bab \sube a,
  • otherwise, is non-trivial

Closure

Closure of a Set of Functional Dependencies

Given a set of functional dependencies FF, the set of all functional dependencies logically implied by FF is the closure of FF, denoted by F+F^+ .

Computing
1
2
3
4
5
6
7
8
9
10
F+ = F 
Repeat
For each functional dependency f in F+
Apply reflexivity and augmentation rules on f
Add the resulting functional dependencies to F+
For each pair of functional dependencies f1and f2 in F+
If f1 and f2 can be combined using transitivity
Then add the resulting functional dependency to F+
Until F+ does not change any further

The maximum number of possible Functional Dependencies (FDs) is 2n×2n2^n \times 2^n, for nn attributes.

Armstrong’s Axioms
  • reflexivity, 自反律: If bab \sube a,then aba \rightarrow b
  • augmentation, 增补律: If aba \rightarrow b,then gagbga \rightarrow gb
  • transitivity, 传递律: If ab,bga \rightarrow b,b\rightarrow g,then aga \rightarrow g
  • The reflexivity is trivial.
  • Based on augmentation: If aba \rightarrow b,then gagbga \rightarrow gb; By reflexivity, gbbgb \rightarrow b; By transitivity, gabga \rightarrow b.
  • These rules are
    • Sound, 保真的: generate only functional dependencies that actually hold.
    • Complete, 完备的: generate all functional dependencies that hold.
Additional Armstrong’s Axioms
  • union,合并律: If ab,aca \rightarrow b,a\rightarrow c, then abca\rightarrow bc
  • decomposition, 分解律: If abga\rightarrow bg, then ab,aga \rightarrow b,a \rightarrow g.
  • pseudo transitivity, 伪传递律: If ab,bgda\rightarrow b,bg\rightarrow d, then agdag \rightarrow d.

Closure of Attribute Sets

Given a set of attributes aa, the closure of aa under FF, denoted by a+a^+, is the set of attributes that are functionally determined by aa under FF.

  • abba+a \rightarrow b \Leftrightarrow b \sube a^+
  • aa is superkey $ \Leftrightarrow a \rightarrow R \Leftrightarrow R \sube a^+$
  • aa is a candidate key for RR Ra+,ba,R⊈b+\Leftrightarrow R \sube a^+ ,\forall b \sube a,R \not\sube b^+.

Computing a+a^+:

1
2
3
4
5
6
7
result := a; 
while (changes to result) do
for each fd(b,g) in F do begin
If (all b in result) then
result := union(result,g)
end;
a+ := result

Uses of Attribute Set Closure

  • Testing for a superkey: aRRa+a\rightarrow R \Leftrightarrow R \sube a^+

  • Testing functional dependencies: abba+a \rightarrow b \Leftrightarrow b \sube a^+

    a simple and cheap and very useful test.

  • Computing the closure of FF ,F+F^+

    1. For each gRg \sube R, we find g+g^+.
    2. For each Sg+S \sube g^+, we output a functional dependency gSg \rightarrow S.
    3. All gSg \rightarrow S form F+F^+.

Canonical Cover

DBMS should always check to ensure not violate any Functional Dependency (FD) in FF. FF may be simplified!

Intuitively, a canonical cover of FF, denoted by FcF_c, is a minimal set of FDs equivalent to FF:

Fc====implyLogicallyFF_c \overset{Logically}{\underset{imply}{====}}F

  • Having no redundant FDs and no redundant parts of FDs;
  • Each left side is unique.

Computing

  1. Sets of Functional Dependency (FD) may have redundant dependencies that can be inferred from the others.

  2. Parts of a functional dependency on left side may be redundant.

  3. Parts of a functional dependency on right side may be redundant .

    Extraneous Attributes, 无关属性

    Consider aba \rightarrow b in FF.

    • Attribute AA is extraneous in aa, if AaA \in a and FF logically implies F=(F{ab}){(aA)b}F' = (F – \left\{a \rightarrow b\right\}) \cup \left\{(a – A) \rightarrow b\right\}.
    • Attribute BB is extraneous in bb, if BbB \in b and FF logically implies F=(F{ab}){a(bB)}F' = (F – \left\{a \rightarrow b\right\}) \cup \left\{a \rightarrow (b - B)\right\}.

Decomposition

Requirements

  • All attributes of an original schema RR must appear in the decomposition R1,R2R_1,R_2 i.e

    R=R1R2R = R_1 \cup R_2

  • Lossless-join decomposition (无损连接分解), i.e for all possible relations rr on schema RR

    r=ΠR1(r)ΠR2(r)r = \Pi_{R_1}(r) \Join \Pi_{R_2}(r)

    A decomposition of RR into R1R_1 and R2R_2 is lossless-join if and only if at least one of the following dependencies are held in F+F^+:

    • {R1R2}R1\{R_1 \cap R_2\} \rightarrow R_1
    • {R1R2}R2\{R_1 \cap R_2\} \rightarrow R_2
  • dependency preservation (依赖保持): But may not be satisfied.

Boyce-Codd Normal Form, BCNF

A relation schema RR is in BCNF, with respect to a set FF of functional dependencies, if for all functional dependencies in F+F^+ of the form aba \rightarrow b , where aRa \sube R and bRb \sube R , at least one of the following holds:

  • aba \rightarrow b is trival
  • Ra+R \sube a^+

Any relation schema with two attributes is in BCNF.

Testing

Testing for BCNF is to check if a non-trivial dependency aba \rightarrow b causes a violation of BCNF.

  • Compute a+a^+
  • Verify that if Ra+R \sube a^+

Simplified Test

If none of the dependencies in FF causes a violation of BCNF, then none of the dependencies in F+F^+ will cause a violation of BCNF either.

However, using only FF to test for BCNF may be incorrect when testing a relation RiR_i in a decomposition of RR.

Computing

1
2
3
4
5
6
7
8
9
10
11
12
result := {R}; 
done := false;
compute F+; //shown before
while (not done) do
if (there is a schema Ri in result that is not in BCNF)
then begin
let a -> b be a nontrivial functional
dependency that holds on Ri such
that a -> Ri is not in F+, and a AND b = empty;
result := (result – Ri) UNION (a, b) UNION (Ri – b);
end
else done := true;

Finally, every sub-schema is in BCNF, and the decomposition is lossless-join.

Third Normal Form, 3NF

A relation schema RR is in third normal form (3NF) if for all aba \rightarrow b in F+F^+, at least one of the following conditions holds:

  • aba \rightarrow b is trivial
  • Ra+R \sube a^+.
  • Each attribute AA in bab - a is contained in a candidate key for RR. (即AbaA \in b - a是主属性, 若ab=ϕa\cap b = \phi, 则A = b是主属性).
  • each attribute may be in a different candidate key.
  • 国内其他教材关于3NF的定义: 不存在非主属性对码的部分依赖和传递依赖。即, 当bb为非主属性时, aa必须是码; 但当bb为主属性时, 则aa无限制。
  • There is always a lossless-join, dependency-preserving decomposition into 3NF.

Testing

We just need to check only FDs in FF, need not check all FDs in F+F^+.

  • Use attribute closure to check for each dependency aba \rightarrow b, to see if aa is a superkey.

  • If aa is not a superkey, we have to verify if each attribute in bb is contained in a candidate key of RR.

    • This test is rather more expensive, since it involve finding all candidate keys.
    • Testing for 3NF has been shown to be NP-hard.
    • Decomposition into 3NF can be done in polynomial time.

Computing

1
2
3
4
5
6
7
8
9
10
11
12
13
14
Let Fc be a canonical cover for F; 
i := 0;
for each functional dependency a -> b in Fc do
if none of the schemas Rj, 1 <= j <= i, contains ab
then begin
i := i + 1;
Ri := ( )
end
if none of the schemas Rj, 1 <= j <= i, contains a candidate key for R then
begin
i := i + 1;
Ri := any candidate key for R;
end
return (R1, R2, ..., Ri)

BCNF vs. 3NF

Multivalued Dependencies, MVD

Fourth Normal Form


DB C6 Relational Database Design
http://example.com/2023/04/17/DB-06/
Author
Tekhne Chen
Posted on
April 17, 2023
Licensed under