DB C6 Relational Database Design
First Normal Form, 1NF
RECALL
Domain is atomic if its elements are considered to be indivisible units.
dealing with
non-atomic:- Composite attributes : a number of attributes
- Multi-value attribute : multi-fields / separate table / single field
- Complex data type : a number of attributes
Drawbacks of
non-atomicstrategy:Complicate storage
Encourage redundant storage of data
Complicated to query
A relational schema is in first normal form,1NF if the domains of all attributes of are atomic.
- For the
relational database, it’s required that all relations are in1NF. - Atomicity is actually a property of how the elements of the domain are used.
Pitfalls in Relational Database Design
Relational database design requires that we find a “good” collection of relation schemas.
A bad design may lead to:
- Redundant storage: Wastes space / Easy to result in inconsistency.
- insert / delete / update anomalies : complicated / many
nullwhich is difficult to handle. - inability to represent certain information.
Two design methods
- Top-down
- Button-up: Universal relation (泛关系) decomposition good database schema
Functional Dependencies
Let be a relation schema, and be attributes, i.e
The functional dependency holds on if and only if for any legal relations , whenever any two tuples and of agree on the attributes , they also agree on the attributes , i.e
We say is functionally dependent on , functionally determines .
-
Functional dependencyis a kind ofintegrity constraints, expressing the relationship of values on specific attributes. -
Functional dependencycan be used to judge schema normalization and to suggest refinements.
Functional dependency vs. key
-
A
functional dependencyis a generalization of the notion of akey. -
is a
superkeyfor the relation schema if and only if . -
is a
candidate keyfor if and only if . -
Functional dependenciesallow us to express constraints that cannot be expressed using keys.
Usage
-
Test
relationsto see if they are legal under a given set offunctional dependencies.If a relation is legal under a set of functional dependencies, we say that satisfies .
-
Specify constraints () on the set of legal relations — schema.
We say that holds on if all
legalrelations onsatisfythe set of functional dependencies .
Trivial and Non-Trivial
- $ a\rightarrow b$ is trvial if ,
- otherwise, is non-trivial
Closure
Closure of a Set of Functional Dependencies
Given a set of functional dependencies , the set of all functional dependencies logically implied by is the closure of , denoted by .
Computing
1 | |
The maximum number of possible Functional Dependencies (FDs) is , for attributes.
Armstrong’s Axioms
- reflexivity, 自反律: If ,then
- augmentation, 增补律: If ,then
- transitivity, 传递律: If ,then
- The
reflexivityistrivial. - Based on
augmentation: If ,then ; Byreflexivity, ; By transitivity, . - These rules are
- Sound, 保真的: generate only
functional dependenciesthat actually hold. - Complete, 完备的: generate all
functional dependenciesthat hold.
- Sound, 保真的: generate only
Additional Armstrong’s Axioms
- union,合并律: If , then
- decomposition, 分解律: If , then .
- pseudo transitivity, 伪传递律: If , then .
Closure of Attribute Sets
Given a set of attributes , the closure of under , denoted by , is the set of attributes that are functionally determined by under .
- is
superkey$ \Leftrightarrow a \rightarrow R \Leftrightarrow R \sube a^+$ - is a
candidate keyfor .
Computing :
1 | |
Uses of Attribute Set Closure
Testing for a
superkey:Testing
functional dependencies:a simple and cheap and very useful test.
Computing the
closureof ,- For each , we find .
- For each , we output a
functional dependency. - All form .
Canonical Cover
DBMS should always check to ensure not violate any
Functional Dependency (FD)in . may be simplified!
Intuitively, a canonical cover of , denoted by , is a minimal set of FDs equivalent to :
- Having no redundant
FDsand no redundant parts ofFDs; - Each left side is unique.
Computing
-
Sets of
Functional Dependency (FD)may have redundant dependencies that can be inferred from the others. -
Parts of a
functional dependencyon left side may be redundant. -
Parts of a
functional dependencyon right side may be redundant .Extraneous Attributes, 无关属性
Consider in .
- Attribute is extraneous in , if and logically implies .
- Attribute is extraneous in , if and logically implies .
Decomposition
Requirements
-
All attributes of an original schema must appear in the decomposition
i.e -
Lossless-join decomposition (无损连接分解),
i.efor all possible relations on schemaA decomposition of into and is
lossless-joinif and only if at least one of the following dependencies are held in : -
dependency preservation (依赖保持): But may not be satisfied.
Boyce-Codd Normal Form, BCNF
A relation schema is in BCNF, with respect to a set of functional dependencies, if for all functional dependencies in of the form , where and , at least one of the following holds:
- is
trival
Any relation schema with two attributes is in BCNF.
Testing
Testing for BCNF is to check if a non-trivial dependency causes a violation of BCNF.
- Compute
- Verify that if
Simplified Test
If none of the dependencies in causes a violation of BCNF, then none of the dependencies in will cause a violation of BCNF either.
However, using only to test for BCNF may be incorrect when testing a relation in a decomposition of .
Computing
1 | |
Finally, every sub-schema is in BCNF, and the decomposition is lossless-join.
Third Normal Form, 3NF
A relation schema is in third normal form (3NF) if for all in , at least one of the following conditions holds:
- is
trivial - .
- Each attribute in is contained in a
candidate keyfor . (即是主属性, 若, 则A = b是主属性).
- each attribute may be in a different candidate key.
- 国内其他教材关于3NF的定义: 不存在非主属性对码的部分依赖和传递依赖。即, 当为非主属性时, 必须是码; 但当为主属性时, 则无限制。
- There is always a
lossless-join,dependency-preservingdecomposition into 3NF.
Testing
We just need to check only FDs in , need not check all FDs in .
-
Use
attribute closureto check for each dependency , to see if is asuperkey. -
If is not a
superkey, we have to verify if each attribute in is contained in acandidate keyof .- This test is rather more expensive, since it involve finding all candidate keys.
- Testing for 3NF has been shown to be
NP-hard. - Decomposition into
3NFcan be done inpolynomial time.
Computing
1 | |