DB C6 Relational Database Design
First Normal Form, 1NF
RECALL
Domain is atomic
if its elements are considered to be indivisible units.
dealing with
non-atomic
:- Composite attributes : a number of attributes
- Multi-value attribute : multi-fields / separate table / single field
- Complex data type : a number of attributes
Drawbacks of
non-atomic
strategy:Complicate storage
Encourage redundant storage of data
Complicated to query
A relational schema is in first normal form,1NF if the domains of all attributes of are atomic
.
- For the
relational database
, it’s required that all relations are in1NF
. - Atomicity is actually a property of how the elements of the domain are used.
Pitfalls in Relational Database Design
Relational database design requires that we find a “good” collection of relation schemas.
A bad design may lead to:
- Redundant storage: Wastes space / Easy to result in inconsistency.
- insert / delete / update anomalies : complicated / many
null
which is difficult to handle. - inability to represent certain information.
Two design methods
- Top-down
- Button-up: Universal relation (泛关系) decomposition good database schema
Functional Dependencies
Let be a relation schema, and be attributes, i.e
The functional dependency holds on if and only if for any legal relations , whenever any two tuples and of agree on the attributes , they also agree on the attributes , i.e
We say is functionally dependent on , functionally determines .
-
Functional dependency
is a kind ofintegrity constraints
, expressing the relationship of values on specific attributes. -
Functional dependency
can be used to judge schema normalization and to suggest refinements.
Functional dependency vs. key
-
A
functional dependency
is a generalization of the notion of akey
. -
is a
superkey
for the relation schema if and only if . -
is a
candidate key
for if and only if . -
Functional dependencies
allow us to express constraints that cannot be expressed using keys.
Usage
-
Test
relations
to see if they are legal under a given set offunctional dependencies
.If a relation is legal under a set of functional dependencies, we say that satisfies .
-
Specify constraints () on the set of legal relations — schema.
We say that holds on if all
legal
relations onsatisfy
the set of functional dependencies .
Trivial and Non-Trivial
- $ a\rightarrow b$ is trvial if ,
- otherwise, is non-trivial
Closure
Closure of a Set of Functional Dependencies
Given a set of functional dependencies , the set of all functional dependencies logically implied by is the closure of , denoted by .
Computing
1 |
|
The maximum number of possible Functional Dependencies
(FDs
) is , for attributes.
Armstrong’s Axioms
- reflexivity, 自反律: If ,then
- augmentation, 增补律: If ,then
- transitivity, 传递律: If ,then
- The
reflexivity
istrivial
. - Based on
augmentation
: If ,then ; Byreflexivity
, ; By transitivity, . - These rules are
- Sound, 保真的: generate only
functional dependencies
that actually hold. - Complete, 完备的: generate all
functional dependencies
that hold.
- Sound, 保真的: generate only
Additional Armstrong’s Axioms
- union,合并律: If , then
- decomposition, 分解律: If , then .
- pseudo transitivity, 伪传递律: If , then .
Closure of Attribute Sets
Given a set of attributes , the closure of under , denoted by , is the set of attributes that are functionally determined by under .
- is
superkey
$ \Leftrightarrow a \rightarrow R \Leftrightarrow R \sube a^+$ - is a
candidate key
for .
Computing :
1 |
|
Uses of Attribute Set Closure
Testing for a
superkey
:Testing
functional dependencies
:a simple and cheap and very useful test.
Computing the
closure
of ,- For each , we find .
- For each , we output a
functional dependency
. - All form .
Canonical Cover
DBMS should always check to ensure not violate any
Functional Dependency (FD)
in . may be simplified!
Intuitively, a canonical cover of , denoted by , is a minimal set of FDs
equivalent to :
- Having no redundant
FDs
and no redundant parts ofFDs
; - Each left side is unique.
Computing
-
Sets of
Functional Dependency (FD)
may have redundant dependencies that can be inferred from the others. -
Parts of a
functional dependency
on left side may be redundant. -
Parts of a
functional dependency
on right side may be redundant .Extraneous Attributes, 无关属性
Consider in .
- Attribute is extraneous in , if and logically implies .
- Attribute is extraneous in , if and logically implies .
Decomposition
Requirements
-
All attributes of an original schema must appear in the decomposition
i.e
-
Lossless-join decomposition (无损连接分解),
i.e
for all possible relations on schemaA decomposition of into and is
lossless-join
if and only if at least one of the following dependencies are held in : -
dependency preservation (依赖保持): But may not be satisfied.
Boyce-Codd Normal Form, BCNF
A relation schema is in BCNF, with respect to a set of functional dependencies
, if for all functional dependencies in of the form , where and , at least one of the following holds:
- is
trival
Any relation schema with two attributes is in BCNF.
Testing
Testing for BCNF
is to check if a non-trivial
dependency causes a violation of BCNF
.
- Compute
- Verify that if
Simplified Test
If none of the dependencies in causes a violation of BCNF, then none of the dependencies in will cause a violation of BCNF either.
However, using only to test for BCNF
may be incorrect when testing a relation in a decomposition of .
Computing
1 |
|
Finally, every sub-schema is in BCNF, and the decomposition is lossless-join.
Third Normal Form, 3NF
A relation schema is in third normal form (3NF) if for all in , at least one of the following conditions holds:
- is
trivial
- .
- Each attribute in is contained in a
candidate key
for . (即是主属性, 若, 则A = b是主属性).
- each attribute may be in a different candidate key.
- 国内其他教材关于3NF的定义: 不存在非主属性对码的部分依赖和传递依赖。即, 当为非主属性时, 必须是码; 但当为主属性时, 则无限制。
- There is always a
lossless-join
,dependency-preserving
decomposition into 3NF.
Testing
We just need to check only FDs
in , need not check all FDs
in .
-
Use
attribute closure
to check for each dependency , to see if is asuperkey
. -
If is not a
superkey
, we have to verify if each attribute in is contained in acandidate key
of .- This test is rather more expensive, since it involve finding all candidate keys.
- Testing for 3NF has been shown to be
NP-hard
. - Decomposition into
3NF
can be done inpolynomial time
.
Computing
1 |
|