DB C2 Relational Model

A relational database consists of a collection of tables, each of which is assigned a unique name.

Relational Model

The relational model is very simple and elegant.

A relational database: a collection of one or more relations based on the relational model.

Major advantages:

straightforward data representation

the ease with which complex queries can be expressed.

SQL is the most widely used language for creating, manipulating, and querying relational database.

c.f relationship and relation

A relation is a table with rows and columns; the mathematical concept, referred to as a table.

A relationship is an association among several entities.

Entity set and relationship set Relation

real world table, tuple, row

machine world

Entity set and relationship set	Relation
real world	table, tuple, row
	machine world

Structure

Basic Structure

Formally given sets $D_1, D_2, …, D_n (D_i = a_{ij} | j=1…k)$ , a $relation$ $ r$ is a subset of $D_1 \times D_2 \times … \times D_n$ / Cartesian product $D_i$ .

A relation is a set of n-tuples ( $a_{1j}, a_{2j}, …, a_{nj}$ ), where each $a_{ij} \in D_i , i = 1,2,\cdots,n$ .

Attribute Types

Each attribute of a relation has a name.

domain of the attribute: The set of allowed values for each attribute.

Attribute values are (normally) required to be atomic i.e indivisible.

e.g multivalued/composite attribute values are not atomic.

The special value null

null is a member of every domain.
null value causes complications in the definition of many operations.

Relation

relation schema

Describes the structure of the relation. Assume $A_1,A_2,\cdots,A_n$ are attributes, formally express a relation schema

R = (A_1,A_2,\cdots,A_n)

$r(R)$ is a relation on the relation schema R.

relation instance

Corresponds to the snapshot of the data in the relation at a given instant in time. The current values / relation instance of a relation are specified by a table.

An element t of r is a tuple, represented by a row in a table. then t[name] denotes the value of t on the name attribute.

variable	relation
variable type	relation schema
variable value	relation instance

Properties

The order of tuples is irrelevant.
No duplicated tuples in a relation.
Attribute values are atomic.

Key

Key is a subset of a relation, $K \sube R$ .

superkey (超码)

superkey: a set of one or more attributes that the values for K are sufficient to identify a unique tuple of each possible relation r(R) .

Superkey is not distinct.

e.g both {ID} and {ID, name} are superkeys.

candidate key (候选码)

candidate key: The minimal superkey.

primary key (主码)

primary key: a candidate key chosen by the database designer as the principal means of identifying tuples within a relation.

Primary key is usually marked by underline.
A key is a property of the entire relation, rather than
of the individual tuples.
Any two individual tuples in the relation are prohibited from having the same value on the key attributes at the same time. The designation of a key represents a constraint in the real-world enterprise being modeled. Thus, primary keys are also referred to as primary key constraints.

Foreign Key (外码)

A foreign-key constraint from attribute(s) $A$ of relation $r_1$ to the primary key $B$ of relation $r_2$ states that on any database instance, the value of $A$ for each tuple in $r_1$ must also be the value of $B$ for some tuple in $r_2$ .

Attribute set $A$ is called a foreign key from $r_1$ , the referencing relation of the foreign-key constraint, referencing $r_2$ , the referenced relation.

Primary key and foreign key are integrated constraints.

Schema Diagrams

A database schema, along with primary key and foreign-key constraints, can be depicted by schema diagrams.

Relational Query Languages

query language: a language in which a user requests information from the database. Usually on a level higher than that of a standard programming language.

Query languages can be categorized as imperative, functional, or declarative.

imperative query language

The user instructs the system to perform a specific sequence of operations on the database to compute the desired result.
Usually have a notion of state variables.

functional query language

The computation is expressed as the evaluation of functions that may operate on data in the database or on the results of other functions.
Functions are side-effect free.
Functions do not update the program state.

declarative query language

User describes the desired information without giving a specific sequence of steps or function calls .
The desired information is typically described using some form of mathematical logic.

It is the database system’s job to figure out how to obtain the desired information.

Pure languages

Relational Algebra — the basis of SQL
Tuple Relational Calculus (元组关系演算)
Domain Relational Calculus — (域关系演算) QBE

Query languages used in practice, such as the SQL query language, include elements of the imperative, functional, and declarative approaches.

Fundamental Relational-Algebra Operations

The operators take one or two relations as inputs, and return a new relation as a result.

Select

\sigma_\rho(r) = \{t | t \in r \wedge \rho(t)\}

selection predicate $\rho$ is a formula in propositional calculus consisting of termslike

<attribute> op <attribute> or <constant>

connected by $\and,\or,\not$ , where op is relational operator in $=,<,>,\leq,\geq,\neq$ .

The selection conditions need to aim at the attribute values of the same tuple, when we conduct section operation.

Project

\Pi_{A_1,A_2,\cdots,A_k}(r)

$A_i$ is attribute name and $r$ is a relation name. The result is defined as the relation of k columns obtained by erasing the columns that are not listed.

NOTEDuplicate rows removed from result, since relations are sets.

Union

r\cup s = \{t|t \in r \vee t \in s\}

NOTEMust be taken between compatible relations:

r and s must have the same arity .
The attribute domains must be compatible.

Set difference

r- s = \{t|t \in r \wedge t \notin s\}

NOTEMust be taken between compatible relations.

Cartesian product

r\times s = \{\left\{tq\right\}|t \in r\wedge q \in s\}

NOTEWe always assume that attributes of $r(R)$ and $s(S)$ are disjoint, i.e. $R\cap S = \phi$ .

Under not disjoint circumstances, then renaming for attributes must be used.

Rename

\rho_X(E)

returns $E$ under the name $X$ . And

\rho_{X(A_1,A_2,\cdots,A_n)}(E)

returns $E$ under the name $X$ , with all attributes renamed $A_1,A_2,\cdots,A_n$ . Renaming allows

to name, and to refer to the results of relational-algebra expressions procedurally.
to refer to a relation by more than one name.

Banking Example

Relations

relation	schema
branch	(branchName, branchCity, assets)
customer	(customerName, customer-street, customerCity)
account	(accountNumber, branchName, balance)
loan	(loanNumber, branchName, amount)
depositor	(customerName, accountNumber)
borrower	(customerName, loanNumber)

Find all loans of over $1200.
$\sigma_{amount > 1200}(loan)$
Find the loan number for each loan of an amount greater than $1200.
$\Pi_{loanNumber}(\sigma_{amount>1200}(loan))$
Find the names of all customers who have a loan, or an account, or both, from the bank.
$\Pi_{customerName}(borrower)\cup\Pi_{customerName}(depositor)$
Find the names of all customers who at least have a loan and an account at bank.
$\Pi_{customerName}(borrower)\cap\Pi_{customerName}(depositor)$
Find the names of all customers who have a loan at the Perryridge branch.
$\Pi_{customerName}(\sigma_{borrower.loanNumber = loan.loanNumber}(borrower\times (\sigma_{branchName = 'Perryridge'}(loan))))$
Find the names of all customers who have loans at the Perryridge branch but do not have an account at any branch of the bank.
$\Pi_{customerName}(\sigma_{borrower.loanNumber = loan.loanNumber}(borrower\times (\sigma_{branchName = 'Perryridge'}(loan)))) - \Pi_{customerName}(depositor)$
Find the largest account balance.(Self-comparison)
$\Pi_{balance}(account) – \Pi_{account.balance}(\sigma_{account.balance<d.balance}(account \times \rho_d(account)))$

Additional Relational-Algebra Operations

Additional Operations do not add any power to the relational algebra, but simplify common queries.

Set intersection

r \cap s = \{t | t \in r and t \in s\}

NOTE

Must be taken between compatible relations.
$r \cap s = r - (r - s)$

Natural join

r \Join s = = \Pi_{r.A_{n_1}, r.A_{n_2},\cdots,r.A_{n_k},s.A_{m_1}, \cdots,s.A_{m_j},}(\sigma_{r.A_{l_1} = s.A_{r_1}\and \cdots r.A_{l_p} = s.A_{r_p}} (r x s))

Let r and s be relations on schemas R and S respectively. Then
$r\Join s$ is a relation on schema $R\cup S$ obtained as follows:

Consider each pair of tuples $t_r$ from r and $t_s$ from s.
If $t_r$ and $t_s$ have the same value on each of the attributes in $R \cap S$ , add a tuple t to the result s.t. t has the same value as $t_r$ on r and $t_s$ on s.

Theta Join

r \Join_{\theta} s = \sigma_\theta(r\times s)

Division

r \div s

Let r and s be relations on schemas R and S respectively, where $R = (A_1,\cdots,A_m,B_1,\cdots,B_n)$ and $S = (B_1,\cdots,B_n)$ . Then $r \div s$ is a relation on the schema $R - S = (A_1,\cdots,A_m)$ .

\begin{align} r \div s &= \left\{t|t \in \Pi_{R-S}(r) \wedge \forall u\in s,tu \in r \right\} \\ &= \Pi_{R-S}(r) – \Pi_{R-S}((\Pi_{R-S}(r)\times s) – \Pi_{R-S,S}(r)) \end{align}

Note

$\Pi_{R-S}(r)$ encloses the result of $r\div s$
The union of the tuple(s) t of $r \div s$ and all the tuples in s is covered by r ,i.e $q = r \div s$ is the largest relation satisfying $q\times s\sube r$ .

Assignment

The assignment operation $\leftarrow$ provides a convenient way to express complex queries.

Note Assignment must always be made to a temporary relation variable.

Extended Relational-Algebra Operations

Generalized Projection

Generalized Projection extends the projection operation by allowing arithmetic functions to be used in the projection list.

\Pi_{F_1,F_2,\cdots,F_n}(E)

$E$ : relational-algebra expression
$F_i$ : arithmetic expressions involving constants and attributes in the schema of $E$ .

Aggregate Functions

Aggregation function takes a collection of values and returns a single value as a result.

avg: average value
min/max: minimum/maximum value
sum: sum of values
count: number of values

_{G_1, G_2,\cdots, G_n}\large g_{F_1(A_1), F_2(A_2),\cdots , F_n(A_n)}(E)

$E$ : relational-algebra expression
$G_1, G_2,\cdots, G_n$ : a list of attributes on which to group (can be empty)
$F_i$ : an aggregate function to apply on $A_i$ grouped by $G_i$ .
$A_i$ : an attribute name.

Note Result of aggregation does not have a name.

rename operation;
permit renaming as part of aggregate operation For convenience.

Modification of the Database

ref P108-114

The content of the database may be modified using Deletion, Insertion and Updating. All these operations are expressed using the assignment operator.

Deletion

A delete request is expressed in the same way as a query, except the selected tuples are removed from the database instead of displaying tuples.

Only whole tuples rather values on only particular attributes can be deleted.
by relational-algebra,

r \leftarrow r - E

by SQL, $\begin{align*} &\rm delete\ from\ \it r\\&\rm where\ \it P; \end{align*}$

Insertion

To insert data into a relation, we either Specify a tuple to be inserted or Write a query whose result is a set of tuples to be inserted.

by relational-algebra,
$r \leftarrow r \cup E$
by SQL,
$\begin{align*} \rm insert\ &\rm into\ \it r \\ &\rm values ('CS-437', 'Database Systems', 'Comp. Sci.', 4); \end{align*}$

Updating

A update is to change a value in a tuple without changing all values in the tuple.

by relational-algebra, using the generalized projection operator,

$r \leftarrow \Pi_{F_1,\cdots,F_i}(r)$

$F_i$ :
1. $i^{th}$ attribute of r, if the $i^{th}$ attribute is not updated
2. an expression involving only constants and the attributes of r, to give the attributes new values, if the attribute is to be updated.
by SQL,
$\begin{align*} &\rm update\ \it r \\ &\rm set\ \it salary= salary * 1.05; \end{align*}$

Note

注意update的顺序，不当的顺序可能会造成重复更新以致错误。

Database system

#Database

DB C2 Relational Model

http://example.com/2023/03/06/DB-2/

Author

Tekhne Chen

Posted on

March 6, 2023

Licensed under

DS C1 Trees01 Previous

CO C3 Arithmetic Next