Closure – Armstrong rules Algorithms for 𝑭+ and 𝜶+ RA Operators
A1. Reflexivity 𝑖𝑓 𝛽 ⊆ 𝛼, 𝑡ℎ𝑒𝑛 𝛼 → 𝛽
𝐹 + := 𝐹 Select from r based on predicate p
A2. Augmentation 𝑖𝑓 𝛼 → 𝛽, 𝑡ℎ𝑒𝑛 𝛾𝛼 → 𝛾𝛽 𝝈𝒑 ሺ𝒓ሻ
A3. Transitivity 𝑖𝑓 𝛼 → 𝛽, 𝑎𝑛𝑑 𝛽 → 𝛾, 𝑡ℎ𝑒𝑛 𝛼 → 𝛾 Repeat
For each fd 𝑓 in 𝐹 +
𝚷𝑨𝒊 ,…,𝑨𝒏 ሺ𝒓ሻ Keep the specified columns
The following rules can be inferred from A1-A3 Apply A1 and A2 on 𝑓 and add resulting fd’s to 𝐹 +
A4. Union 𝑖𝑓 𝛼 → 𝛽, 𝑎𝑛𝑑 𝛼 → 𝛾, 𝑡ℎ𝑒𝑛 𝛼 → 𝛽𝛾 For each pair of fd’s 𝑓1 and 𝑓2 𝒓∪𝒔 Union of two tables
A5. Decomposition 𝑖𝑓 𝛼 → 𝛽𝛾, 𝑡ℎ𝑒𝑛 𝛼 → 𝛽 𝑎𝑛𝑑 𝛼 → 𝛾 If A3 can be applied, add resulting fd to 𝐹 +
A6. Pseudotransitivity 𝑖𝑓 𝛼 → 𝛽, 𝑎𝑛𝑑 𝛾𝛽 → 𝛿, 𝑡ℎ𝑒𝑛 𝛼𝛾 → 𝛿 𝒓−𝒔 Set difference of two tables
Until 𝐹 + does not change any further
Connect all tuples from one relation
BCNF – No redundancies 𝒓×𝒔
Result := 𝛼 to all tuples from the other relation
For each FD 𝜶 → 𝜷 ∈ 𝑭+, one of the following holds: While there are still changes to result
𝝆𝒙ሺ𝑨𝟏 ,…,𝑨𝒏ሻሺ𝒓ሻ Rename table(columns)
𝛼 → 𝛽 is trivial (𝛽 ⊆ 𝛼ሻ For each 𝛽 → 𝛾
If 𝛽 ⊆ result, then result := result ∪ 𝛾
𝛼 is a superkey for the whole schema 𝑅 ← Assignment
If we have only one table: Canonical cover 𝒓∩𝒔 The intersection of two tables
Check whether the FDs in 𝐹 are a violation of BCNF Attributes are extraneous for 𝑭 = {… , 𝜶 → 𝜷, … } if:
If 𝑭′ = {… , 𝜶 − {𝑨} → 𝜷, … } ⟺ 𝑭. Natural join. Connect the tables on
If we have a decomposition: 𝒓⋈𝒔 the common columns
Check if 𝛽 is in ሺ𝛼 − {𝐴}ሻ+
𝐹
For every set of attributes 𝛼 ⊆ 𝑅𝑖 , check that 𝛼 + either
includes no attribute of 𝑅𝑖 − 𝛼, or all attributes of 𝑅𝑖 . If 𝑭′ = {… , 𝜶 → 𝜷 − {𝑩}, … } ⟺ 𝑭 All tuples in r associated with all
𝒓÷𝒔 tuples in s
Check if ሺ𝛼 → 𝐵ሻ is in 𝐹 ′+: 𝛽 ∈ ሺ𝛼ሻ+
𝐹′
If it only has 2 attributes, it is always in BCNF.
Each left side of the FD in 𝑭𝒄 should be unique:
Decompose into (𝛼 → 𝛽, and 𝛼 → 𝛾 ⇒ 𝛼 → 𝛽𝛾) Ordering the results
𝑅1 = 𝛼 ∪ 𝛽 (a table with just 𝛼 and 𝛽)
𝑅2 = 𝑅 − ሺ𝛽 − 𝛼ሻ (still has 𝛼but no longer has 𝛽) SELECT operators
SELECT … FROM … Order by A in descending
WHERE … order (ASC is default)
Result := {R} SELECT A ሺΠሻ select the specified columns ORDER BY A DESC
Done := False
DISTINCT / ALL Are duplicates allowed. ALL is default. Order by A in ascending
Compute 𝐹 +
… ORDER BY A
order, output the first n
While (not done) do: * Asterisk Select all attributes LIMIT n results.
If (there is a schema 𝑅𝑖 in result that is not in BCNF)
Arithmetic Operations with + - * /, these can be
Let 𝛼 → 𝛽 be a nontrivial FD that holds on 𝑅𝑖 such
expressions used on constants and attributes
that 𝛼 → 𝑅𝑖 is not in 𝐹 + and 𝛼 ∩ 𝛽 = ∅. Null values
Result := ሺ𝑟𝑒𝑠𝑢𝑙𝑡 – {𝑅𝑖 }ሻ ∪ ሺ{𝑅𝑖 − 𝛽}ሻ ∪ {𝛼 ∪ 𝛽} AS Rename attributes
Else done := True
SELECT B FROM … Select all B that have null
FROM operators WHERE A is NULL values in A
3NF – Preserve dependencies
SQL generally just ignores them
For each FD 𝜶 → 𝜷 ∈ 𝑭+, one of the following holds: FROM 𝑹𝟏 , … , 𝑹𝒏 (×ሻ use the specified relations
In BCNF Views
AS Rename the tables
Each attribute B in 𝛽 − 𝛼 is contained in a candidate
key for R
WHERE operators CREATE VIEW V AS The query is stored in the
database and can be used
Check for each non-trivial 𝛼 → 𝛽 if 𝛼 is a superkey. If not, <query> in FROM
verify if each attribute in 𝛽 ∈ a candidate key of 𝑅.
WHERE P (𝜎) where P is True
V1 depends directly on V2 if V2 is used in defining V1
A BETWEEN x AND y where A ≥ x and A≤y
Let 𝐹𝑐 be a canonical cover for 𝐹 V is recursive if it depends on itself
𝑖 := 0 Find patterns using:
For each FD 𝛼 → 𝛽 in 𝐹𝑐 A LIKE “%Main%” % - matches any substring View expansion is basically computing the closure.
If none of the schemas 𝑅𝑗 , 1 ≤ 𝑗 ≤ 𝑖 contains 𝛼 𝛽 _ - matches any character Find any 𝑉𝑖 in query and replace it by the expression that
𝑖 := 𝑖+1 defines 𝑉𝑖 until no more views are present in the query.
𝑅𝑗 :=𝛼 𝛽 Subqueries are nested select-from-where expressions
A IN () Whether values are in the subquery Derived relations
If none of the schemas 𝑅𝑗 1 ≤ 𝑗 ≤ 𝑖 contains a candidate
key for 𝑅
EXISTS () Is the result of the subquery empty
𝑖:= 𝑖+1 WITH max_A() AS ( Define a temporary
𝑅𝑖 := any candidate key for 𝑅 Checks whether the subquery has Select Max(A) from R) table, that is only
UNIQUE () available to the query
duplicates (True if empty)
If some 𝑅𝑗 is a subset for some other 𝑅𝑘 , then remove it SELECT A FROM R, max_A in question.
Return ሺ𝑅1 , … , 𝑅𝑖 ሻ
Set operators Common table expressions: subquery in FROM clause
Check for lossless-ness and dependency preservation Triggers
() UNION () (∪) the union of two sets
The decomposition is lossless if:
() INTERSECT () (∩) the intersection Statements that are automatically executed as a side effect
∀𝑟 𝑟 = Π𝑅1 ሺ𝑟ሻ ⋈ Π𝑅2 ሺ𝑟ሻ of a modification to the database.
() EXCEPT () (−) the set difference Specifies when: upon which events, under which
At least one of the following FDs is in 𝑭+ :
conditions
𝑅1 ∩ 𝑅2 → 𝑅1 or 𝑅1 ∩ 𝑅2 → 𝑅2
The default is DISTINCT, they automatically remove duplicates Specifies what: the actions to take
It is dependency preserving if:
Aggregate functions
ሺ𝑭𝟏 ∪ … ∪ 𝑭𝒏 ሻ+ = 𝑭+ Create trigger name
{before or after} {insertion, deletion, or update}
SELECT AVG (A) SELECT SUM (A) on relation
Result = 𝛼
While (changes to result) do SELECT MIN (A) SELECT COUNT ( DISTINCT A) Referencing new row as new
For each 𝑅𝑖 in the decomposition COUNT(*) counts all tuples Referencing old row as old
T=ሺresult ∩ 𝑅𝑖 ሻ+ ∩ 𝑅𝑖 SELECT MAX (A) For each {row or statement} when condition
Result = result ∪ 𝑡 begin
If result contains all attributes in 𝛽, then 𝛼 → 𝛽 is preserved SELECT A, AGG(B) A must be in SELECT clause. …
end
Apply this test on 𝐹, not on 𝐹 +
FROM … Get aggregate values for B per
GROUP BY A value in A.
… GROUP BY A Alternative to WHERE when
HAVING P on agg(B) using grouped data.