100% tevredenheidsgarantie Direct beschikbaar na betaling Zowel online als in PDF Je zit nergens aan vast
logo-home
Cheat sheet for Data Management for Data Analytics (JBI050) €3,49   In winkelwagen

Overig

Cheat sheet for Data Management for Data Analytics (JBI050)

 26 keer bekeken  4 keer verkocht

This is a cheat sheet for the course Data Management for Data Analytics provided at the TU/e. All information is summarized on two sides of an A4 paper, color coded based on the subject.

Laatste update van het document: 1 jaar geleden

Voorbeeld 1 van de 2  pagina's

  • 24 maart 2023
  • 18 april 2023
  • 2
  • 2022/2023
  • Overig
  • Onbekend
Alle documenten voor dit vak (2)
avatar-seller
NienkeUr
Closure – Armstrong rules Algorithms for 𝑭+ and 𝜶+ RA Operators

A1. Reflexivity 𝑖𝑓 𝛽 ⊆ 𝛼, 𝑡ℎ𝑒𝑛 𝛼 → 𝛽
𝐹 + := 𝐹 Select from r based on predicate p
A2. Augmentation 𝑖𝑓 𝛼 → 𝛽, 𝑡ℎ𝑒𝑛 𝛾𝛼 → 𝛾𝛽 𝝈𝒑 ሺ𝒓ሻ
A3. Transitivity 𝑖𝑓 𝛼 → 𝛽, 𝑎𝑛𝑑 𝛽 → 𝛾, 𝑡ℎ𝑒𝑛 𝛼 → 𝛾 Repeat
For each fd 𝑓 in 𝐹 +
𝚷𝑨𝒊 ,…,𝑨𝒏 ሺ𝒓ሻ Keep the specified columns
The following rules can be inferred from A1-A3 Apply A1 and A2 on 𝑓 and add resulting fd’s to 𝐹 +
A4. Union 𝑖𝑓 𝛼 → 𝛽, 𝑎𝑛𝑑 𝛼 → 𝛾, 𝑡ℎ𝑒𝑛 𝛼 → 𝛽𝛾 For each pair of fd’s 𝑓1 and 𝑓2 𝒓∪𝒔 Union of two tables
A5. Decomposition 𝑖𝑓 𝛼 → 𝛽𝛾, 𝑡ℎ𝑒𝑛 𝛼 → 𝛽 𝑎𝑛𝑑 𝛼 → 𝛾 If A3 can be applied, add resulting fd to 𝐹 +
A6. Pseudotransitivity 𝑖𝑓 𝛼 → 𝛽, 𝑎𝑛𝑑 𝛾𝛽 → 𝛿, 𝑡ℎ𝑒𝑛 𝛼𝛾 → 𝛿 𝒓−𝒔 Set difference of two tables
Until 𝐹 + does not change any further
Connect all tuples from one relation
BCNF – No redundancies 𝒓×𝒔
Result := 𝛼 to all tuples from the other relation
For each FD 𝜶 → 𝜷 ∈ 𝑭+, one of the following holds: While there are still changes to result
𝝆𝒙ሺ𝑨𝟏 ,…,𝑨𝒏ሻሺ𝒓ሻ Rename table(columns)
 𝛼 → 𝛽 is trivial (𝛽 ⊆ 𝛼ሻ For each 𝛽 → 𝛾
If 𝛽 ⊆ result, then result := result ∪ 𝛾
 𝛼 is a superkey for the whole schema 𝑅 ← Assignment

If we have only one table: Canonical cover 𝒓∩𝒔 The intersection of two tables
Check whether the FDs in 𝐹 are a violation of BCNF Attributes are extraneous for 𝑭 = {… , 𝜶 → 𝜷, … } if:
 If 𝑭′ = {… , 𝜶 − {𝑨} → 𝜷, … } ⟺ 𝑭. Natural join. Connect the tables on
If we have a decomposition: 𝒓⋈𝒔 the common columns
Check if 𝛽 is in ሺ𝛼 − {𝐴}ሻ+
𝐹
For every set of attributes 𝛼 ⊆ 𝑅𝑖 , check that 𝛼 + either
includes no attribute of 𝑅𝑖 − 𝛼, or all attributes of 𝑅𝑖 .  If 𝑭′ = {… , 𝜶 → 𝜷 − {𝑩}, … } ⟺ 𝑭 All tuples in r associated with all
𝒓÷𝒔 tuples in s
Check if ሺ𝛼 → 𝐵ሻ is in 𝐹 ′+: 𝛽 ∈ ሺ𝛼ሻ+
𝐹′
If it only has 2 attributes, it is always in BCNF.
Each left side of the FD in 𝑭𝒄 should be unique:
Decompose into (𝛼 → 𝛽, and 𝛼 → 𝛾 ⇒ 𝛼 → 𝛽𝛾) Ordering the results
 𝑅1 = 𝛼 ∪ 𝛽 (a table with just 𝛼 and 𝛽)
 𝑅2 = 𝑅 − ሺ𝛽 − 𝛼ሻ (still has 𝛼but no longer has 𝛽) SELECT operators
SELECT … FROM … Order by A in descending
WHERE … order (ASC is default)
Result := {R} SELECT A ሺΠሻ select the specified columns ORDER BY A DESC
Done := False
DISTINCT / ALL Are duplicates allowed. ALL is default. Order by A in ascending
Compute 𝐹 +
… ORDER BY A
order, output the first n
While (not done) do: * Asterisk Select all attributes LIMIT n results.
If (there is a schema 𝑅𝑖 in result that is not in BCNF)
Arithmetic Operations with + - * /, these can be
Let 𝛼 → 𝛽 be a nontrivial FD that holds on 𝑅𝑖 such
expressions used on constants and attributes
that 𝛼 → 𝑅𝑖 is not in 𝐹 + and 𝛼 ∩ 𝛽 = ∅. Null values
Result := ሺ𝑟𝑒𝑠𝑢𝑙𝑡 – {𝑅𝑖 }ሻ ∪ ሺ{𝑅𝑖 − 𝛽}ሻ ∪ {𝛼 ∪ 𝛽} AS Rename attributes
Else done := True
SELECT B FROM … Select all B that have null
FROM operators WHERE A is NULL values in A
3NF – Preserve dependencies
SQL generally just ignores them
For each FD 𝜶 → 𝜷 ∈ 𝑭+, one of the following holds: FROM 𝑹𝟏 , … , 𝑹𝒏 (×ሻ use the specified relations
 In BCNF Views
AS Rename the tables
 Each attribute B in 𝛽 − 𝛼 is contained in a candidate
key for R
WHERE operators CREATE VIEW V AS The query is stored in the
database and can be used
Check for each non-trivial 𝛼 → 𝛽 if 𝛼 is a superkey. If not, <query> in FROM
verify if each attribute in 𝛽 ∈ a candidate key of 𝑅.
WHERE P (𝜎) where P is True
V1 depends directly on V2 if V2 is used in defining V1
A BETWEEN x AND y where A ≥ x and A≤y
Let 𝐹𝑐 be a canonical cover for 𝐹 V is recursive if it depends on itself
𝑖 := 0 Find patterns using:
For each FD 𝛼 → 𝛽 in 𝐹𝑐 A LIKE “%Main%” % - matches any substring View expansion is basically computing the closure.
If none of the schemas 𝑅𝑗 , 1 ≤ 𝑗 ≤ 𝑖 contains 𝛼 𝛽 _ - matches any character Find any 𝑉𝑖 in query and replace it by the expression that
𝑖 := 𝑖+1 defines 𝑉𝑖 until no more views are present in the query.
𝑅𝑗 :=𝛼 𝛽 Subqueries are nested select-from-where expressions

A IN () Whether values are in the subquery Derived relations
If none of the schemas 𝑅𝑗 1 ≤ 𝑗 ≤ 𝑖 contains a candidate
key for 𝑅
EXISTS () Is the result of the subquery empty
𝑖:= 𝑖+1 WITH max_A() AS ( Define a temporary
𝑅𝑖 := any candidate key for 𝑅 Checks whether the subquery has Select Max(A) from R) table, that is only
UNIQUE () available to the query
duplicates (True if empty)
If some 𝑅𝑗 is a subset for some other 𝑅𝑘 , then remove it SELECT A FROM R, max_A in question.
Return ሺ𝑅1 , … , 𝑅𝑖 ሻ
Set operators Common table expressions: subquery in FROM clause

Check for lossless-ness and dependency preservation Triggers
() UNION () (∪) the union of two sets
The decomposition is lossless if:
() INTERSECT () (∩) the intersection Statements that are automatically executed as a side effect
∀𝑟 𝑟 = Π𝑅1 ሺ𝑟ሻ ⋈ Π𝑅2 ሺ𝑟ሻ of a modification to the database.
() EXCEPT () (−) the set difference  Specifies when: upon which events, under which
At least one of the following FDs is in 𝑭+ :
conditions
𝑅1 ∩ 𝑅2 → 𝑅1 or 𝑅1 ∩ 𝑅2 → 𝑅2
The default is DISTINCT, they automatically remove duplicates  Specifies what: the actions to take

It is dependency preserving if:
Aggregate functions
ሺ𝑭𝟏 ∪ … ∪ 𝑭𝒏 ሻ+ = 𝑭+ Create trigger name
{before or after} {insertion, deletion, or update}
SELECT AVG (A) SELECT SUM (A) on relation
Result = 𝛼
While (changes to result) do SELECT MIN (A) SELECT COUNT ( DISTINCT A) Referencing new row as new
For each 𝑅𝑖 in the decomposition COUNT(*) counts all tuples Referencing old row as old
T=ሺresult ∩ 𝑅𝑖 ሻ+ ∩ 𝑅𝑖 SELECT MAX (A) For each {row or statement} when condition
Result = result ∪ 𝑡 begin
If result contains all attributes in 𝛽, then 𝛼 → 𝛽 is preserved SELECT A, AGG(B) A must be in SELECT clause. …
end
Apply this test on 𝐹, not on 𝐹 +
FROM … Get aggregate values for B per
GROUP BY A value in A.

… GROUP BY A Alternative to WHERE when
HAVING P on agg(B) using grouped data.

Voordelen van het kopen van samenvattingen bij Stuvia op een rij:

Verzekerd van kwaliteit door reviews

Verzekerd van kwaliteit door reviews

Stuvia-klanten hebben meer dan 700.000 samenvattingen beoordeeld. Zo weet je zeker dat je de beste documenten koopt!

Snel en makkelijk kopen

Snel en makkelijk kopen

Je betaalt supersnel en eenmalig met iDeal, creditcard of Stuvia-tegoed voor de samenvatting. Zonder lidmaatschap.

Focus op de essentie

Focus op de essentie

Samenvattingen worden geschreven voor en door anderen. Daarom zijn de samenvattingen altijd betrouwbaar en actueel. Zo kom je snel tot de kern!

Veelgestelde vragen

Wat krijg ik als ik dit document koop?

Je krijgt een PDF, die direct beschikbaar is na je aankoop. Het gekochte document is altijd, overal en oneindig toegankelijk via je profiel.

Tevredenheidsgarantie: hoe werkt dat?

Onze tevredenheidsgarantie zorgt ervoor dat je altijd een studiedocument vindt dat goed bij je past. Je vult een formulier in en onze klantenservice regelt de rest.

Van wie koop ik deze samenvatting?

Stuvia is een marktplaats, je koop dit document dus niet van ons, maar van verkoper NienkeUr. Stuvia faciliteert de betaling aan de verkoper.

Zit ik meteen vast aan een abonnement?

Nee, je koopt alleen deze samenvatting voor €3,49. Je zit daarna nergens aan vast.

Is Stuvia te vertrouwen?

4,6 sterren op Google & Trustpilot (+1000 reviews)

Afgelopen 30 dagen zijn er 67096 samenvattingen verkocht

Opgericht in 2010, al 14 jaar dé plek om samenvattingen te kopen

Start met verkopen
€3,49  4x  verkocht
  • (0)
  Kopen