DATA MANAGEMENT FOR DATA ANALYTICS
Book Database system concepts, 6th edition – INTRODUCTION
Silberschatz, Korth and Sudarshan 2011
Lectures Pre-recorded Database systems are used to replace database applications
Instructions AUD6 during the lecture timeslots that were built directly on top of file systems, as this had a
The tutorial focusses on the previous subject number of issues.
Data redundancy and inconsistency
40% Homework assignments weeks 1-6 Usually you would work with multiple sources and file
formats, and there can be duplication of information
60% On campus exam
across files
Difficulty in accessing data
We are allowed a 2-sided A4 cheat sheet during the exam. You need a new program to carry out each task
Lack of data isolation
e.g. if you have an issue with your parsers, you can no
SUBJECTS + READINGS + ASSIGNMENTS longer access your data
Integrity problems
Integrity constraints become buried in code as
opposed to being stated and managed explicitly. It is
Basic E-R model CH1 + CH7 difficult to add new constraints or change existing
Basic RA CH2 + CH6.1 ones
Week 1 Atomicity of updates
Failures may leave databases partially updated
HW1 Sunday night
Concurrent access by multiple users
Uncontrolled concurrency can lead to inconsistencies
Advanced E-R model CH7 Security problems
Advanced RA CH6.1 It’s difficult to provide access to just a part of the data
Week 2 Database managements provides a solution for these issues
HW2 Sunday night
Two classes of languages:
Advanced E-R model CH7 Procedural User specifies what data is required and
how to get it. This is algebraic
Basic SQL CH3
Week 3 Declarative User specifies what data is required, not
how to get it. Based on calculus and logic
HW3 Sunday night
Data manipulation Entering, updating, deleting and
Constraints: FDs CH8 + additional material retrieving information from a database.
Advanced SQL CH3 + parts of CH4 & 5 Querying logical schemas
Week 4 Database design Given requirements for an information
HW4 Sunday night system, how to design the database?
How do we transform the design into an
actual implementation?
Constraints: FDs CH8
Advanced SQL CH3 + parts of CH4 & 5
Object system The “real world” that you try to model
Week 5
Information system The representation of the real world.
HW5 Sunday night This is an approximation.
Decomposition CH8 + appendix B Data modelling Which information? The structure,
based on FDs relationships, constraints, semantics
Week 6 Datalog Parts of appendix C Process modelling How is the information used/
manipulated/shared/created?
HW6 Sunday night
Data independence Insulation from changes in the way
the data is structured and stored
The questions in the homework assignments are
representative of possible exam questions. Schema The logical structure of the database
Physical data independence is the ability to
modify the physical schema without changing
the logical schema.
Instance The actual content of the database at a particular
point in time
, ENTITY-RELATIONSHIP MODELS ADVANCED E-R MODELS
BASIC E-R MODELS CARDINALITY CONSTRAINTS
E-R models They model relational databases and represent Cardinality constraints express the number of entities to
entities and their relationships. which another entity can be associated via a relationship set:
This is a conceptual schema. One-to-one: e.g. each person only has one birth
Sets Collections of data that do not have duplicates certificate, and each birth certificate can only be
associated with one person. Each entity is mapped to
E-R models connect entities and relationships with lines. It is at most one entity in the other entity set.
not possible to directly connect two entities with a line, they
can only be connected through relationships. One-to-many: e.g. you can have many passports, but
each passport only relates to one person
ENTITIES Many-to-one: e.g. passports and people switch sides
Many-to-many: e.g. friends to friends relationships
Entity A “thing” or “object” in the enterprise that is
distinguishable from other objects. → signifies “one”, − signifies “many”
Entity set A set of entities of the same type that share the
same properties (e.g. a set of persons)
Entities have attributes: the data that you want to store.
These are descriptive properties that all members of an Each passport is associated with at most one person.
entity set have. Each attribute has a domain, which is the set Each person can be associated with multiple passports.
of permitted values per attribute.
An example of an entity in a E-R diagram, with different
types of attributes: PARTICIPATION CONSTRAINTS
Entity set name Participation constraints provide information about whether
all entities in the set participate in at least one relationship.
Primary key → Primary key
Total participation: Every entity in the set participates
Address → Composite attribute (vs. simple)
in at least one relationship in the relationship set.
Street → Component attribute
This is indicated by a double line (=).
Street name → Component attribute
e.g. each student is related to at least one instructor
Home number → Component attribute through the advisor relationship
City → Component attribute
Partial participation: Some entities may not participate
{ phone_number } → Multivalued attribute (a set of values)
in any relationship in the relationship set (default).
date_of_birth → Simple attribute
e.g. instructors don’t need to advise any students,
Age () → Derived attribute (from date_of_birth) there are instructors that don’t advise any students.
In ternary relationships (between 3 entity sets), at most one
RELATIONSHIPS arrow may be used to indicate a cardinality constraint.
Any non-binary relationship can be represented using
Relationship An association between several entities binary relationships, by replacing the relationship set with a
Relationship set A relation among 𝑛 ≥ 2 entity sets. new entity set.
Each set must have a unique name.
{(𝑒1 , … , 𝑒𝑛 ) ∣ 𝑒1 ∈ 𝐸1 , … , 𝑒𝑛 ∈ 𝐸𝑛 } DESIGN CHOICES
Where (𝑒1 , … , 𝑒𝑛 ) is a relationship
𝐸𝑖 is an entity set and 𝑒𝑖 is an entity Avoid nulls and redundancies.
The role an entity has in the relationship can be indicated by
The degree of a relationship set is the number of entity sets labelling the lines/arrows that connect diamonds and
that participate in a relationship set. Relationship sets that rectangles. This is optional and explains how
involve two entity sets are binary. the entities interact via the relationship set.
Relationships can also have attributes. These are then A relationship attribute can also be added as an entity set
connected to the relationship by a dashed line. attribute if the relationship is many-to-one and
the entity set in question has total participation.
KEYS Attribute (A) as entity set attribute or as entity
set: It can be added as entity set attribute if the
Super key A subset of 1+ attributes whose values relationship would otherwise be one-to-one
uniquely identify each entity and the other entity set has total participation.
Candidate key A minimal super key.
No subset of this key is a super key. Use of a relationship set or entity set: If a relationship
There can be multiple candidate keys between two entities can occur more than once, a new
Primary key The chosen identifier for an entity set. entity set should be added in between. Otherwise a
These are underlined in E-R diagrams. relationship set suffices.
The combination of primary keys of the participating entity
sets forms the super key of the relationship set.
Voordelen van het kopen van samenvattingen bij Stuvia op een rij:
Verzekerd van kwaliteit door reviews
Stuvia-klanten hebben meer dan 700.000 samenvattingen beoordeeld. Zo weet je zeker dat je de beste documenten koopt!
Snel en makkelijk kopen
Je betaalt supersnel en eenmalig met iDeal, creditcard of Stuvia-tegoed voor de samenvatting. Zonder lidmaatschap.
Focus op de essentie
Samenvattingen worden geschreven voor en door anderen. Daarom zijn de samenvattingen altijd betrouwbaar en actueel. Zo kom je snel tot de kern!
Veelgestelde vragen
Wat krijg ik als ik dit document koop?
Je krijgt een PDF, die direct beschikbaar is na je aankoop. Het gekochte document is altijd, overal en oneindig toegankelijk via je profiel.
Tevredenheidsgarantie: hoe werkt dat?
Onze tevredenheidsgarantie zorgt ervoor dat je altijd een studiedocument vindt dat goed bij je past. Je vult een formulier in en onze klantenservice regelt de rest.
Van wie koop ik deze samenvatting?
Stuvia is een marktplaats, je koop dit document dus niet van ons, maar van verkoper NienkeUr. Stuvia faciliteert de betaling aan de verkoper.
Zit ik meteen vast aan een abonnement?
Nee, je koopt alleen deze samenvatting voor €5,49. Je zit daarna nergens aan vast.