DATA MANAGEMENT FOR DATA ANALYTICS
Book Database system concepts, 6th edition – INTRODUCTION
Silberschatz, Korth and Sudarshan 2011
Lectures Pre-recorded Database systems are used to replace database applications
Instructions AUD6 during the lecture timeslots that were built directly on top of file systems, as this had a
The tutorial focusses on the previous subject number of issues.
Data redundancy and inconsistency
40% Homework assignments weeks 1-6 Usually you would work with multiple sources and file
formats, and there can be duplication of information
60% On campus exam
across files
Difficulty in accessing data
We are allowed a 2-sided A4 cheat sheet during the exam. You need a new program to carry out each task
Lack of data isolation
e.g. if you have an issue with your parsers, you can no
SUBJECTS + READINGS + ASSIGNMENTS longer access your data
Integrity problems
Integrity constraints become buried in code as
opposed to being stated and managed explicitly. It is
Basic E-R model CH1 + CH7 difficult to add new constraints or change existing
Basic RA CH2 + CH6.1 ones
Week 1 Atomicity of updates
Failures may leave databases partially updated
HW1 Sunday night
Concurrent access by multiple users
Uncontrolled concurrency can lead to inconsistencies
Advanced E-R model CH7 Security problems
Advanced RA CH6.1 It’s difficult to provide access to just a part of the data
Week 2 Database managements provides a solution for these issues
HW2 Sunday night
Two classes of languages:
Advanced E-R model CH7 Procedural User specifies what data is required and
how to get it. This is algebraic
Basic SQL CH3
Week 3 Declarative User specifies what data is required, not
how to get it. Based on calculus and logic
HW3 Sunday night
Data manipulation Entering, updating, deleting and
Constraints: FDs CH8 + additional material retrieving information from a database.
Advanced SQL CH3 + parts of CH4 & 5 Querying logical schemas
Week 4 Database design Given requirements for an information
HW4 Sunday night system, how to design the database?
How do we transform the design into an
actual implementation?
Constraints: FDs CH8
Advanced SQL CH3 + parts of CH4 & 5
Object system The “real world” that you try to model
Week 5
Information system The representation of the real world.
HW5 Sunday night This is an approximation.
Decomposition CH8 + appendix B Data modelling Which information? The structure,
based on FDs relationships, constraints, semantics
Week 6 Datalog Parts of appendix C Process modelling How is the information used/
manipulated/shared/created?
HW6 Sunday night
Data independence Insulation from changes in the way
the data is structured and stored
The questions in the homework assignments are
representative of possible exam questions. Schema The logical structure of the database
Physical data independence is the ability to
modify the physical schema without changing
the logical schema.
Instance The actual content of the database at a particular
point in time
, ENTITY-RELATIONSHIP MODELS ADVANCED E-R MODELS
BASIC E-R MODELS CARDINALITY CONSTRAINTS
E-R models They model relational databases and represent Cardinality constraints express the number of entities to
entities and their relationships. which another entity can be associated via a relationship set:
This is a conceptual schema. One-to-one: e.g. each person only has one birth
Sets Collections of data that do not have duplicates certificate, and each birth certificate can only be
associated with one person. Each entity is mapped to
E-R models connect entities and relationships with lines. It is at most one entity in the other entity set.
not possible to directly connect two entities with a line, they
can only be connected through relationships. One-to-many: e.g. you can have many passports, but
each passport only relates to one person
ENTITIES Many-to-one: e.g. passports and people switch sides
Many-to-many: e.g. friends to friends relationships
Entity A “thing” or “object” in the enterprise that is
distinguishable from other objects. → signifies “one”, − signifies “many”
Entity set A set of entities of the same type that share the
same properties (e.g. a set of persons)
Entities have attributes: the data that you want to store.
These are descriptive properties that all members of an Each passport is associated with at most one person.
entity set have. Each attribute has a domain, which is the set Each person can be associated with multiple passports.
of permitted values per attribute.
An example of an entity in a E-R diagram, with different
types of attributes: PARTICIPATION CONSTRAINTS
Entity set name Participation constraints provide information about whether
all entities in the set participate in at least one relationship.
Primary key → Primary key
Total participation: Every entity in the set participates
Address → Composite attribute (vs. simple)
in at least one relationship in the relationship set.
Street → Component attribute
This is indicated by a double line (=).
Street name → Component attribute
e.g. each student is related to at least one instructor
Home number → Component attribute through the advisor relationship
City → Component attribute
Partial participation: Some entities may not participate
{ phone_number } → Multivalued attribute (a set of values)
in any relationship in the relationship set (default).
date_of_birth → Simple attribute
e.g. instructors don’t need to advise any students,
Age () → Derived attribute (from date_of_birth) there are instructors that don’t advise any students.
In ternary relationships (between 3 entity sets), at most one
RELATIONSHIPS arrow may be used to indicate a cardinality constraint.
Any non-binary relationship can be represented using
Relationship An association between several entities binary relationships, by replacing the relationship set with a
Relationship set A relation among 𝑛 ≥ 2 entity sets. new entity set.
Each set must have a unique name.
{(𝑒1 , … , 𝑒𝑛 ) ∣ 𝑒1 ∈ 𝐸1 , … , 𝑒𝑛 ∈ 𝐸𝑛 } DESIGN CHOICES
Where (𝑒1 , … , 𝑒𝑛 ) is a relationship
𝐸𝑖 is an entity set and 𝑒𝑖 is an entity Avoid nulls and redundancies.
The role an entity has in the relationship can be indicated by
The degree of a relationship set is the number of entity sets labelling the lines/arrows that connect diamonds and
that participate in a relationship set. Relationship sets that rectangles. This is optional and explains how
involve two entity sets are binary. the entities interact via the relationship set.
Relationships can also have attributes. These are then A relationship attribute can also be added as an entity set
connected to the relationship by a dashed line. attribute if the relationship is many-to-one and
the entity set in question has total participation.
KEYS Attribute (A) as entity set attribute or as entity
set: It can be added as entity set attribute if the
Super key A subset of 1+ attributes whose values relationship would otherwise be one-to-one
uniquely identify each entity and the other entity set has total participation.
Candidate key A minimal super key.
No subset of this key is a super key. Use of a relationship set or entity set: If a relationship
There can be multiple candidate keys between two entities can occur more than once, a new
Primary key The chosen identifier for an entity set. entity set should be added in between. Otherwise a
These are underlined in E-R diagrams. relationship set suffices.
The combination of primary keys of the participating entity
sets forms the super key of the relationship set.
The benefits of buying summaries with Stuvia:
Guaranteed quality through customer reviews
Stuvia customers have reviewed more than 700,000 summaries. This how you know that you are buying the best documents.
Quick and easy check-out
You can quickly pay through credit card or Stuvia-credit for the summaries. There is no membership needed.
Focus on what matters
Your fellow students write the study notes themselves, which is why the documents are always reliable and up-to-date. This ensures you quickly get to the core!
Frequently asked questions
What do I get when I buy this document?
You get a PDF, available immediately after your purchase. The purchased document is accessible anytime, anywhere and indefinitely through your profile.
Satisfaction guarantee: how does it work?
Our satisfaction guarantee ensures that you always find a study document that suits you well. You fill out a form, and our customer service team takes care of the rest.
Who am I buying these notes from?
Stuvia is a marketplace, so you are not buying this document from us, but from seller NienkeUr. Stuvia facilitates payment to the seller.
Will I be stuck with a subscription?
No, you only buy these notes for $5.89. You're not tied to anything after your purchase.