Samenvatting IDT
Week 1.
We are currently see increasing growth of the data. This is big data.
What are the reasons for DBMS (data-base management-systems)? In the early days: database
applications were built on top of file systems. So, you had a computer, you included all info in files, and
then you had to somehow retrieve information from these files. There were plethora(overload) of
drawbacks:
1. You had data redundancy and inconsistency. For example data from university: you have one file
of the student, and another file of the registration for courses of the student, so if one student
would change information and you only modified one of the files, the other file would be in an
inconsistent state → different information (inconsistency). Redundancy = the same information
in different files, because you need to do that.
2. Difficulty in accessing data: need to write a new program to carry out each new task.
3. Data isolation: one file could follow one format, and another file could follow another format
(for example, how you write dd/yy/mm.
4. Integrity problems: for example in a bank, the balance of an account might be less than zero
(integrity constraint). However, it is hard to impose this constraint through actual file. You can
write whatever you want. It is hard to add new constraints, verify constraints or change existing
ones.
5. Atomicity of updates: example: transfer of funds from one account to another. This should either
complete, or not at all → you can’t stop in the middle. Which means, when you change the data
in these files, when something happens, you have to roll back and unchanged all the data. so if
you had a failure, data could be left in an un-consisting state → only partial updates were carried
out.
6. Concurrent access by multiple users. Both people were reading the balance of a banking system,
and withdrawing at the same time, you should be careful. First one read 100 and wanted to
withdraw 50, and the other wanted to withdraw 70. If one of them wouldn’t lock the file, the
one of them would overwrite the info of the other. This can lead to inconsistencies.
7. Security problems; we want to keep different levels of access to different users. Which is hard
when using different files. Either access is given or not.
DBMS offers solutions to all!
DBMS architecture:
First of all: we have the database. This database follows the same structure for all data. The data that is
in the DB, is there for a purpose → the purpose is what the company is doing. So it is expected from this
data that it has correlations and relations. Also this data is shared by several users. DBMS: it is a
collection of programs over DB so this DBMS offers functionality/methods. These method allows you to
specify the data type, the structure, the constraints in the data. they allow you to build/manipulate data.
and you can also administrate the data → for example manage access rights. After DBMS you have the
applications: currently we have different devices, a laptop, phone, etc. to access the same information
,you have different layouts. The information is the same. So you can have different interfaces. Basically,
you can see the DBMS as a black box interacting between users/applications and the actual database.
Different DBMS examples:
- Airlines: reservation and schedule information. Among the first to use databases in a
geographically distributed manner.
- Banking: customer information, accounts, loans, and transactions.
- University: student information, course registration, grades. Application program examples:
• Add new students, lectures, courses.
• Register students, lectures, courses.
• Assign grades, generate transcripts.
The ultimate goal of DBMS is to separate the data from the application. DBMS provide an interface that
the application programmer must follow. They also allow system administrator to make modifications
without having an impact on the user. The actual end user doesn’t need to see the information in the
inside. Also, the users can change their view of the data without having to worry about how the data is
stored. In order for this to happen, the DBMS is composed by 3 layers.
1. External layer: communication of users
• Analysis of user requests (queries).
• Access control (not all users can access all).
• Answer presentation.
2. Logical layer:
• Optimization of queries.
• Resolving conflicting access, i.e. multiple users.
• Guarantees constant availability even in case of failures.
3. Internal layer:
• Storing the data.
• Software for structuring the data.
• Efficient access methods → keys, indices, etc.
Development process (lifecycle) → you can have different ones, but we are only showing one.
1. Planning: develop a preliminary understanding of the business situation and how information
systems might help solve the problem. This step includes:
• Analyzing current data processing.
, • Analyzing general business functions and needs.
2. Analysis: analyze the business situation. Thoroughly to determine requirements and to structure
those requirements. The output of this analysis is what we call the conceptual schema. This
corresponds to a detailed, technology independent specification of the overall organizational
data structure → it is a description of what the company would like to develop, and it is not
bounded to a particular technology → to generate a conceptual schema we use an entity
relation model:
3. Design: there are two different designs, logical and physical:
• Logical design: representation of the database. You
transform the conceptual schema (output of previous step)
in something that the DBMS can use.
• Physical: the set of specifications that describe how data are
stored in a computer’s secondary memory by a specific
database management system. Keep in mind that you have
different systems. How the data is stored, doesn’t have to
be the same.
4. Implementation: once you have built the database, you include the
data (populate it) then you install the applications and do some
testing. you do some complete documentation and generate
training materials.
5. Maintenance:
• Monitor the operation and usefulness of the system.
• Repair by fixing errors in database and applications.
• Enhance by analyzing the database and applications to
ensure that evolving information requirements are met.
You can have different types of DBMS:
- Traditional → use text and numerical data.
- Multimedia → multimedia data (movies, music, etc.).
- Spatial → geographic and geometric data.
, - Data warehouses.
Relational DBMS are based on the relational data model. A relational model is an approach to managing
data by representing it grouped into relations. RDBMS: a database management system that manages
data as a collection of tables in which all relationships are represented by common values in related
tables. There is a standard language: SQL. This is a language for creating and querying relational
databases. It is simple, expressive, with efficient implementations and used by man commercial systems:
SQlite we will use for practice assignments.
Standard for RDBMS:
- Reduced training costs.
- Cross system communication.
- Etc.
SQL has 2 environments. Catalog: information for included databases. And schema: structure of one
database: tables, views. In addition there are 3 different languages you can use:
1. Data definition language (DDL): commands that define a database, including creating, altering,
and dropping tables and establish constraints.
2. Data manipulation language (DML): commands that maintain and query a database.
3. Data control language (DCL): commands that control a database, including administration
privileges and committing data.
SQL environment simplified illustration:
How languages are mapped: