This is a summary for the course Business Intelligence and Data Management of the Master Information Management. It covers all the course material as discussed in the lectures by professor Caron & Smits. It contains all relevant exam material including some examples of SQL. Also the answers of the ...
Business Intelligence and Data Management - Summary Book
Written for
Tilburg University (UVT)
Information Management
Business Intelligence & Data management (320092)
Summary – Business Intelligence and Data Management 2019 – by Darya Krapyva
Content
Lecture 1 – Intro to BI + Data management
Lecture 2 – Data Warehousing
Lecture 3 – OLAP business databases & reporting
Lecture 4 – Data mining introduction (CH 1, 2, 3)
Lecture 5 – Regression Analysis (CH 2, 6)
Lecture 6 – Classification with k nearest neighbors (CH 7)
Lecture 7 – Classification with Naive Bayes (CH 8)
Lecture 8 – Performance measures (CH 5)
Lecture 9 – Decision trees (CH 9)
Lecture 10 – Association rules (CH 14)
Lecture 11 – Clustering (CH 15)
SQL LAB SESSIONS PART A, B, C, D – ANSWERS
Shmueli, Galit, Patel, Nitin R, and Peter C. Bruce, Data-Mining for Business Analytics, Wiley, 2016, ISBN 9781118729274.
Created by: Darya Krapyva - 2019
Lecture 1 – Intro to BI + Data management
Part 1: Introduction
Data Management – Managing data as a valuable resource.
Business Intelligence – Data driven decision making. Transforming data into meaningful information/knowledge to support
business decision-making.
Data, information and knowledge
Data – items that are the most elementary descriptions of things, events, activities and transactions (Raw symbols).
They can come in both structured and unstructured form, and from internal or external sources.
Information – organized data that has meaning and value (Formatted data). It is the result of processing raw data to reveal its
meaning.
Knowledge – processed data or information that is applicable to a business decision problem (Data relationships).
Taxonomy of Business Intelligence: Methods
1. Descriptive analytics – Use data to understand past & present (OLAP, DBM and Data warehousing framework).
KPIs are often put into a dashboard view to provide (real-time) insights.
2. Predictive analytics – Predict future behaviour based on past performance (regression and clustering).
3. Prescriptive analytics – Make decisions or recommendations to achieve the best performance.
Taxonomy of Business Intelligence: Function - Marketing analytics, Sales analytics, HR analytics, Financial analytics etc.
Part 2: Introduction to Business Intelligence
From DSS to Intelligence or Analytics. There are two views on this subject:
1. Business Intelligence: data warehousing and descriptive analytics.
Business Analytics: predictive and prescriptive analytics
2. “Within this course, Business Intelligence = Business Analytics.”
BI is an umbrella term that combines the processes, technologies, and tools needed to transform data into information,
information into knowledge, and knowledge into plans that drive profitable business action (Sharda 2014). → process definition.
Another definition is that BI is the information and knowledge that enables business decision-making (Sabherwal, 2011). → product
definition.
The objective of the BI product is to provide historical, current and predictive
views of business operations. Information/knowledge that could relate to:
- Understanding customer preferences
- Coping with competition
- Identifying opportunities
- Enhancing internal efficiency
BI Solution – Support the BI process by utilizing BI tools.
BI product – Information and knowledge that supports decision making.
BI tools – Data warehousing, knowledge management and statistics.
A performance dashboard is a combination of techniques.
Part 3: Introduction to Databases
Database – A collection of related tables, designed, maintained and utilized by multiple users, with software to update & query
the data. Data can be manipulated using a query language. Because the data is divided into smaller portions (tables), it is
important to be able to connect these portions again (joining tables).
A database consists of the following Database Elements: Data (the database), software, hardware and users.
The database management system (DBMS) is the software that controls the data (Oracle, DB2, MySQL). It manages the data
within the database and offers ways to manipulate the data using a query language.
The DBMS contains a data dictionary in which the required data component structures can be looked up and, if needed,
changed. It also creates and manages the complex structures required for data storage and helps with performance tuning
(increasing efficiency) across the multiple physical data files present.
“The DBMS receives Structured Query Language (SQL) queries from the client and accesses the database files on the client's
behalf. As a result, data is transported from the database to the requesting client.”
Database systems allow users to:
1. Organise (CREATE)
2. Store (INSERT)
3. Update (UPDATE)
4. Delete (DELETE)
5. Retrieve (SELECT)
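The five operations above can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the customers table, its columns (cno, name, city) and the sample rows are assumptions made up for illustration and do not come from the lectures.

```python
import sqlite3

# In-memory SQLite database; table, columns and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 1. Organise (CREATE): define the table structure.
cur.execute("CREATE TABLE customers (cno INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# 2. Store (INSERT): add records.
cur.executemany(
    "INSERT INTO customers (cno, name, city) VALUES (?, ?, ?)",
    [(1, "Alice", "Tilburg"), (2, "Bob", "Breda"), (3, "Carol", "Tilburg")],
)

# 3. Update (UPDATE): change a field of an existing record.
cur.execute("UPDATE customers SET city = ? WHERE cno = ?", ("Eindhoven", 2))

# 4. Delete (DELETE): remove a record.
cur.execute("DELETE FROM customers WHERE cno = ?", (3,))

# 5. Retrieve (SELECT): read the remaining records.
rows = cur.execute("SELECT cno, name, city FROM customers ORDER BY cno").fetchall()
print(rows)

conn.close()
```

Other DBMSs such as Oracle, DB2 or MySQL accept the same kind of statements, although data types and details of the syntax can differ.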
Database Terminology – A database consists of separate tables with uniquely defined names (employees, customers
and orders). A table is a named, structured list of data of a specific type. Every table is divided into Fields (columns) and
Records (rows).
Part 4: Relational Databases
Relational Databases allow data to be grouped into tables + to subsequently set relationships between these tables. You can
use a Join line to link different tables using a common field. Such a Join line indicates the relationship between two tables
(customers and orders).
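As a minimal sketch of such a join, the example below links customers and orders through a common field. The field name cno, the other columns and the sample rows are assumptions for illustration; SQLite (via Python's sqlite3 module) is used only because it runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two tables that share the common field cno (an assumed name).
conn.executescript("""
CREATE TABLE customers (cno INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (ono INTEGER PRIMARY KEY, cno INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
""")

# The join condition plays the role of the join line between the two tables.
joined = conn.execute("""
SELECT c.name, o.ono, o.amount
FROM customers AS c
JOIN orders AS o ON o.cno = c.cno
ORDER BY o.ono
""").fetchall()

for name, ono, amount in joined:
    print(name, ono, amount)

conn.close()
```

Each order row is matched with the customer row that has the same cno value, so a customer with two orders appears twice in the result.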
Keys are important as they establish relationships among tables and ensure the integrity of the data. Different kinds
of keys can be found in tables:
• Primary key (PK) – Field(s) that uniquely identify each record in a table (can never be null). In a relational table draft,
primary keys are underlined. Examples: Bno (Book), Rno (Reader), Bno + Rno + Loan date (Loan).
• Keys – consist of 1 or more attributes that determine other attributes. A key's role is based on determination: A → B, C,
D. If you know A, you can look up B, C and D (so these are functionally dependent on A).
• Composite key – a key that is composed of more than one key attribute.
• Composite Primary key – a primary key composed of more than one attribute, needed when a new table combines
other tables (the Loan table combines Book and Reader, so Bno + Rno + Loan date identify a loan).
• Super key – Any key that uniquely identifies each row (Author, Title, Bno)
• Candidate key – A minimal super key, i.e. a super key without unnecessary attributes (Bno)
• Relational scheme – Textual representation of the database tables. Primary key attributes are underlined.
Book (Bno, Author, Title); Loan (Bno, Rno, Loan date, Return date)
• Foreign key – Attribute whose values match the primary key values in the related table.
• Secondary key – Key strictly used for data retrieval (does not need to yield a unique outcome).
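The difference between a super key and a candidate key can be illustrated with a small check on sample data. The Book rows below are made up (two copies of the same title), and the helper names is_super_key and candidate_keys are hypothetical. Note that a key is a design decision about all possible rows: uniqueness in a sample can only show that a set of attributes is not a key, it cannot prove that it is one.

```python
from itertools import combinations

# Hypothetical Book rows: books 2 and 3 are two copies of the same title.
books = [
    {"Bno": 1, "Author": "Author A", "Title": "Title A"},
    {"Bno": 2, "Author": "Author B", "Title": "Title B"},
    {"Bno": 3, "Author": "Author B", "Title": "Title B"},
]


def is_super_key(rows, attrs):
    """True if no two rows share the same combination of values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Super keys that contain no smaller super key (minimal super keys)."""
    attrs = sorted(rows[0])
    found = []
    # Try small attribute sets first, so minimal keys are found before their supersets.
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in found)
            if is_super_key(rows, combo) and not contains_smaller_key:
                found.append(combo)
    return found


print(is_super_key(books, ("Author", "Title", "Bno")))  # super key with extra attributes
print(is_super_key(books, ("Author", "Title")))         # not unique: two copies exist
print(candidate_keys(books))
```

In this sample (Author, Title, Bno) identifies every row but carries unnecessary attributes, while Bno alone is enough, which matches the super key and candidate key examples in the list.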
Class exercise – identify the Primary Key and Foreign Key
“In short, the Primary Key is the candidate key chosen to be the unique row identifier.
The choice of a PK is based on the designer or end-user requirements. Each primary key value must be unique to ensure the
entity integrity (null not permitted in PK).”
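How a DBMS guards these rules can be sketched with the Book, Reader and Loan tables. The column types, the Name column of Reader, the sample rows and the helper try_insert are assumptions for illustration. Two SQLite-specific details are also worth knowing: foreign keys are only enforced after PRAGMA foreign_keys = ON, and the NOT NULL constraints are written out explicitly rather than relying on the primary key declaration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.executescript("""
CREATE TABLE Book (Bno INTEGER NOT NULL PRIMARY KEY, Author TEXT, Title TEXT);
CREATE TABLE Reader (Rno INTEGER NOT NULL PRIMARY KEY, Name TEXT);
CREATE TABLE Loan (
    Bno INTEGER NOT NULL REFERENCES Book (Bno),    -- foreign key to Book
    Rno INTEGER NOT NULL REFERENCES Reader (Rno),  -- foreign key to Reader
    LoanDate TEXT NOT NULL,
    ReturnDate TEXT,
    PRIMARY KEY (Bno, Rno, LoanDate)               -- composite primary key
);
INSERT INTO Book VALUES (1, 'Author A', 'Title A');
INSERT INTO Reader VALUES (7, 'Reader X');
INSERT INTO Loan VALUES (1, 7, '2019-03-01', NULL);
""")


def try_insert(sql, params):
    """Run an INSERT and report whether the DBMS accepted or rejected it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = {
    # Same Bno as an existing book: violates primary key uniqueness.
    "duplicate_pk": try_insert("INSERT INTO Book VALUES (?, ?, ?)", (1, "Author B", "Title B")),
    # Null in part of the primary key: violates entity integrity.
    "null_in_pk": try_insert("INSERT INTO Loan VALUES (?, ?, ?, ?)", (None, 7, "2019-03-02", None)),
    # Bno 99 does not exist in Book: violates the foreign key.
    "unknown_book": try_insert("INSERT INTO Loan VALUES (?, ?, ?, ?)", (99, 7, "2019-03-02", None)),
    # Same book and reader on a different date: a new, valid composite key value.
    "second_loan": try_insert("INSERT INTO Loan VALUES (?, ?, ?, ?)", (1, 7, "2019-04-01", None)),
}

for case, outcome in results.items():
    print(case, "->", outcome)

conn.close()
```

The first three inserts are refused by the DBMS itself, so the integrity rules hold regardless of which client sends the query.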