Summary: Business Intelligence [HMH25g] – Axel Temmerman
Business Intelligence
Summary (2024 - 2025)
Part 1 – SQL............................................................................................................................2
Basics of SQL.....................................................................................................................2
Joins, Group By & Having.................................................................................................. 8
SQL: Additional Topics..................................................................................................... 15
1. Union Queries........................................................................................................ 15
2. ‘Top[number]’.......................................................................................................... 16
3. Subqueries............................................................................................................. 17
4. Rollup and Cube.....................................................................................................19
Part 2 – Data Analytics........................................................................................................ 21
Introduction to Data Analytics...........................................................................................21
Supervised learning: Classification.................................................................................. 34
Classification vs. Regression..................................................................................... 34
Classification Algorithms............................................................................................ 34
Decision Trees............................................................................................................34
The Baseline: Zero Attributes..................................................................................... 41
The problem of Overfitting.......................................................................................... 42
k-Nearest Neighbors (kNN)........................................................................................ 49
Naïve Bayes............................................................................................................... 54
Neural Networks (Multilayer Perceptron)................................................................... 58
Feature Selection............................................................................................................. 67
Feature selection – a definition.................................................................................. 67
Reasons for feature selection.....................................................................................67
Feature selection methods......................................................................................... 67
Unsupervised Learning.................................................................................................... 73
Association Rules....................................................................................................... 73
Clustering................................................................................................................... 80
1
, Summary: Business Intelligence [HMH25g] – Axel Temmerman
Part 1 – SQL
Basics of SQL
In the Architectural Perspective of an
Information System, SQL is used to create
queries and reports to the data warehouse,
OLAP and enables data mining.
⇒ SQL is focussed on retrieving data out
of the data-warehouse
SQL (Structured Query Language): Objective
SQL is a database-software-independent language that allows the user/designer to
perform the following three operations:
● Create a relational database (both operational databases and data
warehouses)
● Load a relational database
● Question (query) a relational database
SQL consists of two parts:
● The instructions for creating the database structure (logical and physical
model) is called a Data Definition Language (DDL)
● The instructions to enter, retrieve and update data is called a Data
Manipulation Language (DML)
In this chapter, we mainly use SQL as DML
SQL is a non-procedural language → you specify what you want instead of how you
want to obtain it. The database management system (DBMS) will itself interpret the
SQL instruction and show the results. SQL is suitable for any DBMS!
“Data-Warehouse” we will use during this
course: AdventureDW_Micro1000Facts.odb
2
, Summary: Business Intelligence [HMH25g] – Axel Temmerman
General SQL Select Instruction (Statement)
A select query/instruction:
- Used to retrieve data from an existing DB or DWH (data-warehouse)
- Consists of several clauses
- 2 clauses are required: Select and From, rest is optional
- Order of clauses is always the same (ex: ORDER BY is always the last one,
HAVING is always after GROUP BY and cannot be used if GROUP BY isn’t
there)
The Basic SELECT instruction: simple example
We have a dimension table with customer data (DimCustomer)
→ We want a list with the name of the customers with more than 3 kids
Selection Conditions
To select certain records from a table, we use the “WHERE” clause in SQL
statements.
The “Where” clause always comes after the “From” clause
→ e.g. Display family name and email address of all customers who are married
The where condition can be True or False
SELECT Dimcustomer.Lastname, Dimcustomer.Emailaddress
FROM Dimcustomer
WHERE Dimcustomer.Maritalstatus = ‘M’;
If we have only 1 table, we can omit the table name before the field names:
SELECT Lastname, Emailaddress
FROM Dimcustomer
Where Maritalstatus = ‘M’
3
, Summary: Business Intelligence [HMH25g] – Axel Temmerman
Several comparison operators can be used:
Example: Show all products with a catalog price (list price) between $30 and $100
SELECT *
FROM Dimproduct
WHERE Listprice BETWEEN 30 AND 100
* = All fields from a table
Result: A list of articles between $30 and $100, including 30 and 100!
Examples
Show all details of customers who were born on January 15, 1950 or on 15/01/1970
In MS Access:
SELECT *
FROM Dimcustomer
WHERE BirthDate In (#1/15/1950#,#1/15/1970#)
In LibreOffice:
SELECT *
FROM Dimcustomer
WHERE Birthdate IN (‘1950-01-15’,’1970-01-15’)
Show all employees whose phone number is unknown
SELECT *
FROM Dimemployee
WHERE phone IS NOT NULL
4
Voordelen van het kopen van samenvattingen bij Stuvia op een rij:
√ Verzekerd van kwaliteit door reviews
Stuvia-klanten hebben meer dan 700.000 samenvattingen beoordeeld. Zo weet je zeker dat je de beste documenten koopt!
Snel en makkelijk kopen
Je betaalt supersnel en eenmalig met iDeal, Bancontact of creditcard voor de samenvatting. Zonder lidmaatschap.
Focus op de essentie
Samenvattingen worden geschreven voor en door anderen. Daarom zijn de samenvattingen altijd betrouwbaar en actueel. Zo kom je snel tot de kern!
Veelgestelde vragen
Wat krijg ik als ik dit document koop?
Je krijgt een PDF, die direct beschikbaar is na je aankoop. Het gekochte document is altijd, overal en oneindig toegankelijk via je profiel.
Tevredenheidsgarantie: hoe werkt dat?
Onze tevredenheidsgarantie zorgt ervoor dat je altijd een studiedocument vindt dat goed bij je past. Je vult een formulier in en onze klantenservice regelt de rest.
Van wie koop ik deze samenvatting?
Stuvia is een marktplaats, je koop dit document dus niet van ons, maar van verkoper axel_temmerman. Stuvia faciliteert de betaling aan de verkoper.
Zit ik meteen vast aan een abonnement?
Nee, je koopt alleen deze samenvatting voor €15,49. Je zit daarna nergens aan vast.