Summary: Business Intelligence [HMH25g] – Axel Temmerman
Business Intelligence
Summary (2024 - 2025)
Part 1 – SQL............................................................................................................................2
Basics of SQL.....................................................................................................................2
Joins, Group By & Having.................................................................................................. 8
SQL: Additional Topics..................................................................................................... 15
1. Union Queries........................................................................................................ 15
2. ‘Top[number]’.......................................................................................................... 16
3. Subqueries............................................................................................................. 17
4. Rollup and Cube.....................................................................................................19
Part 2 – Data Analytics........................................................................................................ 21
Introduction to Data Analytics...........................................................................................21
Supervised learning: Classification.................................................................................. 34
Classification vs. Regression..................................................................................... 34
Classification Algorithms............................................................................................ 34
Decision Trees............................................................................................................34
The Baseline: Zero Attributes..................................................................................... 41
The problem of Overfitting.......................................................................................... 42
k-Nearest Neighbors (kNN)........................................................................................ 49
Naïve Bayes............................................................................................................... 54
Neural Networks (Multilayer Perceptron)................................................................... 58
Feature Selection............................................................................................................. 67
Feature selection – a definition.................................................................................. 67
Reasons for feature selection.....................................................................................67
Feature selection methods......................................................................................... 67
Unsupervised Learning.................................................................................................... 73
Association Rules....................................................................................................... 73
Clustering................................................................................................................... 80
1
, Summary: Business Intelligence [HMH25g] – Axel Temmerman
Part 1 – SQL
Basics of SQL
In the Architectural Perspective of an
Information System, SQL is used to create
queries and reports to the data warehouse,
OLAP and enables data mining.
⇒ SQL is focussed on retrieving data out
of the data-warehouse
SQL (Structured Query Language): Objective
SQL is a database-software-independent language that allows the user/designer to
perform the following three operations:
● Create a relational database (both operational databases and data
warehouses)
● Load a relational database
● Question (query) a relational database
SQL consists of two parts:
● The instructions for creating the database structure (logical and physical
model) is called a Data Definition Language (DDL)
● The instructions to enter, retrieve and update data is called a Data
Manipulation Language (DML)
In this chapter, we mainly use SQL as DML
SQL is a non-procedural language → you specify what you want instead of how you
want to obtain it. The database management system (DBMS) will itself interpret the
SQL instruction and show the results. SQL is suitable for any DBMS!
“Data-Warehouse” we will use during this
course: AdventureDW_Micro1000Facts.odb
2
, Summary: Business Intelligence [HMH25g] – Axel Temmerman
General SQL Select Instruction (Statement)
A select query/instruction:
- Used to retrieve data from an existing DB or DWH (data-warehouse)
- Consists of several clauses
- 2 clauses are required: Select and From, rest is optional
- Order of clauses is always the same (ex: ORDER BY is always the last one,
HAVING is always after GROUP BY and cannot be used if GROUP BY isn’t
there)
The Basic SELECT instruction: simple example
We have a dimension table with customer data (DimCustomer)
→ We want a list with the name of the customers with more than 3 kids
Selection Conditions
To select certain records from a table, we use the “WHERE” clause in SQL
statements.
The “Where” clause always comes after the “From” clause
→ e.g. Display family name and email address of all customers who are married
The where condition can be True or False
SELECT Dimcustomer.Lastname, Dimcustomer.Emailaddress
FROM Dimcustomer
WHERE Dimcustomer.Maritalstatus = ‘M’;
If we have only 1 table, we can omit the table name before the field names:
SELECT Lastname, Emailaddress
FROM Dimcustomer
Where Maritalstatus = ‘M’
3
, Summary: Business Intelligence [HMH25g] – Axel Temmerman
Several comparison operators can be used:
Example: Show all products with a catalog price (list price) between $30 and $100
SELECT *
FROM Dimproduct
WHERE Listprice BETWEEN 30 AND 100
* = All fields from a table
Result: A list of articles between $30 and $100, including 30 and 100!
Examples
Show all details of customers who were born on January 15, 1950 or on 15/01/1970
In MS Access:
SELECT *
FROM Dimcustomer
WHERE BirthDate In (#1/15/1950#,#1/15/1970#)
In LibreOffice:
SELECT *
FROM Dimcustomer
WHERE Birthdate IN (‘1950-01-15’,’1970-01-15’)
Show all employees whose phone number is unknown
SELECT *
FROM Dimemployee
WHERE phone IS NOT NULL
4
Les avantages d'acheter des résumés chez Stuvia:
Qualité garantie par les avis des clients
Les clients de Stuvia ont évalués plus de 700 000 résumés. C'est comme ça que vous savez que vous achetez les meilleurs documents.
L’achat facile et rapide
Vous pouvez payer rapidement avec iDeal, carte de crédit ou Stuvia-crédit pour les résumés. Il n'y a pas d'adhésion nécessaire.
Focus sur l’essentiel
Vos camarades écrivent eux-mêmes les notes d’étude, c’est pourquoi les documents sont toujours fiables et à jour. Cela garantit que vous arrivez rapidement au coeur du matériel.
Foire aux questions
Qu'est-ce que j'obtiens en achetant ce document ?
Vous obtenez un PDF, disponible immédiatement après votre achat. Le document acheté est accessible à tout moment, n'importe où et indéfiniment via votre profil.
Garantie de remboursement : comment ça marche ?
Notre garantie de satisfaction garantit que vous trouverez toujours un document d'étude qui vous convient. Vous remplissez un formulaire et notre équipe du service client s'occupe du reste.
Auprès de qui est-ce que j'achète ce résumé ?
Stuvia est une place de marché. Alors, vous n'achetez donc pas ce document chez nous, mais auprès du vendeur axel_temmerman. Stuvia facilite les paiements au vendeur.
Est-ce que j'aurai un abonnement?
Non, vous n'achetez ce résumé que pour €15,49. Vous n'êtes lié à rien après votre achat.