Summary: Business Intelligence [HMH25g] – Axel Temmerman
Business Intelligence
Summary (2024 - 2025)
Part 1 – SQL............................................................................................................................2
Basics of SQL.....................................................................................................................2
Joins, Group By & Having.................................................................................................. 8
SQL: Additional Topics..................................................................................................... 15
1. Union Queries........................................................................................................ 15
2. ‘Top[number]’.......................................................................................................... 16
3. Subqueries............................................................................................................. 17
4. Rollup and Cube.....................................................................................................19
Part 2 – Data Analytics........................................................................................................ 21
Introduction to Data Analytics...........................................................................................21
Supervised learning: Classification.................................................................................. 34
Classification vs. Regression..................................................................................... 34
Classification Algorithms............................................................................................ 34
Decision Trees............................................................................................................34
The Baseline: Zero Attributes..................................................................................... 41
The problem of Overfitting.......................................................................................... 42
k-Nearest Neighbors (kNN)........................................................................................ 49
Naïve Bayes............................................................................................................... 54
Neural Networks (Multilayer Perceptron)................................................................... 58
Feature Selection............................................................................................................. 67
Feature selection – a definition.................................................................................. 67
Reasons for feature selection.....................................................................................67
Feature selection methods......................................................................................... 67
Unsupervised Learning.................................................................................................... 73
Association Rules....................................................................................................... 73
Clustering................................................................................................................... 80
1
, Summary: Business Intelligence [HMH25g] – Axel Temmerman
Part 1 – SQL
Basics of SQL
In the Architectural Perspective of an
Information System, SQL is used to create
queries and reports to the data warehouse,
OLAP and enables data mining.
⇒ SQL is focussed on retrieving data out
of the data-warehouse
SQL (Structured Query Language): Objective
SQL is a database-software-independent language that allows the user/designer to
perform the following three operations:
● Create a relational database (both operational databases and data
warehouses)
● Load a relational database
● Question (query) a relational database
SQL consists of two parts:
● The instructions for creating the database structure (logical and physical
model) is called a Data Definition Language (DDL)
● The instructions to enter, retrieve and update data is called a Data
Manipulation Language (DML)
In this chapter, we mainly use SQL as DML
SQL is a non-procedural language → you specify what you want instead of how you
want to obtain it. The database management system (DBMS) will itself interpret the
SQL instruction and show the results. SQL is suitable for any DBMS!
“Data-Warehouse” we will use during this
course: AdventureDW_Micro1000Facts.odb
2
, Summary: Business Intelligence [HMH25g] – Axel Temmerman
General SQL Select Instruction (Statement)
A select query/instruction:
- Used to retrieve data from an existing DB or DWH (data-warehouse)
- Consists of several clauses
- 2 clauses are required: Select and From, rest is optional
- Order of clauses is always the same (ex: ORDER BY is always the last one,
HAVING is always after GROUP BY and cannot be used if GROUP BY isn’t
there)
The Basic SELECT instruction: simple example
We have a dimension table with customer data (DimCustomer)
→ We want a list with the name of the customers with more than 3 kids
Selection Conditions
To select certain records from a table, we use the “WHERE” clause in SQL
statements.
The “Where” clause always comes after the “From” clause
→ e.g. Display family name and email address of all customers who are married
The where condition can be True or False
SELECT Dimcustomer.Lastname, Dimcustomer.Emailaddress
FROM Dimcustomer
WHERE Dimcustomer.Maritalstatus = ‘M’;
If we have only 1 table, we can omit the table name before the field names:
SELECT Lastname, Emailaddress
FROM Dimcustomer
Where Maritalstatus = ‘M’
3
, Summary: Business Intelligence [HMH25g] – Axel Temmerman
Several comparison operators can be used:
Example: Show all products with a catalog price (list price) between $30 and $100
SELECT *
FROM Dimproduct
WHERE Listprice BETWEEN 30 AND 100
* = All fields from a table
Result: A list of articles between $30 and $100, including 30 and 100!
Examples
Show all details of customers who were born on January 15, 1950 or on 15/01/1970
In MS Access:
SELECT *
FROM Dimcustomer
WHERE BirthDate In (#1/15/1950#,#1/15/1970#)
In LibreOffice:
SELECT *
FROM Dimcustomer
WHERE Birthdate IN (‘1950-01-15’,’1970-01-15’)
Show all employees whose phone number is unknown
SELECT *
FROM Dimemployee
WHERE phone IS NOT NULL
4
The benefits of buying summaries with Stuvia:
Guaranteed quality through customer reviews
Stuvia customers have reviewed more than 700,000 summaries. This how you know that you are buying the best documents.
Quick and easy check-out
You can quickly pay through credit card or Stuvia-credit for the summaries. There is no membership needed.
Focus on what matters
Your fellow students write the study notes themselves, which is why the documents are always reliable and up-to-date. This ensures you quickly get to the core!
Frequently asked questions
What do I get when I buy this document?
You get a PDF, available immediately after your purchase. The purchased document is accessible anytime, anywhere and indefinitely through your profile.
Satisfaction guarantee: how does it work?
Our satisfaction guarantee ensures that you always find a study document that suits you well. You fill out a form, and our customer service team takes care of the rest.
Who am I buying these notes from?
Stuvia is a marketplace, so you are not buying this document from us, but from seller axel_temmerman. Stuvia facilitates payment to the seller.
Will I be stuck with a subscription?
No, you only buy these notes for $16.60. You're not tied to anything after your purchase.