Summary of the subject Machine Learning which was given in year 3 of the education Applied Computer Science. This document contains following chapters: Designing BI solutions with Microsoft SQL Server, Analysis Services Tabular Modeling, Multidimensional Modeling, Data Mining, Integration Services ...
Chapter 2: Designing BI solutions with
Microsoft SQL Server
Components of a BI infrastructure
Data sources
OLTP Databases, Legacy system, back office applications, ERP, CRM, accounting apps, flat files or any
kind of database that users adopt in order to manage business.
ETL
• Extract from sources
• Transform schema & content
• Load into destination
The ETL process is a strictly technical domain. This is where tables are deleted, added and modified.
Temporary tables are stored in a staging area, these tables are useless at the end of the process and
cannot be seen by users. Example of an ETL tool: SSIS.
Data cleansing
• Data value validation
• Duplicate record matching
Master data management
• Business entity integrity
JDK 2021 1
,Data Warehouse
Is the database that will contain all the tables,
views, procedures and code that end-users will
use for their daily reporting, dashboarding and
analytical activities.
Querying is more important than inserts/
updates/deletes.
According to Kimball, DWH is the union of all the
data marts.
According to Inmon, DWH is a relational model
in 3rd normal form of the corporate data model.
Data marts source their info from the EDW
(Enterprise data warehouse)
There cannot be one clear definition of a data
warehouse, the content of the data mart highly
depends upon the complexities of the specific BI.
ODS = operational data store
SODA = staging/ODS/Archive
A data mart contains a subset of organization-wide data. This subset of data is valuable to a specific
community of knowledge workers.
For example, the marketing data mart may contain data related to products, customers and sales and will
be used by the marketing analysts.
The best way to model a data mart is a star schema or snowflake schema with a fact table surrounded by
dimension tables.
Kimball vs Inmon
- There is no right or wrong.
- There is no clear separation. The shared idea is dimensional modeling.
- Different data warehousing philosophies.
- DWH in most enterprises are closer to Ralph Kimball's idea. This is because most started out as a
departmental effort as a data mart
- Inmon’s solution takes time and money
JDK 2021 2
, Data models
Benefits of data models:
▪ Abstract data warehouse tables
▪ Simplify analysis for users
▪ Add business logic
▪ Pre-aggregate measures
▪ Provide a standard interface
2 types of models: Multidimensional & Tabular
Data Models are built with SQL Server Analysis Services. SSAS is an extra layer of metadata, or a
semantic model that sits on top of a data warehouse in a relational database.
The layer includes models containing the business logic of your data
A data model contains information about:
- how fact tables and dimension tables should be joined
- how measures should be aggregated
- how users should be able to explore the data through hierarchies
- the definitions of common calculations
End user applications query these models rather than the underlying database.
An LOB (Line of Business) application is one of the sets of critical computer applications that are vital to
running an enterprise. LOB applications are usually large programs that contain a number of integrated
capabilities and tie into databases and database management systems. For example: ERP – CRM
SSAS
• Tabular & multidimensional
• The concepts involved in designing the two types of model are very different, and you cannot
convert a Tabular database into a Multidimensional or vice versa, without rebuilding everything from
scratch
JDK 2021 3
Voordelen van het kopen van samenvattingen bij Stuvia op een rij:
√ Verzekerd van kwaliteit door reviews
Stuvia-klanten hebben meer dan 700.000 samenvattingen beoordeeld. Zo weet je zeker dat je de beste documenten koopt!
Snel en makkelijk kopen
Je betaalt supersnel en eenmalig met iDeal, Bancontact of creditcard voor de samenvatting. Zonder lidmaatschap.
Focus op de essentie
Samenvattingen worden geschreven voor en door anderen. Daarom zijn de samenvattingen altijd betrouwbaar en actueel. Zo kom je snel tot de kern!
Veelgestelde vragen
Wat krijg ik als ik dit document koop?
Je krijgt een PDF, die direct beschikbaar is na je aankoop. Het gekochte document is altijd, overal en oneindig toegankelijk via je profiel.
Tevredenheidsgarantie: hoe werkt dat?
Onze tevredenheidsgarantie zorgt ervoor dat je altijd een studiedocument vindt dat goed bij je past. Je vult een formulier in en onze klantenservice regelt de rest.
Van wie koop ik deze samenvatting?
Stuvia is een marktplaats, je koop dit document dus niet van ons, maar van verkoper GraduateITF. Stuvia faciliteert de betaling aan de verkoper.
Zit ik meteen vast aan een abonnement?
Nee, je koopt alleen deze samenvatting voor €7,99. Je zit daarna nergens aan vast.