Summary of the subject Machine Learning which was given in year 3 of the education Applied Computer Science. This document contains following chapters: Designing BI solutions with Microsoft SQL Server, Analysis Services Tabular Modeling, Multidimensional Modeling, Data Mining, Integration Services ...
Chapter 2: Designing BI solutions with
Microsoft SQL Server
Components of a BI infrastructure
Data sources
OLTP Databases, Legacy system, back office applications, ERP, CRM, accounting apps, flat files or any
kind of database that users adopt in order to manage business.
ETL
• Extract from sources
• Transform schema & content
• Load into destination
The ETL process is a strictly technical domain. This is where tables are deleted, added and modified.
Temporary tables are stored in a staging area, these tables are useless at the end of the process and
cannot be seen by users. Example of an ETL tool: SSIS.
Data cleansing
• Data value validation
• Duplicate record matching
Master data management
• Business entity integrity
JDK 2021 1
,Data Warehouse
Is the database that will contain all the tables,
views, procedures and code that end-users will
use for their daily reporting, dashboarding and
analytical activities.
Querying is more important than inserts/
updates/deletes.
According to Kimball, DWH is the union of all the
data marts.
According to Inmon, DWH is a relational model
in 3rd normal form of the corporate data model.
Data marts source their info from the EDW
(Enterprise data warehouse)
There cannot be one clear definition of a data
warehouse, the content of the data mart highly
depends upon the complexities of the specific BI.
ODS = operational data store
SODA = staging/ODS/Archive
A data mart contains a subset of organization-wide data. This subset of data is valuable to a specific
community of knowledge workers.
For example, the marketing data mart may contain data related to products, customers and sales and will
be used by the marketing analysts.
The best way to model a data mart is a star schema or snowflake schema with a fact table surrounded by
dimension tables.
Kimball vs Inmon
- There is no right or wrong.
- There is no clear separation. The shared idea is dimensional modeling.
- Different data warehousing philosophies.
- DWH in most enterprises are closer to Ralph Kimball's idea. This is because most started out as a
departmental effort as a data mart
- Inmon’s solution takes time and money
JDK 2021 2
, Data models
Benefits of data models:
▪ Abstract data warehouse tables
▪ Simplify analysis for users
▪ Add business logic
▪ Pre-aggregate measures
▪ Provide a standard interface
2 types of models: Multidimensional & Tabular
Data Models are built with SQL Server Analysis Services. SSAS is an extra layer of metadata, or a
semantic model that sits on top of a data warehouse in a relational database.
The layer includes models containing the business logic of your data
A data model contains information about:
- how fact tables and dimension tables should be joined
- how measures should be aggregated
- how users should be able to explore the data through hierarchies
- the definitions of common calculations
End user applications query these models rather than the underlying database.
An LOB (Line of Business) application is one of the sets of critical computer applications that are vital to
running an enterprise. LOB applications are usually large programs that contain a number of integrated
capabilities and tie into databases and database management systems. For example: ERP – CRM
SSAS
• Tabular & multidimensional
• The concepts involved in designing the two types of model are very different, and you cannot
convert a Tabular database into a Multidimensional or vice versa, without rebuilding everything from
scratch
JDK 2021 3
The benefits of buying summaries with Stuvia:
Guaranteed quality through customer reviews
Stuvia customers have reviewed more than 700,000 summaries. This how you know that you are buying the best documents.
Quick and easy check-out
You can quickly pay through credit card or Stuvia-credit for the summaries. There is no membership needed.
Focus on what matters
Your fellow students write the study notes themselves, which is why the documents are always reliable and up-to-date. This ensures you quickly get to the core!
Frequently asked questions
What do I get when I buy this document?
You get a PDF, available immediately after your purchase. The purchased document is accessible anytime, anywhere and indefinitely through your profile.
Satisfaction guarantee: how does it work?
Our satisfaction guarantee ensures that you always find a study document that suits you well. You fill out a form, and our customer service team takes care of the rest.
Who am I buying these notes from?
Stuvia is a marketplace, so you are not buying this document from us, but from seller GraduateITF. Stuvia facilitates payment to the seller.
Will I be stuck with a subscription?
No, you only buy these notes for $8.48. You're not tied to anything after your purchase.