Business Intelligence and Data Management - Summary
Written for
Tilburg University (UVT)
Information Management
Business Intelligence & Data Management
Seller: marobo
Index
Articles BI DM (Week 1-3)
Art. 1 Database Management (1.1, 1.6, 2.1-2.3, 3.1, 3.2, 3.6, 5.1, 5.2)
1. Database systems
2. Data models
3. The relational database model
5. Normalization of the database tables
Art. 2 Data Warehouse Design
1. Introduction to Data Warehousing
1.1 Decision Support Systems
1.2 Data Warehousing
1.3 Data Warehouse Architectures
1.3.1 Single-Layer Architecture
1.3.2 Two-Layer Architecture
1.3.3 Three-Layer Architecture
1.3.4 An Additional Architecture Classification
Art. 3 Multidimensional Database Technology
BI DM Book (Week 4-11)
Part 1 Preliminaries
Chapter 1: Introduction
1.1 What is business analytics?
1.3 Data mining and related terms
1.4 Big data
1.7 Terminology and notation
Chapter 2: Overview of the data mining process
2.2 Core ideas in data mining
2.3 The steps in data mining
2.4 Preliminary steps
2.5 Predictive power and overfitting
2.8 Automating data mining solutions
Part 2 Data exploration and dimension reduction
Chapter 3: Data Visualization
3.1 Uses of data visualization
3.2 Data examples
3.3 Basic charts: bar charts, line graphs and scatter plots
3.4 Multidimensional visualization
3.5 Specialized visualizations
Chapter 4: Dimension Reduction
4.1 Introduction
4.2 Curse of Dimensionality
4.3 Practical considerations
4.4 Data summaries
Part 3 Performance evaluation
Chapter 5: Evaluating predictive performance
5.1 Introduction
5.2 Evaluating predictive performance
5.3 Judging classifier performance
5.4 Judging ranking performance
5.5 Oversampling
Part 4 Prediction and classification methods
Chapter 6: Multiple linear regression
6.1 Introduction
6.2 Explanatory vs. predictive modelling
6.3 Estimating the regression equation and prediction
6.4 Variable selection in linear regression
Chapter 7: k-Nearest-neighbours (k-NN)
7.1 The k-NN classifier (categorical outcome)
7.2 k-NN for a numerical response
7.3 Advantages and shortcomings of k-NN algorithms
Chapter 8: The Naïve Bayes classifier
8.1 Introduction
8.2 Applying the full (exact) Bayesian classifier
8.3 Advantages and shortcomings of the Naïve Bayes classifier
Chapter 9: Classification and Regression Trees
9.1 Introduction
9.2 Classification trees
9.3 Evaluating the performance of a classification tree
9.4 Avoiding overfitting
9.5 Classification rules from trees
9.6 Classification trees for more than two classes
9.7 Regression trees
9.8 Advantages, weaknesses, and extensions
9.9 Improving prediction: multiple trees
Part 5 Mining relationships among records
Chapter 14: Association rules and collaborative filtering
14.1 Association rules
14.2 Collaborative filtering
14.3 Summary
Chapter 15: Cluster analysis
15.1 Introduction
15.2 Measuring distance between two observations
15.3 Measuring distance between two clusters
15.4 Hierarchical (agglomerative) clustering
15.5 Non-hierarchical clustering: the k-means algorithm
Articles BI DM (Week 1-3)
Art. 1 Database Management (1.1, 1.6, 2.1-2.3, 3.1, 3.2, 3.6, 5.1, 5.2)
1. Database systems
1.1 Data vs. Information
Data are raw facts. The word raw indicates that the facts have not yet been processed to reveal
their meaning. Keep in mind that raw data must be properly formatted for storage, processing,
and presentation.
Information is the result of processing raw data to reveal its meaning. To reveal meaning,
information requires context.
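As a small illustration of the distinction, the sketch below turns a handful of raw facts into information by adding context (a total per region). The sales records, the region names, and the function name total_units_by_region are all hypothetical and serve only as an example.

```python
from collections import defaultdict

# Hypothetical raw data: (region, units_sold) facts that have not been processed yet.
raw_data = [
    ("North", 120),
    ("South", 80),
    ("North", 100),
    ("South", 60),
]


def total_units_by_region(records):
    """Process raw facts into information: total units sold per region."""
    totals = defaultdict(int)
    for region, units in records:
        totals[region] += units
    return dict(totals)


information = total_units_by_region(raw_data)
print(information)  # {'North': 220, 'South': 140}
```

The individual tuples say little on their own; the per-region totals are the kind of processed result a decision maker can act on.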
Data are the foundation of information, which is the bedrock of knowledge—that is, the body of
information and facts about a specific subject. Knowledge implies familiarity, awareness, and
understanding of information as it applies to an environment. A key characteristic of knowledge
is that “new” knowledge can be derived from “old” knowledge.
Let’s summarize some key points:
Data constitute the building blocks of information.
Information is produced by processing data.
Information is used to reveal the meaning of data.
Accurate, relevant, and timely information is the key to good decision making.
Good decision making is the key to organizational survival in a global environment.
Data management is a discipline that focuses on the proper generation, storage, and retrieval of
data.
1.6 Database systems
Unlike the file system, with its many separate and unrelated files, the database system consists
of logically related data stored in a single logical data repository. (The “logical” label reflects the
fact that, although the data repository appears to be a single unit to the end user, its contents
may actually be physically distributed among multiple data storage facilities and/or locations.)
The current generation of DBMS software stores not only the data structures, but also the
relationships between those structures and the access paths to those structures, all in a central
location. The DBMS also takes care of defining, storing, and managing all required access paths
to those components.
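A minimal sketch of this idea, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in DBMS: the table names customer and invoice and their columns are hypothetical. The point is that the DBMS itself records the table structures and the relationship between them, and enforces that relationship.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with two logically related tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this setting is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        """CREATE TABLE invoice (
               invoice_id INTEGER PRIMARY KEY,
               customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
               amount REAL NOT NULL)"""
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
    conn.execute("INSERT INTO invoice VALUES (10, 1, 99.5)")
    return conn


conn = build_demo_database()

# The data structures are stored by the DBMS and can be queried like any other data.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# The relationship between the structures is stored as well.
fk = conn.execute("PRAGMA foreign_key_list(invoice)").fetchone()
referenced_table = fk[2]  # the table that invoice.customer_id points to

# Because the DBMS knows the relationship, it rejects an invoice for a missing customer.
try:
    conn.execute("INSERT INTO invoice VALUES (11, 999, 5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(tables, referenced_table, rejected)
```

In a file system of separate, unrelated files, a check like the rejected insert would have to be written by hand in every application program.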
1.6.1 The database system environment
Database system refers to an organization of components that define and regulate the
collection, storage, management, and use of data within a database environment.
From a general management point of view, the database system is composed of five major
parts: hardware, software, people, procedures, and data.
1. Hardware: Hardware refers to all of the system’s physical devices.
2. Software: Although the most readily identified software is the DBMS itself, to make the database
system function fully, three types of software are needed: operating system software, DBMS
software, and application programs and utilities.
o Operating system software manages all hardware components and makes it possible for
all other software to run on the computers.