Data Analytics & Professional Skills
Summary of the Lectures
Given in: 2021/2022 | Summary by LouisT
Contents
Chapter 1. Introduction to Data Analytics
1.1 Course
1.2 Development
1.3 Managerial Decision Making
1.3.1 Decision-Making Process
1.4 Model
1.4.1 Benefits of Models
1.5 Data Vs Information
1.6 Business Analytics
1.7 Big Data
1.8 What is a Data Scientist?
Chapter 2. Data Warehousing & Visual Analytics
2.1 What’s a Data Warehouse?
2.2 Why do we need databases?
2.3 Databases vs Data Warehouse
2.4 Data Mart
2.5 Data Extraction, Transformation, and Load (ETL)
2.6 OLTP VS OLAP
2.7 Data Lakes
2.8 Data Visualization
Chapter 3. Database Concepts and Data Modelling
3.1 Relational Database
3.2 Data Modelling
3.3 Entity-Relationship Modelling
3.3.1 Cardinalities
3.4 Developing E-R Diagrams
3.5 Dealing with Big Data
Chapter 4. Database Concepts and Data Modelling
4.1 Design Requirement
4.2 Implementing Relationships
4.3 SQL Overview (Structured Query Language)
Chapter 5. Data Mining
5.1 Data Mining Characteristics and Objectives
5.2 A Taxonomy for Data Mining Tasks
A Summary by LouisT on Stuvia.nl | Thank you for your purchase | Version 2.0 (Final)
5.3 Data Mining and Statistics
5.4 Training and Testing Classification Methods
5.5 Cluster Analysis
5.5.1 Why do we have Clustering?
Chapter 6. Process Mining
6.1 System Documentation
6.1.1 Why do we need System Documentation?
6.2 Process Mining (The Basics)
6.3 Process Mining (Terminology)
6.4 Process Mining (Input)
6.5 Process Mining (Algorithms)
6.6 Process Mining (Output)
Chapter 7. Text Mining
7.1 Data mining versus Text mining
7.2 Text Mining Tasks
7.3 Natural Language Processing (NLP)
7.4 Text Mining Process
7.4.1 Step 1: Establish the Corpus (Collect data)
7.4.2 Step 2: Create the term-by-document Matrix (TDM)
7.4.3 Step 3: Text Mining
Chapter 1. Introduction to Data Analytics
1.1 Course
The course is split into two parts: Data Analytics and Professional Skills.
The Data Analytics part is not very theoretical; it is more about applying the theory.
There will be two tutorials (DA) and three workshops (PS). These are mandatory, and not attending
means an automatic fail for the exam. For the workshops, it is advised to install the program
beforehand, as it is also important for the final exam. If you have any questions, the contact hours
are 5-6 PM on Wednesday and Thursday.
There will be an example exam before the final exam. The exam will consist of multiple-choice
questions and open questions.
Furthermore, group exercises will be given. You have to conceptualize and describe a data analysis
solution for a specific type of company and deliver it as a recorded group presentation, with
3 minutes for every group member.
Why do we need this course?
Big Data is causing information overload among decision-makers. Someone must therefore be able to
make sense of the data, using advanced data analytics techniques that are capable of dealing with
Big Data. People with this understanding are currently scarce in the accounting profession.
1.2 Development
Many developments have been made in technologies that are affecting accounting. These include
BI (business intelligence), data warehouses, and audit firms that have started using data analytics.
The result is a shift from traditional accounting to new accounting.
Thanks to these technological innovations, it is possible to have cloud-based services that are
accessible from anywhere and store everything digitally. Software has also taken over the majority
of transactions in bookkeeping activities.
1.3 Managerial Decision Making
Decision-making is nearly the same as management, since management is a process by which
organizational goals are achieved. Decision-making consists of selecting the best solution from two
or more alternatives. If management wants to make the right decision, which I bet it does, it
requires sufficient information.
1.3.1 Decision-Making Process
Normally, four steps are taken to make a decision: intelligence, design, choice, and
implementation. A quick explanation of these four steps:
1. Intelligence
First, we define the problem or opportunity. What decision we need to make, and why,
should be determined beforehand.
2. Design
After we understand what decision to make and why, we must construct a model that
describes the situation and find alternative solutions.
3. Choice
Here we compare and evaluate the alternatives and make the decision. Alternatively, we
advise the manager on what to do, and the manager then makes the decision.
4. Implementation
We implement the chosen solution.
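
The four steps above can be sketched in a few lines of Python. This is only an illustration and was not part of the lectures: the class `Alternative`, the function `choose`, the scoring rule (expected benefit minus expected cost), and all names and numbers are made-up assumptions.

```python
from dataclasses import dataclass


@dataclass
class Alternative:
    """One candidate solution from the design step (hypothetical fields)."""
    name: str
    expected_benefit: float
    expected_cost: float

    def net_benefit(self) -> float:
        # The "model": a deliberately simple scoring rule, assumed for illustration.
        return self.expected_benefit - self.expected_cost


def choose(alternatives: list[Alternative]) -> Alternative:
    """Choice step: compare the alternatives and select the best-scoring one."""
    if len(alternatives) < 2:
        raise ValueError("a decision needs two or more alternatives")
    return max(alternatives, key=lambda a: a.net_benefit())


# 1. Intelligence: the problem is defined beforehand (made-up example).
problem = "Month-end closing takes too long"

# 2. Design: model the situation and list alternative solutions (made-up numbers).
alternatives = [
    Alternative("Hire extra staff", expected_benefit=50.0, expected_cost=40.0),
    Alternative("Automate bookkeeping", expected_benefit=80.0, expected_cost=30.0),
    Alternative("Do nothing", expected_benefit=0.0, expected_cost=0.0),
]

# 3. Choice: compare and evaluate the alternatives.
best = choose(alternatives)

# 4. Implementation: here we only report the chosen solution.
print(f"{problem}: implement '{best.name}'")
```

In practice the model in the design step would be far richer than a single score, and the choice may be left to the manager, with the analysis serving only as advice.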