Analytics in Accounting & Financial Management
Lecture 1 – Introduction and basic concepts
I. A large manufacturing company considers using more data analytics in its accounting and finance
processes. Give two examples of benefits the company could derive from doing so.
II. Describe the difference between supervised and unsupervised analytics.
Databases and data management
III. Erasmus University sets up a data management system to store all proprietary data used by its
researchers in published studies, accessible to external researchers upon request.
Describe three desirable attributes of such a data management system.
Relational and non-relational databases
IV. Describe two benefits of relational databases.
V. Evaluate the following SQL query and describe in words what it does: …
VI. Evaluate the following SQL query and describe whether it correctly calculates average EPS per
industry: …
Benefits of data analytics in AFM
• More optimal use of available data
• Better insight into the business, its environment, and its competition
• Improved financial and accounting decisions
• Faster decision making
• Improved cost or process efficiency
• Improved forecasting
• Reduced risks (e.g., fraud risk, financial risk, collection risk, …)
> This course: primarily question-driven, mix of supervised and unsupervised
The 3Vs (volume, variety, and velocity) are three defining properties or dimensions of big data. Volume
refers to the amount of data, variety to the number of types of data, and velocity to the speed at
which data are generated and processed.
Supervised learning uses labelled datasets, whereas unsupervised learning uses unlabelled datasets.
Supervised learning --> labelled datasets are used to train or “supervise” algorithms to accurately
classify data or predict outcomes. Using labelled inputs and outputs, the model can measure its
accuracy and learn over time.
Unsupervised learning --> these algorithms discover hidden patterns in data without the need for
human intervention (hence, they are “unsupervised”).
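A minimal sketch of the difference, not taken from the lecture. The invoice amounts, the fraud labels, and the helper names (fit_threshold, predict, two_means) are all hypothetical. The supervised part learns a cut-off from labelled examples and assumes that class 1 has the higher values; the unsupervised part groups the same amounts into two clusters without ever seeing the labels.

```python
from statistics import mean

# Hypothetical invoice amounts; labels mark known fraud cases (1) and normal cases (0).
amounts = [100, 120, 90, 110, 900, 950, 1000, 870]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_threshold(values, targets):
    """Supervised: learn a cut-off from labelled examples.

    Assumes class 1 has the higher values on average.
    """
    mean_0 = mean(v for v, y in zip(values, targets) if y == 0)
    mean_1 = mean(v for v, y in zip(values, targets) if y == 1)
    return (mean_0 + mean_1) / 2  # midpoint between the two class means


def predict(threshold, value):
    """Classify a new value with the learned cut-off."""
    return 1 if value > threshold else 0


def two_means(values, iterations=10):
    """Unsupervised: split unlabelled values into two clusters (1-D k-means, k = 2)."""
    centers = [min(values), max(values)]  # deterministic starting points
    clusters = ([], [])
    for _ in range(iterations):
        clusters = ([], [])
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        # Recompute each center; keep the old one if a cluster is empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


threshold = fit_threshold(amounts, labels)   # uses the labels
centers, clusters = two_means(amounts)       # never sees the labels
print(threshold, predict(threshold, 800), centers)
```

In this toy data the clusters happen to coincide with the labelled classes, but the unsupervised result only says "there are two groups"; interpreting a group as "fraud" still requires outside knowledge.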
Data-driven reasoning involves reasoning from the data to a hypothesis, whereas hypothesis-driven
reasoning involves using a hypothesis to explain the data.
Data: Organizational data management systems
Features
• Storage medium
• Common structure
• Interface for rapid entry and retrieval
• Trade-offs needed
Types
• Transaction Processing (TPS)
• Management Information (MIS)
• Decision Support (DSS)
• Business Intelligence (BI)
• Online Analytical Processing (OLAP)
• Data Mining (DM)
• Machine Learning (ML)
Types and each system’s purpose
• TPS (Transaction processing system): Collect and store data from routine transactions
• MIS (Management information system): Convert data from a TPS into information for planning, controlling, and managing an organization
• DSS (Decision support system): Support managerial decision-making by providing models for processing and analyzing data
• BI (Business intelligence): Gather, store, and analyze data to improve decision making
• OLAP (Online analytical processing): Provide a multidimensional view of data
• DM (Data mining): Use statistical analysis and artificial intelligence techniques to identify hidden relationships in data
• ML (Machine learning): Use software to make decisions or recommendations traditionally made by humans
Data: Desirable attributes and problems of data management systems
Desirable attributes
• Shareable
• Transportable
• Secure
• Accurate
• Timely
• Relevant
Problems
• Redundancy
• Lack of data control
• Poor interface
• Delays
• Lack of reality
• Lack of data integration
Desirable attributes of data
• Shareable: Readily accessed by more than one person at a time
• Transportable: Easily moved to a decision-maker
• Secure: Protected from destruction and unauthorized use
• Accurate: Reliable, precise records
• Timely: Current and up-to-date
• Relevant: Appropriate to the decision
Problems
• Redundancy: The same data are stored in different systems
• Lack of data control: Data are poorly managed
• Poor interface: Data are difficult to access
• Delays: There are frequently delays following requests for reports
• Lack of reality: Data management systems do not reflect the complexity of the real world
• Lack of data integration: Data are dispersed across different systems
Creating a table in SQL
> Remember the primary data formats: character (text), Date, Time, Integer, Decimal, and
Boolean
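A minimal sketch of a CREATE TABLE statement that uses each of the data formats listed above, run through Python's built-in sqlite3 module so it can be tried without installing a database server. The table name invoice and all column names and values are hypothetical. Note that SQLite accepts these declared types but stores values by type affinity (it has no native date, time, or Boolean type), so other database systems may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database

# One column per primary data format (hypothetical invoice table).
conn.execute("""
    CREATE TABLE invoice (
        invoice_id    INTEGER PRIMARY KEY,   -- integer
        customer_name VARCHAR(50) NOT NULL,  -- character (text)
        invoice_date  DATE,                  -- date
        invoice_time  TIME,                  -- time
        amount        DECIMAL(10, 2),        -- decimal
        is_paid       BOOLEAN                -- Boolean (stored as 0 or 1 in SQLite)
    )
""")

conn.execute(
    "INSERT INTO invoice VALUES (?, ?, ?, ?, ?, ?)",
    (1, "Acme BV", "2024-01-15", "09:30:00", 1250.50, 1),
)

# Read back the declared column names and types, then one row of data.
columns = [(r[1], r[2]) for r in conn.execute("PRAGMA table_info(invoice)")]
row = conn.execute("SELECT customer_name, amount, is_paid FROM invoice").fetchone()
print(columns)
print(row)
```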
Joining tables in SQL - types
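A minimal sketch of two common join types, again using Python's sqlite3 module. The customer and invoice tables and their contents are hypothetical. An INNER JOIN keeps only rows with a match in both tables; a LEFT JOIN keeps every row of the left table and fills in NULL (None in Python) where no match exists. RIGHT and FULL OUTER joins follow the same idea but are left out here because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "Acme"), (2, "Bolt"), (3, "Cargo")],  # Cargo has no invoices
)
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [(10, 1, 500.0), (11, 1, 250.0), (12, 2, 100.0)],
)

# INNER JOIN: only customers that have at least one invoice.
inner = conn.execute("""
    SELECT c.name, i.amount
    FROM customer AS c
    INNER JOIN invoice AS i ON i.customer_id = c.customer_id
    ORDER BY c.customer_id, i.invoice_id
""").fetchall()

# LEFT JOIN: all customers; amount is NULL when there is no invoice.
left = conn.execute("""
    SELECT c.name, i.amount
    FROM customer AS c
    LEFT JOIN invoice AS i ON i.customer_id = c.customer_id
    ORDER BY c.customer_id, i.invoice_id
""").fetchall()

print(inner)
print(left)
```

The only difference between the two results is the extra row for the customer without invoices, which is exactly what separates an inner join from a left (outer) join.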
Data lakes and warehouses support analytical processing
• Focus: BI, OLAP, DM, ML
Model-driven storage --> Data warehouse --> Data mart
Data-driven storage --> Data lake --> Data reservoir
Exploring database types
- When data can be sufficiently structured, the relational model, with SQL as the relational database
language, is popular. Some benefits:
o Scalability
o Fast processing
o Often embedded into other programming languages
- Understanding SQL helps you improve your intuition for data wrangling
- With less structure in the data (e.g., varying attributes in a product database), NoSQL or
non-relational databases avoid using a fixed data model.
In an SQL (‘structured query language’) context, this table (‘worksheet’) is a ‘relation’.
• Columns (attributes) have unique names; rows (tuples) have unique identities
• The number of columns (rows) = the degree (cardinality) of the relation
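A minimal sketch of these terms on a small, hypothetical relation called firm (the table, its columns, and its values are made up for illustration). The degree is read from the number of columns a query returns, the cardinality from a row count, and the primary key is what gives each row its unique identity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE firm (firm_id INTEGER PRIMARY KEY, industry TEXT, eps REAL)")
conn.executemany(
    "INSERT INTO firm VALUES (?, ?, ?)",
    [(1, "Retail", 2.10), (2, "Retail", 1.40), (3, "Energy", 3.75), (4, "Energy", 0.90)],
)

# Degree = number of columns (attributes) of the relation.
cursor = conn.execute("SELECT * FROM firm")
degree = len(cursor.description)

# Cardinality = number of rows (tuples) of the relation.
cardinality = conn.execute("SELECT COUNT(*) FROM firm").fetchone()[0]

# Unique row identity: a second row with firm_id = 1 is rejected.
try:
    conn.execute("INSERT INTO firm VALUES (1, 'Retail', 9.99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(degree, cardinality, duplicate_rejected)
```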
Non-relational databases
• Non-relational databases store data in a non-tabular form
o Avoids using a fixed data model
o Leading to the existence of many different types
• Non-relational databases gained relevance over time because of
o The decreasing cost of storage
o The increasing amounts and forms of data
• Main advantages: flexibility, scalability, fast querying
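A minimal sketch of the "no fixed data model" idea, using plain Python dictionaries and JSON rather than a real NoSQL product. The product records and the find helper are hypothetical. Each record (document) carries its own set of attributes, which would require many empty columns or extra tables in a relational design.

```python
import json

# Each document has its own attributes; there is no shared, fixed schema.
products = [
    {"sku": "A1", "type": "laptop", "ram_gb": 16, "screen_inch": 14},
    {"sku": "B2", "type": "t-shirt", "size": "M", "colour": "blue"},
    {"sku": "C3", "type": "e-book", "pages": 320},
]


def find(documents, **criteria):
    """Return the documents whose fields match all given criteria."""
    return [d for d in documents if all(d.get(k) == v for k, v in criteria.items())]


# Querying works even though most documents lack the queried field.
medium_items = find(products, size="M")

# The union of all attributes shows how wide a single fixed table would have to be.
all_fields = sorted({key for doc in products for key in doc})

# Documents are commonly stored and exchanged as JSON text.
stored = json.dumps(products)
restored = json.loads(stored)

print(medium_items)
print(all_fields)
```

The trade-off is that nothing enforces consistency across documents: a misspelled attribute name is silently accepted, whereas a relational table would reject it.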