Summary Business Analytics & Decision Support (1BVK00) - ALL lectures and ALL reading material
Course
Business Analytics & Decision Support (1BVK00)
Institution
Technische Universiteit Eindhoven (TUE)
This summary covers all the lectures and reading material of the course in 2022. Pictures of the slides are not included because of TU/e regulations.
1BVK00 Summary (2021-2022)
Table of Contents
Introduction to Data Analytical Thinking
Business Problems & Data Science Solutions
Predictive Modeling: Fitting Model to Data
Overfitting and its Avoidance
Visualizing Predictive Model Performance
Similarity and Clustering
Evidence and Probabilities
Fuzzy Logic and Decision Making
Fuzzy Cognitive Maps and Decision Making
Interpretability of Decision Models
Introduction to Data Analytical Thinking
Business analytics & decision support provides:
1. Data-driven approach for each decision type
a. Strategic: unstructured, one-time decisions
b. Tactical: semi-structured, reporting decisions
c. Operational: structured, recurrent decisions
2. A structured way of dealing with the decision problems
• Data science: the practice of organizing and analyzing data to gain insights that may prove helpful for
human decision-making
- Interdisciplinary areas:
▪ Artificial intelligence: how computers and machines can demonstrate intelligent behavior
▪ Machine learning: a subcategory of AI that enables computer algorithms to automatically
learn from data
• Data-driven decision-making (DDD): the practice of basing decisions on the analysis of data, rather
than purely on intuition
- The more data-driven a firm is, the more productive it is
- Automatic DDD: automatic decision making done by computer systems
• Data, and the capability to extract useful knowledge from data, should be regarded as key strategic
assets
CRISP-DM methodology: Cross Industry Standard Process for Data Mining
• Iterates on approaches and strategy rather than on software design
• The results of a given step may change the fundamental understanding of the problem
Business understanding:
• Business objectives
• Success criteria (KPI)
• Project plan
• Deliverables
Data understanding:
• Initial data collection
• Data description
• Data exploration
Data preparation:
• Data cleaning
• Sampling
• Normalization
• Feature selection
Modeling:
• Select modeling techniques
• Build/train model
• Prediction
Evaluation:
• Model validation
• Performance metrics
• Visualization
• Review results
Deployment:
• Model in production
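As an illustration of the data preparation phase, the sketch below chains three of its steps: cleaning, sampling, and normalization. The function names, the `customers` records, and the choice of min-max scaling are assumptions made for this example; the course material does not prescribe a particular implementation.

```python
import random


def clean(rows):
    """Data cleaning: drop rows that contain a missing (None) value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def sample(rows, k, seed=0):
    """Sampling: draw up to k rows at random without replacement (seeded)."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


def min_max_normalize(rows, column):
    """Normalization: rescale one numeric column to the range [0, 1]."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    # If all values are equal, map them to 0.0 to avoid dividing by zero.
    return [{**r, column: (r[column] - lo) / span if span else 0.0} for r in rows]


# Hypothetical customer records; None marks a missing value.
customers = [
    {"age": 20, "spend": 100.0},
    {"age": 40, "spend": None},
    {"age": 60, "spend": 300.0},
    {"age": 30, "spend": 200.0},
]

prepared = min_max_normalize(clean(customers), "age")
subset = sample(prepared, 2)
```

Here `clean` removes the record with the missing `spend`, after which the remaining ages 20, 60, and 30 are rescaled to 0.0, 1.0, and 0.25.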
Big data: datasets that are too large for traditional data processing systems and therefore require new
processing technologies
• We are in the era of Big Data 1.0: firms are busy building the capabilities
to process large data
• Once Big Data 2.0 arrives, firms should begin asking ‘What can I now do that I couldn’t do before,
or do better than I could do before?’
Canonical data mining tasks
• Supervised: when a specific purpose or target is specified for grouping, and there is data on the
target
- Classification: determine which discrete category an example belongs to (categorical)
▪ Class probability estimation: model the probability that something will happen
- Regression: attempts to estimate or predict, for each individual, the numerical value of some
variable for that individual (numerical/probability)
- Causal modeling
• Unsupervised: when no specific purpose or target is specified for grouping
- Clustering: attempts to group individuals in a population together by their similarity, but not
driven by any specific purpose
- Co-occurrence grouping: attempts to find associations between entities based on transactions
involving them
- Profiling: attempts to characterize the typical behavior of an individual, group, or population
• Either supervised or unsupervised:
- Link prediction: attempts to predict connections between data items, usually by suggesting that a
link should exist, and possibly also estimating the strength of the link
- Similarity matching: attempts to identify similar individuals based on data known about them
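The difference between an unsupervised and a supervised task can be sketched with a small example. Similarity matching only needs the data known about individuals, whereas classification also needs a known target for each of them. The names, feature values, and `labels` below are hypothetical, and nearest-neighbour lookup with Euclidean distance is one simple choice assumed here, not the method of the course.

```python
import math


def most_similar(query, individuals):
    """Similarity matching (no target needed): name of the closest individual."""
    return min(individuals, key=lambda name: math.dist(query, individuals[name]))


def classify(query, individuals, labels):
    """Classification (target needed): reuse the label of the nearest individual."""
    return labels[most_similar(query, individuals)]


# Hypothetical individuals described by (age, monthly spend).
individuals = {
    "anna": (25, 1200.0),
    "ben": (52, 300.0),
    "cleo": (31, 1100.0),
}

# Hypothetical target variable, only required for the supervised task.
labels = {"anna": "churn", "ben": "stay", "cleo": "churn"}

query = (50, 350.0)
match = most_similar(query, individuals)
prediction = classify(query, individuals, labels)
```

For this query, `match` is `"ben"` and `prediction` is `"stay"`. The features are used unscaled to keep the sketch short; in practice they would usually be normalized first so that spend does not dominate the distance.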
There is another important distinction pertaining to mining data, namely
1. Mining the data to find patterns and build models
2. Using the results of data mining
Analytical techniques and technologies
• Statistics
- Summary statistics: the computation of particular numeric values of interest from data
- Statistics (the field): provides us with a huge amount of knowledge that underlies analytics and
can be thought of as a component of the larger field of Data Science
• Database querying
- Query: a specific request for a subset of data or for statistics about data, formulated in a
technical language and posed to a database system
• Data warehousing: collect and coalesce data from across an enterprise, often from multiple
transaction-processing systems, each with its own database
• Regression analysis: explanatory modeling and predictive modeling overlap considerably in
the techniques used, but the lessons learned from explanatory modeling do not all apply to predictive
modeling
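Database querying and summary statistics, two of the techniques listed above, can be illustrated together. This is a minimal sketch using Python's built-in `sqlite3` and `statistics` modules; the `orders` table and its contents are assumptions made up for this example.

```python
import sqlite3
import statistics

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("anna", 10.0), ("ben", 20.0), ("anna", 30.0), ("cleo", 40.0)],
)

# Query for a subset of the data: all order amounts of one customer.
anna_amounts = [
    row[0]
    for row in conn.execute(
        "SELECT amount FROM orders WHERE customer = ? ORDER BY amount", ("anna",)
    )
]

# Query for a statistic about the data, computed by the database itself.
avg_all = conn.execute("SELECT AVG(amount) FROM orders").fetchone()[0]

# Summary statistics computed in Python from the raw values.
all_amounts = [row[0] for row in conn.execute("SELECT amount FROM orders")]
mean_amount = statistics.mean(all_amounts)
median_amount = statistics.median(all_amounts)

conn.close()
```

Both routes give the same average here (25.0). A query answers a specific question that the analyst already has in mind, while data mining is used to find patterns that were not specified in advance.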
Business Problems & Data Science Solutions
Building good datasets
Garbage in, garbage out: poor-quality data leads to poor-quality mining results
Issues affecting data quality:
• Uniqueness
• Formats
• Attribute dependencies
• Missing values
• Invalid values
• Misfielded values
• Misspellings
How to detect these issues:
• Visualization: visualize all the values of each feature, or inspect a random sample to check that the values look right
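Some of these issues can also be flagged programmatically, as a complement to the visual inspection described above. The sketch below checks uniqueness, missing values, and invalid values on a handful of hypothetical records; the field names, the accepted age range, and the `VALID_COUNTRIES` set are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical set of accepted country codes.
VALID_COUNTRIES = {"NL", "BE", "DE"}

# Hypothetical records containing several quality issues.
records = [
    {"id": 1, "age": 34, "country": "NL"},
    {"id": 2, "age": None, "country": "BE"},  # missing value
    {"id": 2, "age": 51, "country": "DE"},    # duplicate id (uniqueness)
    {"id": 3, "age": -7, "country": "Nl"},    # invalid age, misspelled code
]


def quality_report(rows):
    """Return the ids of records affected by each type of issue."""
    id_counts = Counter(r["id"] for r in rows)
    return {
        "duplicate_ids": sorted(i for i, n in id_counts.items() if n > 1),
        "missing_age": [r["id"] for r in rows if r["age"] is None],
        "invalid_age": [
            r["id"]
            for r in rows
            if r["age"] is not None and not 0 <= r["age"] <= 120
        ],
        "unknown_country": [
            r["id"] for r in rows if r["country"] not in VALID_COUNTRIES
        ],
    }


report = quality_report(records)
```

The resulting report lists id 2 as both duplicated and missing an age, and id 3 as having an invalid age and an unrecognized country code.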