Written for
Tilburg University (UVT)
MSc. Strategic Management
Strategy Analytics
Seller: ayra1999
Data Science for Business Book Summary
Strategy Analytics – Chapter 1
Data mining is used in customer relationship management to analyze customer
behaviour in order to manage attrition and maximize expected customer value.
Data science: a set of fundamental principles that guide the extraction of knowledge from
data.
Data mining: the extraction of knowledge from data, via technologies that incorporate these
principles.
Churn: customers switching from one company to another.
The ultimate goal of data science is improving decision making.
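DDD (data-driven decision making): basing decisions on the analysis of data rather than
purely on intuition. DDD is associated with higher productivity: the more data-driven a
firm is, the more productive it tends to be.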
The decisions that we are interested in fall into two types:
1. Decisions for which ‘discoveries’ need to be made within data.
2. Decisions that repeat, especially at a massive scale, and so decision-making can
benefit from even small increases in decision-making accuracy based on data
analysis.
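A rough worked example of the second type of decision, as a minimal sketch. The function name and all numbers are hypothetical and only illustrate why a small accuracy gain matters when a decision repeats at massive scale.

```python
def expected_gain(decisions_per_year, accuracy_gain, value_per_correct_decision):
    """Extra value per year from making a repeated decision slightly more accurately."""
    return decisions_per_year * accuracy_gain * value_per_correct_decision


# Hypothetical numbers: 1,000,000 retention decisions per year, accuracy up by
# one percentage point, each additional correct decision worth 20 (any currency).
gain = expected_gain(1_000_000, 0.01, 20.0)
print(round(gain))
```

With these assumed numbers, a one-percentage-point improvement is worth about 200,000 per year, which is why such decisions are attractive targets for data analysis.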
Big data: datasets that are too large for traditional data processing systems, and therefore
require new processing technologies. Using big data technologies is associated with
significant additional productivity growth.
One of the fundamental principles of data science: data, and the capability to extract useful
knowledge from data, should be regarded as key strategic assets. Having the right data and
having a team that can analyze it are complementary: each is of limited value without the
other.
The fundamental concepts of data science
A. Extracting useful knowledge from data to solve business problems can be treated
systematically by following a process with reasonably well-defined stages.
B. From a large mass of data, information technology can be used to find informative
descriptive attributes of entities of interest.
C. If you look too hard at a set of data, you will find something – but it might not
generalize beyond the data you are looking at.
D. Formulating data mining solutions and evaluating the results involves thinking
carefully about the context in which they will be used.
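Concept C can be illustrated with a small simulation. This is a sketch under stated assumptions: the data is pure random noise generated here, and the sample sizes and seed are arbitrary choices, not values from the book.

```python
import random


def accuracy(feature, labels):
    """Share of rows where a binary feature equals the binary label."""
    return sum(f == y for f, y in zip(feature, labels)) / len(labels)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows, n_features = 30, 500

# Labels and features are independent coin flips: there is nothing real to find.
labels = [rng.randint(0, 1) for _ in range(n_rows)]
features = [[rng.randint(0, 1) for _ in range(n_rows)] for _ in range(n_features)]

# "Looking too hard": keep the feature that best matches the labels in this sample.
best = max(range(n_features), key=lambda j: accuracy(features[j], labels))
in_sample = accuracy(features[best], labels)

# On fresh data the same feature is again a coin flip, so the pattern does not carry over.
new_labels = [rng.randint(0, 1) for _ in range(10_000)]
new_feature = [rng.randint(0, 1) for _ in range(10_000)]
out_of_sample = accuracy(new_feature, new_labels)

print(f"in-sample: {in_sample:.2f}, new data: {out_of_sample:.2f}")
```

The best of 500 noise features looks predictive on the 30 rows it was selected on, but on new data it scores close to 0.5, i.e. no better than guessing.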
Strategy Analytics – Chapter 2
Fundamental concepts: a set of canonical data mining tasks; the data mining process;
supervised versus unsupervised data mining.
Data mining is a process with fairly well-understood stages.
Many different algorithms exist, but they address only a small number of fundamentally
different types of tasks:
1. Classification and class probability estimation: attempt to predict, for each individual
in a population, which of a (small) set of classes this individual belongs to. The
classes are mutually exclusive.
2. Regression: attempts to estimate/predict for each individual, the numerical value of
some variable for that individual. Regression is related to classification, but
classification predicts whether something will happen, whereas regression predicts
how much something will happen.
3. Similarity matching attempts to identify similar individuals based on data known
about them.
4. Clustering attempts to group individuals in a population together by their similarity,
but not driven by any specific purpose.
5. Co-occurrence grouping (aka frequent itemset mining, association rule discovery and
market-basket analysis) attempts to find associations between entities based on
transactions involving them.
6. Profiling (aka behaviour description) attempts to characterize the typical behaviour
of an individual, group or population.
7. Link prediction attempts to predict connections between data items, usually by
suggesting that a link should exist and possibly also estimating its strength.
8. Data reduction attempts to take a large set of data and replace it with a smaller set
of data that contains much of the important information in the larger set. It usually
involves a loss of information, traded off against improved insight.
9. Causal modelling attempts to help us understand what events or actions influence
others.
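As one concrete instance from the list above, co-occurrence grouping can be sketched by counting how often pairs of items appear in the same transaction. The baskets and item names below are made up for illustration, and plain pair counting is only the simplest form of the idea, not a full association-rule algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions (market baskets); item names are invented.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
    {"milk", "bread"},
]

pair_counts = Counter()
for basket in transactions:
    # Sort first so that each pair is always counted under the same key.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support of a pair = share of all transactions that contain both items.
support = {pair: n / len(transactions) for pair, n in pair_counts.items()}
top_pair, top_support = max(support.items(), key=lambda kv: kv[1])
print(top_pair, top_support)
```

Here bread and butter appear together in 3 of the 5 baskets, so this pair has the highest support (0.6).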
Unsupervised: when there is no specific purpose or target specified for the grouping.
Supervised: a specific target is defined, e.g. ‘will a customer leave when her contract
expires?’ A second condition for supervised data mining is that there must be data on the
target.
Classification, regression, and causal modelling are generally solved with supervised
methods. Similarity matching, link prediction, and data reduction could be either.
Clustering, co-occurrence grouping, and profiling are generally unsupervised.
Two main subclasses of supervised data mining – classification and regression – are
distinguished by the type of target.
Regression – numerical target (How much will this customer use the service?)
Classification – categorical (often binary) target (Will this customer purchase S1 if given
incentive 1?)
In business applications, we often want a numerical prediction over a categorical target,
e.g. the probability that a customer will churn rather than a plain yes/no answer.
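The difference between a categorical prediction and a numerical prediction over a categorical target can be sketched as follows. The customer history, the segment names and the 0.5 threshold are all hypothetical, and the frequency-based estimate merely stands in for whatever model would be used in practice.

```python
# Hypothetical labeled history: (contract type, churned?) for each customer.
history = [
    ("monthly", True), ("monthly", True), ("monthly", False), ("monthly", True),
    ("yearly", False), ("yearly", False), ("yearly", True), ("yearly", False),
]

counts = {}  # segment -> [number churned, total customers]
for segment, churned in history:
    tally = counts.setdefault(segment, [0, 0])
    tally[0] += int(churned)
    tally[1] += 1


def churn_probability(segment):
    """Numerical prediction: estimated probability of the categorical target."""
    churned, total = counts[segment]  # raises KeyError for an unseen segment
    return churned / total


def will_churn(segment, threshold=0.5):
    """Categorical prediction derived from the probability estimate."""
    return churn_probability(segment) > threshold


print(churn_probability("monthly"), will_churn("monthly"))
print(churn_probability("yearly"), will_churn("yearly"))
```

The probability (0.75 versus 0.25 in this toy data) lets customers be ranked and targeted by risk, while the yes/no answer discards that information.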