Summary
Strategy Analytics
Data Science for Business
Foster Provost & Tom Fawcett
Chapter 1 Data-Analytic Thinking
Data science – a set of fundamental principles that guide the extraction of knowledge from
data.
Data mining – the extraction of knowledge from data, via technologies that incorporate these
principles. Data mining techniques provide some of the clearest illustrations of the principles
of data science.
The figure places data science in the context of various other closely related and data-related
processes in the organization. It distinguishes data science from other aspects of data
processing that are gaining increasing attention in business.
Data-Driven Decision Making (DDD) – refers to the practice of basing decisions on the analysis
of data, rather than purely on intuition. DDD is not an all-or-nothing practice; different firms
engage in DDD to greater or lesser degrees.
Two sorts of decisions:
1. decisions for which discoveries need to be made within data
2. decisions that repeat, especially at massive scale
The figure shows data science supporting DDD but also overlapping with DDD. This highlights
the often overlooked fact that, increasingly, business decisions are being made automatically
by computer systems.
There is a difference between data science and data-driven businesses.
Data science – needs access to data, and it often benefits from sophisticated data engineering
that data-processing technologies may facilitate, but these technologies are not data science
technologies per se. They support data science (as shown in the figure), but they are useful
for much more.
Data processing technologies – very important for many data-oriented business tasks that do
not involve extracting knowledge or DDD.
Big data – means datasets that are too large for traditional data processing systems, and
therefore require new processing technologies. Big data technologies are usually used for
implementing data mining techniques and for data processing in support of the data mining
techniques.
Big data 1.0 – firms are busying themselves with building the capabilities to process large
data, largely in support of their current operations, for example to improve efficiency.
Big data 2.0 – firms started to look further: they began to ask what data could let them do
that they could not do before, or do better than before. We entered Big data 2.0 when new
systems and companies began taking advantage of the interactive nature of the Web. The
most obvious examples are the incorporation of social networking components and the rise
of the 'voice' of the individual consumer.
Data and data science capability as a strategic asset
Data, and the capability to extract useful knowledge from data, should be regarded as key
strategic assets.
Often, we don’t have exactly the right data to make the best decisions and/or the right talent
to best support making decisions from the data. Thinking of these as assets should lead us to
the realization that they are complementary, and that it is often necessary to make
investments in them.
Fundamental concepts of Data science
- Extracting useful knowledge from data to solve business problems can be treated
systematically by following a process with reasonably well-defined stages
o The Cross Industry Standard Process for Data Mining (CRISP-DM) is one
codification of this process. Keeping this in mind provides a framework to
structure our thinking about data analytics problems.
- From a large mass of data, information technology can be used to find informative
descriptive attributes of entities of interest
o For example, a customer would be an entity of interest, and each customer
might be described by a large number of attributes, such as service history. But
how much information is needed? You need to find variables that correlate
with the outcome of interest. A business analyst might be able to hypothesize
and test, and there are tools to facilitate the experimentation.
- If you look too hard at a set of data, you will find something – but it might not
generalize beyond the data you’re looking at
o This is referred to as overfitting a dataset. Data mining techniques are very
powerful, and the need to detect and avoid overfitting is one of the most
important concepts to grasp when applying data mining to real problems.
- Formulating data mining solutions and evaluating the results involves thinking
carefully about the context in which they will be used
o If your goal is the extraction of potentially useful knowledge, how can we
formulate what is useful? It depends on the application in question.
Chapter 2 Business Problems and Data
Science Solutions
Data scientists decompose a business problem into several subtasks. The solutions to the
subtasks can then be composed to solve the overall problem. Some subtasks are unique to
the business problem, but others are common data mining tasks.
Despite the large number of specific data mining algorithms developed over the years, there
are only a handful of fundamentally different types of tasks these algorithms address.
1. Classification and class probability estimation attempt to predict, for each individual in a
population, which of a (small) set of classes this individual belongs to. Usually the classes are
mutually exclusive. For example, ‘Among all the customers of MegaTelCo, which are likely to
respond to a given offer?’ Two classes could be called will respond and will not respond.
A data mining procedure produces a model that, given a new individual, determines which
class that individual belongs to. A closely related task is class probability estimation: a scoring
model applied to an individual produces, instead of a class prediction, a score representing
the probability that the individual belongs to each class (e.g., a score of how likely each
customer is to respond to the offer).
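A minimal sketch of a scoring model, with hypothetical customers and hand-picked (not learned) weights: instead of predicting "will respond" / "will not respond" directly, it assigns each customer a probability of responding, which can then be thresholded into a class.

```python
# Scoring-model sketch (hypothetical data and illustrative weights):
# a logistic-style score in [0, 1] interpreted as P(will respond).
import math

def respond_score(tenure_months, past_offers_accepted):
    """Return a response probability; the weights are illustrative, not learned."""
    z = -2.0 + 0.05 * tenure_months + 1.2 * past_offers_accepted
    return 1 / (1 + math.exp(-z))

customers = {"alice": (24, 2), "bob": (6, 0)}
scores = {name: respond_score(*feats) for name, feats in customers.items()}
predictions = {name: ("will respond" if s >= 0.5 else "will not respond")
               for name, s in scores.items()}
```

Keeping the score rather than only the hard class label lets the business rank customers and target only the most promising ones.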
2. Regression (value estimation) attempts to estimate or predict, for each individual, the
numerical value of some variable for that individual. An example: ‘How much will a given
customer use the service?’. The variable to be predicted here is service usage and a model
could be generated by looking at other, similar individuals in the population and their
historical usage.
Regression is related to classification, but the two are different. Classification predicts
whether something will happen, whereas regression predicts how much something will
happen.
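The "look at other, similar individuals and their historical usage" idea can be sketched directly. The customers, ages, and usage numbers below are hypothetical; the estimate is just the average usage of the k most similar (here: closest in age) known customers.

```python
# Value-estimation sketch (hypothetical data): predict a customer's service
# usage as the average usage of the k customers most similar to them.

def predict_usage(known, age, k=2):
    """Average the usage of the k known customers closest in age."""
    nearest = sorted(known, key=lambda c: abs(c[0] - age))[:k]
    return sum(usage for _, usage in nearest) / k

# (age, monthly_usage_hours) for customers with known history
known = [(25, 40.0), (30, 35.0), (45, 20.0), (50, 15.0)]
estimate = predict_usage(known, age=28)
```

Note the contrast with classification: the output is a numeric amount (hours of usage), not a class label.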
3. Similarity matching attempts to identify similar individuals based on data known about
them. It can be used directly to find similar entities. For example, IBM is interested in finding
companies similar to their best business customers, in order to focus their sales on the best
opportunities. They use similarity matching based on ‘firmographic’ data describing
characteristics of the companies. Similarity matching is also the basis for one of the most
popular methods for making product recommendations.
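A sketch of similarity matching on made-up "firmographic" vectors: represent each company by numeric attributes and find the prospect closest (by Euclidean distance) to a reference best customer. Company names and attributes are hypothetical.

```python
# Similarity-matching sketch (hypothetical firmographic vectors, e.g.
# (log revenue, employees in thousands)): find the most similar company.
import math

def most_similar(reference, candidates):
    """Return the candidate name whose attribute vector is closest to reference."""
    return min(candidates,
               key=lambda name: math.dist(candidates[name], reference))

best_customer = (8.0, 12.0)
prospects = {"AcmeCo": (7.9, 11.5), "TinyLLC": (3.0, 0.2), "MegaInc": (10.5, 90.0)}
closest = most_similar(best_customer, prospects)
```

Real firmographic matching would use many more attributes and a carefully chosen distance, but the structure is the same.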
4. Clustering attempts to group individuals in a population together by their similarity, but not
driven by any specific purpose. An example, ‘Do our customers form natural groups or
segments?’ Clustering is useful in preliminary domain exploration to see which natural groups
exist because these groups in turn may suggest other data mining tasks or approaches.
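A bare-bones clustering sketch on hypothetical one-dimensional data: a minimal k-means that groups customers by monthly spend. Note there is no target variable anywhere, only similarity, which is what makes this unsupervised.

```python
# Clustering sketch (hypothetical 1-D spend data): minimal k-means.

def kmeans_1d(values, centers, iterations=10):
    """Repeatedly assign points to the nearest center and recompute centers."""
    for _ in range(iterations):
        groups = {c: [] for c in centers}
        for v in values:
            nearest = min(centers, key=lambda c: abs(c - v))
            groups[nearest].append(v)
        centers = [sum(g) / len(g) for g in groups.values() if g]
    return sorted(centers)

spend = [10, 12, 11, 95, 100, 98]
centers = kmeans_1d(spend, centers=[0.0, 50.0])
```

The two centers that emerge suggest two natural spend segments, which might in turn prompt a follow-up supervised task (e.g., predicting which segment a new customer falls into).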
5. Co-occurrence grouping (market-basket analysis) attempts to find associations between
entities based on transactions involving them. For example, ‘What items are commonly
purchased together?’ Co-occurrence considers similarity of objects based on their appearing
together in transactions; it could suggest special promotions, product displays, or
combination offers.
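The core of market-basket analysis is counting how often items appear together. A minimal sketch on hypothetical transactions:

```python
# Co-occurrence sketch (hypothetical baskets): count item pairs that
# appear in the same transaction, independent of order.
from collections import Counter
from itertools import combinations

def pair_counts(transactions):
    """Count co-occurring item pairs across all baskets."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

baskets = [["beer", "chips"], ["beer", "chips", "salsa"], ["bread", "chips"]]
counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
```

A real system would normalize these raw counts (e.g., into support and lift) before recommending a promotion, but the pair counting is the starting point.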
6. Profiling (behavior description) attempts to characterize the typical behavior of an
individual, group, or population. An example, ‘What is the typical cell phone usage of this
customer segment?’ Behavior can be described generally over an entire population, or down
to the level of small groups or even individuals.
Profiling is often used to establish behavioral norms for anomaly detection applications such
as fraud detection. For example, if we know what kind of purchases a person typically makes
on a credit card, we can determine whether a new charge on the card fits that profile or not.
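The fraud example can be sketched with a very simple behavioral norm on hypothetical charge history: summarize past charges by their mean and standard deviation, then flag a new charge that deviates from the norm by more than three standard deviations.

```python
# Profiling / anomaly-detection sketch (hypothetical charge history):
# a charge far outside the customer's established norm is flagged.
import statistics

def is_anomalous(history, new_charge, threshold=3.0):
    """Flag new_charge if it lies more than `threshold` std devs from the mean."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return abs(new_charge - mean) > threshold * stdev

past_charges = [20.0, 25.0, 22.0, 30.0, 18.0, 27.0]
normal = is_anomalous(past_charges, 24.0)
suspicious = is_anomalous(past_charges, 500.0)
```

Real fraud profiles describe much richer behavior (merchants, times, locations), but the pattern is the same: characterize the typical, then measure deviation from it.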
7. Link prediction attempts to predict connections between data items, usually by suggesting
that a link should exist, and possibly also estimating the strength of the link. For example,
‘Since you and Karen share 10 friends, maybe you’d like to be Karen’s friend?’
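The friend-suggestion example maps onto the simplest link-prediction score: the number of neighbors two nodes share. The friendship graph below is hypothetical.

```python
# Link-prediction sketch (hypothetical friendship graph): score a possible
# link between two people by how many friends they share.

def shared_friends(graph, a, b):
    """Number of common neighbors of a and b."""
    return len(graph.get(a, set()) & graph.get(b, set()))

graph = {
    "you":   {"ann", "ben", "cem", "dee"},
    "karen": {"ann", "ben", "cem", "eva"},
    "liam":  {"eva"},
}
score_karen = shared_friends(graph, "you", "karen")
score_liam = shared_friends(graph, "you", "liam")
suggestion = "karen" if score_karen > score_liam else "liam"
```

The common-neighbor count serves both purposes the text mentions: suggesting that a link should exist and estimating its strength.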
8. Data reduction attempts to take a large set of data and replace it with a smaller set of data
that contains much of the important information in the larger set. The smaller data set may
be easier to deal with or to process.
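One common form of data reduction is aggregation. A sketch on a hypothetical usage log: replace many per-day records with one summary row per customer, keeping the information most analyses need while shrinking the dataset.

```python
# Data-reduction sketch (hypothetical usage log): collapse per-day
# (customer, minutes) rows into one summary record per customer.
from collections import defaultdict

def summarize(records):
    """Reduce raw rows to per-customer totals and day counts."""
    totals = defaultdict(lambda: [0.0, 0])
    for customer, minutes in records:
        totals[customer][0] += minutes
        totals[customer][1] += 1
    return {c: {"total": t, "days": n} for c, (t, n) in totals.items()}

log = [("alice", 30), ("alice", 45), ("bob", 10), ("alice", 15), ("bob", 20)]
summary = summarize(log)
```

Five raw rows become two summary rows; some detail (the day-by-day pattern) is sacrificed for a smaller, easier-to-process dataset, which is exactly the trade-off data reduction makes.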
9. Causal modeling attempts to help us understand what events or actions actually influence
others. For example, we observe that indeed the targeted consumers purchase at a higher
rate subsequent to having been targeted. Was this because the advertisements influenced
the consumers to purchase?
Supervised vs. Unsupervised methods
Unsupervised –
For example, ‘Do our customers naturally fall into different groups?’ Here no specific purpose
or target has been specified for the grouping. When there is no specific target, the data
mining is unsupervised.
Clustering, co-occurrence grouping and profiling are solved with unsupervised data mining.
Supervised –
For example, ‘Can we find groups of customers who have particularly high likelihoods of
canceling their service soon after their contracts expire?’ Here a specific target is defined: will
the customer leave when her contract expires? Segmentation is being done for a specific
reason: to act based on the likelihood of churn.
Important for supervised data mining is that there must be data on the target. Acquiring data
on the target often is a key data science investment.
Classification, regression, and causal modeling are solved with supervised data mining;
similarity matching, link prediction, and data reduction could be solved with either.