Summary

Clear summary including images


Course
Strategy Analytics

Institution
Tilburg University (UVT)

Book
Data Science for Business

The summary was updated weekly during the study period, so it covers all of the theory that was discussed.


Summarized whole book? No
Which chapters are summarized? 1-14
Uploaded on December 15, 2021
Number of pages 21
Written in 2021/2022
Type Summary

strategy
analytics

Book Title: Data Science for Business

Author(s): Foster Provost, Tom Fawcett

ISBN: 9781449374280
Edition: 1


Education
MSc. Strategic Management


Strategy Analytics

Chapter 1. Introduction: Data-Analytic Thinking

Information is now widely available on external events such as market trends, industry news, and
competitors’ movements. This broad availability of data has led to increasing interest in methods for
extracting useful information and knowledge from data: the realm of data science.

Probably the widest applications of data-mining techniques are in marketing for tasks such as
targeted marketing, online advertising, and recommendations for cross-selling.

Data mining is used for general customer relationship management to analyze customer behavior in
order to manage attrition and maximize expected customer value.

At a high level, data science is a set of fundamental principles that guide the extraction of knowledge
from data. Data mining is the extraction of knowledge from data, via technologies that incorporate
these principles.

Example: Wal-Mart and Hurricane Frances

Rather than confirming obvious effects (such as higher sales of bottled water), it would be more
valuable to discover patterns due to the hurricane that were not obvious. To do this, analysts might
examine the huge volume of Wal-Mart data from prior, similar situations (such as Hurricane Charley)
to identify unusual local demand for products. From such patterns, the company might be able to
anticipate unusual demand for products and rush stock to the stores ahead of the hurricane’s landfall.

A study by Brynjolfsson, Hitt, and Kim (2011) shows that, statistically, the more data-driven a firm
is, the more productive it is, even after controlling for a wide range of possible confounding factors.
And the differences are not small. One standard deviation higher on the data-driven decision-making
(DDD) scale is associated with a 4%–6% increase in productivity. DDD is also correlated with higher
return on assets, return on equity, asset utilization, and market value, and the relationship seems to
be causal.

The decisions of interest in this book fall mainly into two types: (1) decisions for which
“discoveries” need to be made within data, and (2) decisions that repeat, especially at massive
scale, so that decision-making can benefit from even small increases in accuracy based on data
analysis.

A predictive model abstracts away most of the complexity of the world, focusing on a particular set
of indicators that correlate in some way with a quantity of interest (who will churn, who will
purchase, who is pregnant, etc.).

Big data essentially means datasets that are too large for traditional data processing systems, and
therefore require new processing technologies.

Occasionally, big data technologies are actually used for implementing data mining techniques.
However, much more often the well-known big data technologies are used for data processing in
support of the data mining techniques and other data science activities.

Big Data 1.0: Firms are busying themselves with building the capabilities to process large data, largely
in support of their current operations, for example to improve efficiency.

In Web 1.0, businesses busied themselves with getting the basic internet technologies in place, so
that they could establish a web presence, build electronic commerce capability, and improve the
efficiency of their operations.

In Web 2.0, new systems and companies began taking advantage of the interactive nature of the
Web.

Big Data 2.0: Once firms have become capable of processing massive data in a flexible fashion, they
should begin asking: “What can I now do that I couldn’t do before, or do better than I could do
before?” This is likely to be the golden era of data science.

The prior sections suggest one of the fundamental principles of data science: data, and the capability
to extract useful knowledge from data, should be regarded as key strategic assets.

Thinking of data and data science capability as assets should lead us to the realization that they are complementary.

Sociodemographic data provide a substantial ability to model the sort of consumers that are more
likely to purchase one product or another. (This was the case at Capital One.)

Fundamental concept: Extracting useful knowledge from data to solve business problems can be
treated systematically by following a process with reasonably well-defined stages. The Cross Industry
Standard Process for Data Mining, abbreviated CRISP-DM (CRISP-DM Project, 2000), is one
codification of this process. Keeping such a process in mind provides a framework to structure our
thinking about data analytics problems.

Fundamental concept: From a large mass of data, information technology can be used to find
informative descriptive attributes of entities of interest.

Fundamental concept: If you look too hard at a set of data, you will find something, but it might not
generalize beyond the data you’re looking at. This is referred to as overfitting a dataset.
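
The overfitting idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not taken from the book: the customer IDs, the coin-flip labels, and the memorizing "model". Because the labels are random, there is no real pattern to find, yet a model that memorizes the training data still looks perfect on that data.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: customer IDs with labels that are pure coin flips,
# so there is no real pattern to learn.
customers = [(customer_id, random.choice([0, 1])) for customer_id in range(200)]
training, holdout = customers[:100], customers[100:]

# A "model" that looks too hard at the data: it memorizes every training label.
memorized = dict(training)
training_labels = [label for _, label in training]
majority_label = max(set(training_labels), key=training_labels.count)

def predict(customer_id):
    # Unseen customers fall back to the most common training label.
    return memorized.get(customer_id, majority_label)

def accuracy(rows):
    return sum(predict(cid) == label for cid, label in rows) / len(rows)

training_accuracy = accuracy(training)  # perfect by construction
holdout_accuracy = accuracy(holdout)    # on average no better than guessing
print(training_accuracy, holdout_accuracy)
```

The gap between the two numbers is the point: the memorized "pattern" does not generalize beyond the data it was found in, which is why results are usually checked on data that was held out.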

Fundamental concept: Formulating data mining solutions and evaluating the results involves thinking
carefully about the context in which they will be used.

Chapter 2. Business Problems and Data Science Solutions

Data mining is a process with fairly well-understood stages.

A critical skill in data science is the ability to decompose a data-analytics problem into pieces such
that each piece matches a known task for which tools are available.

Tasks:

1. Classification and class probability estimation attempt to predict, for each individual in a
population, which of a (small) set of classes this individual belongs to. Usually, the classes are
mutually exclusive. An example classification question would be: “Among all the customers
of MegaTelCo, which are likely to respond to a given offer?” In this example the two classes
could be called will respond and will not respond. (Whether something will happen).

2. Regression (“value estimation”) attempts to estimate or predict, for each individual, the
numerical value of some variable for that individual. An example regression question would
be: “How much will a given customer use the service?” The property (variable) to be
predicted here is service usage. (How much something will happen).
3. Similarity matching attempts to identify similar individuals based on data known about them.
Similarity matching can be used directly to find similar entities.
4. Clustering attempts to group individuals in a population together by their similarity, but not
driven by any specific purpose. An example clustering question would be: “Do our customers
form natural groups or segments?”
5. Co-occurrence grouping (also known as frequent itemset mining, association rule discovery,
and market-basket analysis) attempts to find associations between entities based on
transactions involving them.
6. Profiling (also known as behavior description) attempts to characterize the typical behavior
of an individual, group, or population. An example profiling question would be: “What is the
typical cell phone usage of this customer segment?”
7. Link prediction attempts to predict connections between data items, usually by suggesting
that a link should exist, and possibly also estimating the strength of the link. Link prediction is
common in social networking systems: “Since you and Karen share 10 friends, maybe you’d
like to be Karen’s friend?”
8. Data reduction attempts to take a large set of data and replace it with a smaller set of data
that contains much of the important information in the larger set. The smaller dataset may
be easier to deal with or to process.
9. Causal modeling attempts to help us understand what events or actions actually influence
others. For example, consider that we use predictive modeling to target advertisements to
consumers, and we observe that indeed the targeted consumers purchase at a higher rate
subsequent to having been targeted. Was this because the advertisements influenced the
consumers to purchase?
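
As an illustration of one task from the list, co-occurrence grouping (task 5) can be sketched by counting how often pairs of items appear in the same transaction. The baskets below are hypothetical, and counting pairs is only the simplest form of the idea, not the book's specific procedure.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets; each set is one transaction.
transactions = [
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"chips", "salsa"},
    {"beer", "diapers"},
]

pair_counts = Counter()
for basket in transactions:
    # Sort so that each pair is always counted under the same key.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support = share of all transactions that contain both items.
support = {pair: count / len(transactions) for pair, count in pair_counts.items()}

for pair, share in sorted(support.items(), key=lambda item: -item[1]):
    print(pair, share)
```

Pairs with high support are candidates for recommendations or cross-selling. Note that no target variable is involved, which is why this task is generally treated as unsupervised.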

Supervised vs. unsupervised methods

When there is no target, the data mining problem is referred to as unsupervised.

The learner would be given no information about the purpose of the learning but would be left to
form its own conclusions about what the examples have in common.

A supervised technique is given a specific purpose for the grouping—predicting the target. Clustering,
an unsupervised task, produces groupings based on similarities, but there is no guarantee that these
similarities are meaningful or will be useful for any particular purpose.
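
The contrast can be sketched on one small, made-up dataset. The usage figures, the churn labels, and both helper functions are assumptions for illustration. The supervised function uses the labels to choose a cut-off; the unsupervised function (a one-dimensional 2-means grouping) never sees them.

```python
# Hypothetical monthly usage (minutes) and churn labels (1 = churned).
usage = [5, 8, 12, 15, 60, 65, 70, 80]
churned = [1, 1, 1, 0, 0, 0, 0, 0]

def best_threshold(values, labels):
    """Supervised: pick the cut that best predicts the label (churn if value <= cut)."""
    best_cut, best_correct = None, -1
    for cut in sorted(set(values)):
        correct = sum((v <= cut) == bool(y) for v, y in zip(values, labels))
        if correct > best_correct:
            best_cut, best_correct = cut, correct
    return best_cut

def two_means(values, iterations=10):
    """Unsupervised: 1-D k-means with k=2; no labels are used."""
    low, high = min(values), max(values)  # initial group centers
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high

cut = best_threshold(usage, churned)
groups = two_means(usage)
print(cut, groups)
```

In this toy data the clustering puts the customer with 15 minutes in the low-usage group, while the supervised cut places that customer on the non-churn side. The groups found by similarity are sensible, but they are not guaranteed to line up with the target.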

The value for the target variable for an individual is often called the individual’s label, emphasizing
that often (not always) one must incur expense to actively label the data.

Classification, regression, and causal modeling generally are solved with supervised methods.
Similarity matching, link prediction, and data reduction could be either. Clustering, co-occurrence
grouping, and profiling generally are unsupervised.

Important distinction pertaining to mining data: the difference between (1) mining the data to find
patterns and build models, and (2) using the results of data mining.
