Summary: Data Science Methods for MADS (EBM216A05)

Uploaded on: 5 February 2021
Written in: 2020/2021

Comprehensive summary for the course Data Science Methods for MADS. It covers both the lectures and the required literature. Additional information was also used to make certain concepts easier to understand.

Written for

Institution: Rijksuniversiteit Groningen (RuG)
Programme: Marketing Analytics & Data Science
Course: Data Science Methods for MADS (EBM216A05)

Document information

Number of pages: 34
Type: Summary

DATA SCIENCE METHODS

1. MACHINE LEARNING
Machine learning
Herbert Alexander Simon: “Learning is any process by which a system improves performance from
experience. Machine learning is concerned with computer programs that automatically improve their
performance through experience.”
• Machine learning
  o Learns from the data
  o Builds predictions for new data based on past data
  o Improves performance through experience
  o More past data → better model → higher accuracy of the predicted outcomes (in general)

• Many ways in which the machine learns
  o Supervised learning
    ▪ Uses labeled data to train the model
    ▪ Dependence techniques
    ▪ Correct results are known and given as input to the model during the learning process
    ▪ Seeks to learn from a training set to predict the output for a given input; less concerned with the “true” linkage between variables
    ▪ Methods: logistic regression, SVM, k-nearest neighbours, naïve Bayes, decision trees and artificial neural networks
      ➢ Ensemble methods: meta-learning algorithms that combine multiple individual learners (e.g. random forest, gradient boosted trees, XGBoost)
      ➢ Probabilistic graphical models: use (un)directed graphs to encode the conditional dependence between random variables (e.g. Bayesian networks, Markov random fields)
      ➢ Deep neural networks: artificial neural networks with more than one hidden layer (e.g. convolutional neural networks, recurrent neural networks)

  o Unsupervised learning
    ▪ Groups data based on similar characteristics
    ▪ No labeled data
    ▪ Interdependence techniques
    ▪ Goal is to find hidden patterns in the data
    ▪ Methods: clustering (K-means, hierarchical, DBSCAN), dimensionality reduction (PCA, singular value decomposition, factor analysis)
      ➢ Topic models: discover and extract semantic structures from textual data (e.g. Latent Dirichlet Allocation (LDA))
      ➢ Representation learning: allows a system to automatically discover the representation needed for feature detection or classification from raw data (e.g. autoencoder, word embedding, network embedding)

  o Reinforcement learning
    ▪ Based on feedback: the model learns from the feedback it receives
    ▪ Feedback on the output is fed back into the machine learning model
    ▪ Thus, the learning agent interacts with the environment by taking actions and observing feedback in order to optimize a certain objective function
    ▪ Methods: multi-armed bandit, dynamic programming, SARSA, n-step temporal difference, deep Q-network

  o Ma & Sun (2020) name some other machine learning methods
    ▪ Semi-supervised learning = the output is known for only a subset of the data
    ▪ Transfer learning = adapting an existing model, which was trained on a different dataset for a different purpose, to the task at hand using the current training data set
    ▪ Active learning = only a limited number of training instances is available at first; the algorithm can acquire more to improve predictive accuracy, but acquisition is costly, so it has to determine which training instances are most important
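
The difference between learning with and without labels can be illustrated with a minimal sketch. The data, the function names (`predict_1nn`, `kmeans_1d`) and the starting centers are hypothetical and chosen only for illustration; the sketch uses a 1-nearest-neighbour classifier as the supervised example and a one-dimensional K-means as the unsupervised example.

```python
# Toy data: one feature (monthly spend); hypothetical values for illustration only.
spend = [10.0, 12.0, 11.0, 50.0, 52.0, 49.0]
labels = ["low", "low", "low", "high", "high", "high"]  # used only by the supervised model


def predict_1nn(x, train_x, train_y):
    """Supervised: return the label of the closest labeled training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def kmeans_1d(xs, centers, n_iter=10):
    """Unsupervised: group points around the given centers without using labels."""
    groups = []
    for _ in range(n_iter):
        # Assignment step: each point joins its closest center.
        groups = [[] for _ in centers]
        for x in xs:
            closest = min(range(len(centers)), key=lambda c: abs(centers[c] - x))
            groups[closest].append(x)
        # Update step: move each center to the mean of its group.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


prediction = predict_1nn(48.0, spend, labels)              # needs the labels
centers, groups = kmeans_1d(spend, centers=[0.0, 60.0])    # never sees the labels
print(prediction)
print(groups)
```

The supervised model can name its prediction because it was given the correct answers; the unsupervised model only recovers two groups and leaves their interpretation to the analyst.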

• Input of data → Machine learning model → Output according to the algorithm applied

• Iteration = a term used in machine learning for one update of the algorithm’s parameters; the number of iterations is the number of times the parameters are updated. Training a machine learning model consists of multiple iterations: the algorithm does something, checks whether it is right, and tries again. 100 iterations = 100 re-tries in which the machine corrects its errors
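
The idea of an iteration as "try, check, correct" can be sketched with gradient descent on a one-parameter model. This is an illustrative assumption, not a method named in the course material: the toy data, the learning rate and the function names are hypothetical.

```python
# Hypothetical toy data generated from y = 3 * x, so the "right" weight is 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]


def mean_squared_error(w):
    """Check how wrong the current weight is on the data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(n_iterations, learning_rate=0.01):
    w = 0.0  # initial guess
    history = [mean_squared_error(w)]
    for _ in range(n_iterations):
        # Gradient of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * gradient  # one iteration = one parameter update
        history.append(mean_squared_error(w))
    return w, history


w, history = train(n_iterations=100)
print(round(w, 3))
print(history[0], history[-1])
```

Each pass through the loop is one iteration: the error recorded in `history` shrinks from one update to the next as the weight moves towards 3.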

Difference between data mining and machine learning
Data mining
• The process of discovering patterns in a data set
• Predates machine learning
• Performed using programming methods and algorithms
• Helps to extract useful data from large amounts of raw data
• Helps us to understand the data and make it usable
• Involves manual effort to find knowledge and insights in data
• Part of the Knowledge Discovery in Databases (KDD) process
  o KDD = the non-trivial process of identifying implicit, valid, novel, potentially useful, and understandable patterns in data
  o Database → Data warehouse → Data mining → Evaluation → Knowledge

Machine learning
• Techniques that make computers learn new things without being explicitly programmed
• Based on pattern recognition, computational learning theory and artificial intelligence
• Main uses of machine learning are predictive analysis and classification
• Algorithms / models train the system to identify patterns / learn new insights
Strengths and weaknesses of machine learning methods
Strengths
• Ability to handle unstructured data (e.g. texts, images) and data of hybrid formats (e.g. a combination of texts and images).
• Ability to handle large data volumes: millions of observations are the norm.
• Flexible model structure: increases the chance of capturing the true linkage between input and output variables.
• Strong predictive performance in real-world settings.
Weaknesses
• Not easy to interpret, but 1) many ML methods have statistical foundations with interpretable parameters, 2) post-hoc interpretation techniques exist, and 3) models have been adapted for interpretation.
• Relationships are typically correlational instead of causal: the predictive focus leads to little attention to endogeneity.
• Unproven for analysing individual consumer-level heterogeneity and dynamics.

Over-fitting and under-fitting in machine learning
Overfitting = good performance on the training data, poor generalization to other data
• An overfitted model has more parameters than can be justified by the underlying data and is therefore overly complex.

The model function has too much complexity (too many parameters) to fit the true function correctly.

Underfitting = poor performance on the training data and poor generalization to other data
• An underfitted model does not have enough parameters to capture the trends in the underlying system.

The model function does not have enough complexity (too few parameters) to fit the true function correctly.
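
Both failure modes can be made visible by comparing training and test error. The sketch below is an illustration under stated assumptions: it uses k-nearest-neighbour regression as a stand-in, where flexibility is controlled by k rather than by a parameter count, and the data are deterministic toy values (true function y = x with alternating +2 / -2 "noise" on the training targets).

```python
# Deterministic toy data (hypothetical): true function y = x,
# with alternating +2 / -2 "noise" on the training targets only.
train_x = [float(i) for i in range(20)]
train_y = [x + (2.0 if i % 2 == 0 else -2.0) for i, x in enumerate(train_x)]
test_x = [i + 0.25 for i in range(19)]
test_y = list(test_x)  # noise-free targets from the true function


def knn_predict(x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(xs, ys, k):
    """Mean squared error of the k-NN predictions on the given data."""
    return sum((knn_predict(x, k) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# k = 1 memorizes the training data, k = 20 always predicts the overall mean.
for k in (1, 4, 20):
    print(k, round(mse(train_x, train_y, k), 2), round(mse(test_x, test_y, k), 2))
```

With k = 1 the training error is zero but the test error is not (overfitting); with k = 20 every prediction is the same average, so both errors are large (underfitting); k = 4 lies in between and has the lowest test error of the three on this toy data.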

Artificial Intelligence (AI) vs. machine learning vs. deep learning
Artificial intelligence = automated systems that make split-second, context-dependent decisions, generally implemented using machine learning algorithms. It is about making the machine behave in ways that would be called intelligent if a human were behaving like that. The term is often used to describe machines that mimic “cognitive” functions that humans associate with the human mind, such as learning and problem solving.
Machine learning = a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
Deep learning = representation-learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level into a representation at a higher, slightly more abstract level. The adjective “deep” in deep learning comes from the use of multiple layers in the network. Deep learning is concerned with an unbounded number of (hidden) layers of bounded size, which permits practical application and optimized implementation.
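
The phrase "composing simple but non-linear modules" can be sketched as a forward pass through a tiny network. Everything here is a hypothetical illustration: the weights are hand-picked rather than learned, ReLU is assumed as the non-linearity, and for simplicity the output module applies it too.

```python
def relu(values):
    """Simple non-linearity applied element-wise."""
    return [max(0.0, v) for v in values]


def layer(weights, biases):
    """Build one module: a linear map followed by the non-linearity."""
    def forward(inputs):
        linear = [sum(w * x for w, x in zip(row, inputs)) + b
                  for row, b in zip(weights, biases)]
        return relu(linear)
    return forward


# Hypothetical, hand-picked weights (a trained network would learn these).
hidden_1 = layer([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])
hidden_2 = layer([[1.0, 1.0], [-1.0, 1.0]], [0.0, 0.5])
output = layer([[1.0, 2.0]], [0.0])


def network(inputs):
    """Compose the modules: each one transforms the previous representation."""
    representation = inputs
    for module in (hidden_1, hidden_2, output):
        representation = module(representation)
    return representation


print(network([2.0, 1.0]))  # prints [4.5]
```

Adding more entries to the tuple of modules makes the network deeper without changing any other code, which is the sense in which the number of layers is not fixed in advance.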

AI-driven marketing industry trends
Marketing trends
• Interactive and media-rich: ML can generate insights into the (mobile) interactions between consumers and the firm
• Personalization and targeting: ML methods are propelling personalization and targeting to a new level; ML assists in context-dependent targeting
• Real-time optimization and automation: ML methods are the go-to solutions for optimization and automation
• Customer journey focus: ML can help firms to master the decision journey

Marketing practices
• Customer engagement: AI-driven innovations are rapidly reshaping engagement practices
• Search: ML can improve the relevance and robustness of search results
• Recommendation: ML is used to effectively match products and consumers
• Attribution: ML can generate accurate performance feedback and can help improve channel design and allocation

Review of machine learning literature in marketing (Ma & Sun, 2020)
SVM: one of the first ML methods introduced to marketing. SVM has been found to predict better than logit models, logistic regression and hierarchical Bayes, outperforming these traditional methods.

Traditional text mining: the text-mining process includes downloading, cleaning, information extraction, chunking, and identification of semantic relationships. It can help firms to identify response-worthy reviews.

Topic models: can be used to identify topics from consumer search queries and webpages, and can show that topics from those two sources are related. They have been applied not only to text, but also to other marketing settings where semantic structure exists (e.g. predicting purchases, user profiling).

Deep learning: most frequently used in marketing for analyzing text and images. It has been used to evaluate feature importance in predicting conversion, to investigate the impact of images on demand, to show that photos are more predictive of restaurant survival than reviews, and to extract image features to predict a person’s attractiveness.

Tree ensembles: have been used to show that personalization improved clicks to the top position and that the return to personalization varied with user history and query type.

Causal forest: recent advancements have made it possible to use ML for causal research. It has been used to investigate how information disclosure affects pharmaceutical companies’ payments to physicians.

Seller: nikkinuman (Rijksuniversiteit Groningen)

3,9

(300)

Ook beschikbaar in voordeelbundel

Maak kennis met de verkoper

nikkinuman Rijksuniversiteit Groningen

Bekijk profiel

Volgen

Verkocht

532

Lid sinds

10 jaar

Aantal volgers

363

Documenten

Laatst verkocht

9 maanden geleden

Cum laude afgestudeerde hotello & marketing student RUG

Door al mijn zelfgemaakte samenvattingen, verslagen en handige documenten ben ik cum laude afgestudeerd aan ''International Hospitality Management''. Daarnaast heb ik de Pre-Master Marketing afgerond aan de RUG zonder herkansingen en ben ik begonnen met de master MADS (Marketing Analytics & Data Science) aan de RUG Daarnaast ben ik een nieuwe weg in geslagen door de master Marketing te volgen, ook deze samenvattingen deel ik graag met jullie.

Lees meer Lees minder

3,9

300 beoordelingen

137

Recent door jou bekeken

Waarom studenten kiezen voor Stuvia

Gemaakt door medestudenten, geverifieerd door reviews

Kwaliteit die je kunt vertrouwen: geschreven door studenten die slaagden en beoordeeld door anderen die dit document gebruikten.

Niet tevreden? Kies een ander document

Geen zorgen! Je kunt voor hetzelfde geld direct een ander document kiezen dat beter past bij wat je zoekt.

Betaal zoals je wilt, start meteen met leren

Geen abonnement, geen verplichtingen. Betaal zoals je gewend bent via iDeal of creditcard en download je PDF-document meteen.

“Gekocht, gedownload en geslaagd. Zo makkelijk kan het dus zijn.”

Alisha Student

Veelgestelde vragen

Wat krijg ik als ik dit document koop?

Je krijgt een PDF, die direct beschikbaar is na je aankoop. Het gekochte document is altijd, overal en oneindig toegankelijk via je profiel.

Tevredenheidsgarantie: hoe werkt dat?

Onze tevredenheidsgarantie zorgt ervoor dat je altijd een studiedocument vindt dat goed bij je past. Je vult een formulier in en onze klantenservice regelt de rest.

Van wie koop ik deze samenvatting?

Stuvia is een marktplaats, je koop dit document dus niet van ons, maar van verkoper nikkinuman. Stuvia faciliteert de betaling aan de verkoper.

Zit ik meteen vast aan een abonnement?

Nee, je koopt alleen deze samenvatting voor €6,49. Je zit daarna nergens aan vast.

Is Stuvia te vertrouwen?

4,6 sterren op Google & Trustpilot (+1000 reviews) Afgelopen 30 dagen zijn er 40945 samenvattingen verkocht Opgericht in 2010, al 16 jaar dé plek om samenvattingen te kopen

Samenvatting Data Science Methods for MADS (EBM216A05)

Geschreven voor

Documentinformatie

Onderwerpen

Voorbeeld van de inhoud

Meer vakken binnen Rijksuniversiteit Groningen (RuG) > Marketing Analytics & Data Science

Ook beschikbaar in voordeelbundel

Maak kennis met de verkoper

Recent door jou bekeken

Waarom studenten kiezen voor Stuvia

Gemaakt door medestudenten, geverifieerd door reviews

Niet tevreden? Kies een ander document

Betaal zoals je wilt, start meteen met leren

Veelgestelde vragen

Wat krijg ik als ik dit document koop?

Tevredenheidsgarantie: hoe werkt dat?

Van wie koop ik deze samenvatting?

Zit ik meteen vast aan een abonnement?

Is Stuvia te vertrouwen?