L1: Introduction
ML: automatically learning patterns from data. A computer program is said to learn from
experience E with respect to some class of tasks T and performance measure P if its performance
at tasks in T, as measured by P, improves with experience E.
KR (knowledge representation): representing information about the world, and knowledge about
solving complex tasks, in a form that a computer can use.
Characteristics of medical data: sparsity (many zeros/NAs; sparse data requires many more
instances), missing values (result in more noise and potential biases in models), and many
assumptions needed to process the data (e.g., a diagnostic code that is not recorded is taken to
mean the patient did not have the disease; when such assumptions do not hold, the learned models
can be inaccurate)
Missing values
● Causes: variables are often not measured continuously, are often measured only for a
reason, and registration itself can be poor
● Types: MCAR (missing data is independent of the observed and unobserved data), MAR
(missing data is systematically related to the observed but not the unobserved data),
MNAR (missing data is systematically related to the unobserved data, i.e., related to
events or factors which are not measured by the researcher → highest bias risk)
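The three mechanisms above can be illustrated with a small simulation. This is a minimal sketch on made-up data: the cohort, the lab value, the age cut-off of 55 and the keep probabilities are all assumptions chosen for illustration. It compares the naive mean of the recorded values with a mean that is adjusted for the observed variable (age).

```python
import random
import statistics

random.seed(0)

# Hypothetical cohort: age is always observed, the lab value rises with age.
n = 20000
age = [random.uniform(20, 90) for _ in range(n)]
lab = [50 + 0.5 * a + random.gauss(0, 5) for a in age]
true_mean = statistics.mean(lab)

def draw_mask(keep_prob):
    """Return True where the lab value is recorded, given per-patient keep probabilities."""
    return [random.random() < p for p in keep_prob]

def naive_mean(mask):
    """Mean of the recorded lab values, ignoring why values are missing."""
    return statistics.mean(y for y, m in zip(lab, mask) if m)

def age_adjusted_mean(mask):
    """Average the recorded values within two age strata, then weight by stratum size."""
    total = 0.0
    for in_stratum in (lambda a: a < 55, lambda a: a >= 55):
        rows = [(y, m) for a, y, m in zip(age, lab, mask) if in_stratum(a)]
        recorded = [y for y, m in rows if m]
        total += statistics.mean(recorded) * len(rows) / n
    return total

masks = {
    # MCAR: every value has the same chance of being recorded.
    "MCAR": draw_mask([0.5] * n),
    # MAR: recording depends only on the observed variable (age).
    "MAR": draw_mask([0.2 if a < 55 else 0.8 for a in age]),
    # MNAR: recording depends on the (possibly unrecorded) lab value itself.
    "MNAR": draw_mask([0.2 if y < true_mean else 0.8 for y in lab]),
}

print(f"true mean: {true_mean:.2f}")
for name, mask in masks.items():
    print(f"{name}: naive {naive_mean(mask):.2f}, age-adjusted {age_adjusted_mean(mask):.2f}")
```

In this setup the naive mean is close to the truth only under MCAR. Under MAR the bias disappears once the observed variable is taken into account, whereas under MNAR the adjustment does not remove it, which is consistent with MNAR carrying the highest bias risk.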
Injecting (medical) knowledge into machine learning:
● Feature selection
○ Learning theory bounds the gap between the in-sample error and the out-of-sample error
○ Assume we have M hypotheses; the bound then also depends on M
○ Basically, given enough training examples, we can approximate the
out-of-sample error arbitrarily well by the in-sample error. The statement comes
with a probabilistic caveat: it holds only with probability at least 1−δ.
However, the failure probability δ can be chosen to be arbitrarily small. Note that
every finite hypothesis set is PAC learnable.
○ The more features, the more hypotheses we will need, and the more instances
we will need
○ Feature selection (reducing the number of features) can reduce model complexity
(e.g., for ExpertRuleFit (ERF), explainability is measured by the number of terms in the
model, the average rule length of the DT, or the proportion of expert knowledge terms in the DT)
● Feature abstraction: we reduce model complexity by creating features at a higher level of
abstraction (e.g., using coding structures/ontologies)
● Missing value handling → use the closed world assumption (CWA): the absence of a fact is
taken to mean that it is false; knowledge can help determine whether the CWA is justified
● ML techniques that embed knowledge (ExpertRuleFit)
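The learning-theory bullets under feature selection describe a bound without showing it. One standard result that fits the description (a finite set of M hypotheses, a guarantee that holds with probability at least 1−δ) is Hoeffding's inequality combined with a union bound: P[|E_in − E_out| > ε] ≤ 2M·exp(−2ε²N), where N is the number of training examples. Whether this is the exact form used in the lecture is an assumption. Solving for N gives N ≥ ln(2M/δ) / (2ε²), which the sketch below evaluates; the function name and the example values are illustrative.

```python
import math

def samples_needed(num_hypotheses, epsilon, delta):
    """Smallest N with 2 * M * exp(-2 * epsilon**2 * N) <= delta (Hoeffding + union bound)."""
    return math.ceil(math.log(2 * num_hypotheses / delta) / (2 * epsilon ** 2))

# More hypotheses (for example, because there are more features) require more instances.
for m in (10, 1000, 100000):
    print(f"M = {m:>6}: N >= {samples_needed(m, epsilon=0.05, delta=0.05)}")
```

The required N grows only logarithmically in M under this bound, but it does grow, which is the argument for keeping the hypothesis set small through feature selection.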
Module 1: Machine Learning with Prior Knowledge
L2: Design Patterns & Expert Rule Fit
There are two kinds of AI systems
● System 1: thinking fast (data driven) → statistical
○ Scalability: worse with less data → “sample inefficiency”
○ Not explainable: black box
● System 2: thinking slow (knowledge driven) → symbolic
○ Scalability: worse with more data → “combinatorial explosion”
○ Explainable because of rules
Design patterns are made by coupling elementary components
● Informed learning with prior knowledge (to improve performance of a learning algorithm)
○ Use of symbolic reasoner to improve the performance of a subsequent learning
algorithm
○ Use knowledge for data abstraction
○ Use knowledge for data completion → compensate for incompleteness of data
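The last two patterns can be sketched on a toy example. The mapping below is a small hand-written excerpt used for illustration only (E10 and E11 are ICD-10 categories for diabetes mellitus, I10 for essential hypertension); the patient records and function names are hypothetical. Completion is done here with the closed world assumption from the introduction, which is only one possible choice and is valid only if unrecorded really means absent.

```python
# Illustrative excerpt of a coding hierarchy: ICD-10 category -> higher-level concept.
ONTOLOGY = {"E10": "diabetes", "E11": "diabetes", "I10": "hypertension"}

# Hypothetical patient records: the set of codes registered for each patient.
records = {
    "p1": {"E11.9", "I10"},
    "p2": {"E10.1"},
    "p3": set(),
}

def abstract(code):
    """Data abstraction: map a detailed code to a higher-level concept."""
    category = code.split(".")[0]
    return ONTOLOGY.get(category, category)

def to_features(codes, concepts):
    """Data completion under the CWA: a concept that was never recorded becomes 0."""
    present = {abstract(code) for code in codes}
    return {concept: int(concept in present) for concept in concepts}

concepts = sorted(set(ONTOLOGY.values()))
table = {pid: to_features(codes, concepts) for pid, codes in records.items()}
for pid, row in table.items():
    print(pid, row)
```

Abstraction merges the two diabetes codes into one feature, so the learner sees fewer and denser columns; completion turns "no code" into an explicit 0 instead of a missing value.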
(Expert) RuleFit
● RuleFit: rule-based ML ensemble method that combines several base models (e.g., from
boosting methods) in order to produce one optimal predictive model → classification model
○ The ensemble takes the form of a generalized linear model
○ Two stages of RuleFit
i. Ensemble generation
● Tree ensemble generation via stochastic gradient boosting,
conversion of trees into rules, rule cleaning, inclusion of linear
terms
● Resulting ensemble consists of rules + linear terms
ii. Lasso-regularized regression
● Learning the weight coefficients by solving an optimization
problem; we regularize/penalize the weight coefficients 𝛼 and 𝛽 to
shrink the initial set of rules down to the truly informative ones.
○ Removal of uninformative rules and linear terms
● RuleFit strengths/limitations
○ Strengths: comparatively high interpretability, potential for knowledge discovery,
state-of-the-art predictive performance
○ Limitations: training data dependency, limited expert acceptance
● Three stages of Expert RuleFit
1. Expert Knowledge Acquisition
2. Combined (Rule) Ensemble Generation
3. Expert Knowledge-Aware Regularization
○ Adaptive lasso with customised penalty values v for expert knowledge
● Aims of Expert RuleFit: combine the strengths of ML and knowledge representation
○ Exploiting the “learning ability” from the data
○ Exploiting the knowledge that we already have (we extend our knowledge by
learning from data)
○ Adding knowledge for “explainability” (rules from experts are more acceptable for
experts)
■ Goal: improve performance and increase human trust in model results
(explainability)
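The effect of customised penalty values in the regularization stage can be sketched as a lasso with one penalty factor v per term. This is a simplified illustration, not the Expert RuleFit implementation: the three rules are written by hand instead of being extracted from a tree ensemble, there are no linear terms, squared loss is used, the outcome is a noise-free made-up signal, and the choice of a smaller v for the expert rule is an assumption about how the customised penalties are set.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def weighted_lasso(columns, y, lam, penalties, sweeps=100):
    """Minimise (1/2n) * sum((y - b - sum_j w_j * r_j)**2) + lam * sum_j penalties[j] * |w_j|
    by cyclic coordinate descent. Assumes every column has at least one non-zero entry."""
    n = len(y)
    weights = [0.0] * len(columns)
    intercept = sum(y) / n
    for _ in range(sweeps):
        for j, column in enumerate(columns):
            # Correlation of term j with the residual that ignores term j itself.
            rho = 0.0
            for i in range(n):
                others = sum(weights[k] * columns[k][i]
                             for k in range(len(columns)) if k != j)
                rho += column[i] * (y[i] - intercept - others)
            rho /= n
            scale = sum(value * value for value in column) / n
            weights[j] = soft_threshold(rho, lam * penalties[j]) / scale
        # The intercept is not penalised.
        intercept = sum(
            y[i] - sum(w * c[i] for w, c in zip(weights, columns)) for i in range(n)
        ) / n
    return intercept, weights

# Hypothetical patients on a regular grid of age and systolic blood pressure (sbp).
patients = [(a, s) for a in range(30, 90, 5) for s in range(100, 180, 10)]

# Hand-written rules as 0/1 features; the first one plays the role of an expert rule.
rules = [
    lambda a, s: a > 65 and s > 150,  # expert rule (hypothetical)
    lambda a, s: s > 150,             # data rule (hypothetical)
    lambda a, s: a > 50,              # data rule (hypothetical)
]
columns = [[1.0 if rule(a, s) else 0.0 for a, s in patients] for rule in rules]

# Made-up, noise-free outcome driven by the expert rule only.
y = [0.5 * r for r in columns[0]]

_, uniform = weighted_lasso(columns, y, lam=0.05, penalties=[1.0, 1.0, 1.0])
_, expert_aware = weighted_lasso(columns, y, lam=0.05, penalties=[0.1, 1.0, 1.0])
print("uniform penalties:     ", [round(w, 3) for w in uniform])
print("expert-aware penalties:", [round(w, 3) for w in expert_aware])
```

With the same λ, the uniform penalty removes every term, while the reduced penalty lets the expert rule survive with a shrunken weight and still removes the two uninformative rules. Scaling v per term is what makes the regularization "knowledge-aware" in this sketch.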
Some methods/models
● Boosting: iteratively training a base classifier on re-weighted versions of the training data,
then employing the weighted sum of the estimates of the trained sequence of classifiers. At
each iteration, weights are increased for the cases that have been misclassified most
frequently and decreased for correctly classified cases, so the emphasis is on
instances that are difficult to predict.
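The re-weighting loop can be sketched with AdaBoost and decision stumps. AdaBoost is used here as one concrete instance of boosting; the notes do not name a specific algorithm, and RuleFit itself relies on gradient boosting. The ten data points are a made-up toy set that no single threshold can separate.

```python
import math

# Toy 1-D data with labels in {-1, +1}; no single threshold separates the classes.
x = list(range(10))
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

def stump_predict(threshold, sign, value):
    """Decision stump: predict `sign` below the threshold and `-sign` otherwise."""
    return sign if value < threshold else -sign

def best_stump(weights):
    """Pick the stump with the lowest weighted training error."""
    best = None
    for threshold in [v - 0.5 for v in range(11)]:
        for sign in (1, -1):
            error = sum(w for xi, yi, w in zip(x, y, weights)
                        if stump_predict(threshold, sign, xi) != yi)
            if best is None or error < best[0]:
                best = (error, threshold, sign)
    return best

def adaboost(rounds):
    """Return the ensemble as (alpha, threshold, sign) triples and the final case weights."""
    weights = [1 / len(x)] * len(x)
    ensemble = []
    for _ in range(rounds):
        error, threshold, sign = best_stump(weights)
        # Vote of this stump; assumes 0 < error < 1, which holds on this toy data.
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, sign))
        # Misclassified cases are multiplied by exp(+alpha), correct ones by exp(-alpha).
        weights = [w * math.exp(-alpha * yi * stump_predict(threshold, sign, xi))
                   for xi, yi, w in zip(x, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble, weights

def predict(ensemble, value):
    """Sign of the weighted sum of the stump predictions."""
    score = sum(alpha * stump_predict(t, s, value) for alpha, t, s in ensemble)
    return 1 if score > 0 else -1

ensemble, final_weights = adaboost(rounds=3)
print([predict(ensemble, xi) for xi in x])
```

Each single stump misclassifies some cases, but after re-weighting the next stump concentrates on exactly those cases, and the weighted vote of three stumps fits this toy set.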