Written for
Vrije Universiteit Amsterdam (VU)
Business Analytics
Introduction to Data Science (XB_0018)
Seller: berendmarkhorst
Decision Trees
You start at the top (the root node) and work your way down until you cannot go any further: the leaf you end in gives the classification.
↳ A node can be based on categorical or numeric data.
↳ With numeric data, sort the values and compute the Gini impurity for every average between adjacent values.

Question: Which feature should be at the root node?
→ We look at how well each feature separates the data, based on the Gini impurity of its leaf nodes. A leaf is impure when it contains a mixture of patients with and without heart disease.
→ We choose the feature with the lowest Gini impurity.

We stop splitting when the Gini impurity does not decrease anymore. The node then becomes a leaf node.
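The rule for numeric data in the notes above (sort the values, compute the Gini impurity at every average between adjacent values, keep the lowest) can be sketched roughly as follows. This is a minimal illustration, not the course's reference code: it assumes binary labels encoded as 0/1, and the function names `gini`, `weighted_gini` and `best_threshold` are made up for this example.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (assumed binary encoding)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of samples labelled 1
    return 1.0 - p ** 2 - (1.0 - p) ** 2


def weighted_gini(left, right):
    """Impurity of a split: leaf impurities weighted by leaf size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(values, labels):
    """Try the average of each pair of adjacent sorted values as a threshold.

    Returns (threshold, weighted impurity) for the lowest impurity found,
    or None if all values are identical.
    """
    pairs = sorted(zip(values, labels))
    best = None
    for (a, _), (b, _) in zip(pairs, pairs[1:]):
        if a == b:
            continue  # no threshold fits between equal values
        threshold = (a + b) / 2
        left = [y for x, y in pairs if x < threshold]
        right = [y for x, y in pairs if x >= threshold]
        score = weighted_gini(left, right)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best
```

Running `best_threshold` once per numeric feature and comparing the returned impurities is one way to decide which feature goes at the root node.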
Random Forest

Random forests are made out of decision trees. Decision trees are easy to use, but in practice they are inaccurate: they work great with the training data, but they are not flexible when it comes to classifying new samples. RFs combine the simplicity of DTs with flexibility.

1) Create a bootstrapped dataset.
↳ The same entry can appear multiple times.
2) Create a decision tree using the bootstrapped dataset, but only use a random subset of variables (or columns) at each step.
3) Go back to step 1 and repeat.

This results in a wide variety of trees. This variety is what makes a random forest more effective than individual decision trees.

After running the data down all of the trees in the random forest, we see which option received more votes.

Bagging = bootstrapping the data plus using the aggregate to make a decision.

Typically, about 1/3 of the original data does not end up in the bootstrapped dataset. This is the Out-of-Bag Dataset.
The proportion of Out-of-Bag samples that were incorrectly classified is the Out-of-Bag Error.
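The bootstrapping, random variable subset, voting and Out-of-Bag ideas from these notes can be sketched with the standard library alone. This is only an illustration under assumptions: rows are referred to by index, and all function names are hypothetical helpers, not part of any library.

```python
import random
from collections import Counter


def bootstrap_indices(n_rows, rng):
    """Draw n_rows row indices with replacement (the same row can repeat)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def out_of_bag(n_rows, picked):
    """Rows that never made it into the bootstrapped dataset."""
    chosen = set(picked)
    return [i for i in range(n_rows) if i not in chosen]


def random_feature_subset(features, k, rng):
    """Pick k candidate variables (columns) to consider at one step."""
    return rng.sample(features, k)


def majority_vote(votes):
    """Aggregate the trees' predictions: the option with the most votes wins."""
    return Counter(votes).most_common(1)[0][0]


def oob_error(predictions, truths):
    """Proportion of Out-of-Bag samples that were classified incorrectly."""
    wrong = sum(1 for p, t in zip(predictions, truths) if p != t)
    return wrong / len(truths)
```

With a large dataset, the share of rows returned by `out_of_bag` tends towards roughly 37%, which matches the "about a third" rule of thumb.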
The Out-of-Bag Error gives the estimated accuracy of a Random Forest.
↳ Build a Random Forest → estimate its accuracy → change the number of variables used per step, and repeat.
Typically, we start by using the square root of the number of variables, and then try a few settings above and below that value.
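The tuning rule of thumb (start at the square root of the number of variables, then try a few settings around it) could look like this in code. It is a hedged sketch: the `spread` parameter and both function names are assumptions made for illustration, and the Out-of-Bag errors passed to `pick_best` would have to come from forests you actually built.

```python
import math


def candidate_settings(n_variables, spread=2):
    """Return the starting value and the settings to try around it.

    The start is the rounded square root of the number of variables;
    `spread` (an assumed parameter) is how far above and below to look.
    """
    start = max(1, round(math.sqrt(n_variables)))
    low = max(1, start - spread)
    high = min(n_variables, start + spread)
    return start, list(range(low, high + 1))


def pick_best(oob_error_by_setting):
    """Choose the setting whose forest had the lowest Out-of-Bag Error.

    Expects a dict mapping each tried setting to its measured error.
    """
    return min(oob_error_by_setting, key=oob_error_by_setting.get)
```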
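Missing data gets predicted by another Random Forest, based on the Proximity Matrix.
↳ Start with a guess for each missing value (e.g., the average).
↳ Refine the guess using the proximity values, and repeat this 6 or 7 times, until the missing values converge.

Distance Matrix: distance = 1 - proximity values
↳ We can draw a heatmap!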
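The proximity and distance matrices mentioned in these notes can be sketched as below. The notes do not spell out how proximity is computed; this example assumes the common definition (the fraction of trees in which two samples end up in the same leaf) and a proximity-weighted average for refining a missing numeric value. The input `leaf_ids` and all function names are hypothetical.

```python
def proximity_matrix(leaf_ids):
    """leaf_ids[t][i] is the leaf that sample i lands in for tree t (assumed input).

    Proximity of two samples = fraction of trees where they share a leaf.
    """
    n_trees = len(leaf_ids)
    n = len(leaf_ids[0])
    counts = [[0] * n for _ in range(n)]
    for leaves in leaf_ids:
        for i in range(n):
            for j in range(n):
                if leaves[i] == leaves[j]:
                    counts[i][j] += 1
    return [[c / n_trees for c in row] for row in counts]


def distance_matrix(prox):
    """distance = 1 - proximity, element by element."""
    return [[1.0 - p for p in row] for row in prox]


def weighted_guess(values, prox_row, missing_index):
    """Refine one missing numeric value as a proximity-weighted average.

    Falls back to the current guess if no other sample has any proximity.
    """
    num = 0.0
    den = 0.0
    for j, (value, weight) in enumerate(zip(values, prox_row)):
        if j == missing_index:
            continue  # do not let the current guess vote for itself
        num += weight * value
        den += weight
    return num / den if den else values[missing_index]
```

Repeating the cycle (build the forest, recompute proximities, refine the guesses) a handful of times is what the "6 or 7 times, until the missing values converge" note describes. The distance matrix is what a heatmap would be drawn from.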