A summary per chapter for the book "Machine Learning for the Quantified Self: On the Art of Learning from Sensory Data" by Mark Hoogendoorn and Burkhardt Funk.

Summary, 47 pages, 2022/2023, dated 30 December 2024. Seller: tararoopram.
Machine Learning for the Quantified Self - Book Summary


Chapter 1: Introduction
1.1 The Quantified Self
The quantified self is any individual engaged in the self-tracking of any kind of biological,
physical, behavioral, or environmental information. The self-tracking is driven by a certain goal
of the individual, with a desire to act upon the collected information.




What drives quantified selves to gather information? Three broad categories:
● Improve health (e.g. cure or manage a condition, achieve a goal, execute a treatment
plan)
● Enhance other aspects of life (maximize work performance, be mindful)
● Find new life experiences (e.g. learn to increasingly enjoy activities, learn new things).

Five-Factor-Framework of Self-Tracking Motivations
● Self-healing (help yourself to become healthy)
● Self-discipline (enjoying the rewarding aspects of the quantified self)
● Self-design (control and optimize yourself using the data)
● Self-association (enjoying being part of a community and relating yourself to that
community)
● Self-entertainment (enjoying the entertainment value of the self-tracking)

Since self-tracking data can be misused or used in a way that is not fully in the interest of a
person, it is not surprising that users state the loss of privacy as their main concern in this
context.

1.2 The Goal of this Book
Machine learning aims to automatically identify patterns in data. Specifically, the aim is to
automatically extract patterns from collected data and to enable a user to act upon the resulting
insights effectively, which in turn contributes to the goal of the user.






Unique characteristics of machine learning in the quantified self context
● Sensory data is noisy
● Many measurements are missing
● The data has a highly temporal nature
● Algorithms should enable the support of and interaction with users without a long
learning period
● We collect multiple datasets (one per user) and can learn across them

1.3 Basic Terminology
A measurement is one value for an attribute recorded at a specific time point. Measurements can
be numerical, or categorical with an ordering (ordinal) or without one (nominal). Measurements
frequently come in sequences, which we call time series. A time series is a series of
measurements in temporal order.

Machine learning is commonly divided into four types of learning problems
● Supervised learning: the machine learning task of inferring a function from labeled
training data
● Unsupervised learning: there is no target measure (or label), and the goal is to
describe the associations and patterns among the attributes
● Semi-supervised learning: a technique to learn patterns in the form of a function based
on labeled and unlabeled training examples
● Reinforcement learning: tries to find optimal actions in a given situation so as to
maximize a numerical reward that does not immediately come with the action but later in
time. The learner is not told which actions to take as in supervised learning but instead
must discover which actions yield the highest reward over time by trying them.

1.4 Basic Mathematical Notation






1.5 Overview of the Book




Chapter 2: Basics of Sensory Data
2.1 Crowdsignals Dataset
There exists a huge variety of sensors. Popular (smartphone) sensors:
● Accelerometer: measures the changes in forces acting on the phone along the x-, y-, and z-axes
● Gyroscope: measures the orientation of the phone compared to the “down” direction (the
earth’s surface) and the angular velocity
● Magnetometer: measures the x-, y-, and z-orientation relative to the earth’s magnetic
field
● GPS signal: measures your position by means of your distance to a number of satellites
of which the position is known

2.2 Converting the Raw Data to an Aggregated Data Format
In order to convert the temporal data, we first need to determine the time step size we are going
to use in our dataset. This is also referred to as the level of granularity (selecting a ∆t). The
selection of the step size depends on a variety of factors, including the task, the noise level, the
available memory and cost of storage, the available computational resources for the machine
learning process, etc. Once we have selected this step size we can create an empty dataset.

We start with the earliest time point observed in our crowdsignals measurements and generate
a first row x_{t_start}. Iteratively, we create additional rows for the following time steps by
taking the previous time step and adding our step size, e.g. x_{t_start + ∆t}.
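The row-generation step can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the function name make_time_grid and the example timestamps are hypothetical, and the sketch assumes the grid ends at the last interval that still starts on or before the latest measurement.

```python
from datetime import datetime, timedelta

def make_time_grid(timestamps, delta_t):
    """Return the start times of consecutive intervals of width delta_t.

    The first row starts at the earliest observed time point; each
    following row starts one step size later, until the latest
    measurement is covered.
    """
    if not timestamps:
        return []
    start, end = min(timestamps), max(timestamps)
    grid = []
    t = start
    while t <= end:
        grid.append(t)
        t += delta_t
    return grid

# Hypothetical measurement times (not taken from the crowdsignals data).
times = [
    datetime(2024, 1, 1, 12, 0, 0),
    datetime(2024, 1, 1, 12, 0, 7),
    datetime(2024, 1, 1, 12, 0, 21),
]
grid = make_time_grid(times, timedelta(seconds=10))
for row_start in grid:
    print(row_start.isoformat())
```

A larger ∆t gives fewer rows and smooths out noise, while a smaller ∆t keeps more detail at the cost of memory and computation.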




Each row x_t represents a summary of the values encountered in the interval defined by the time
step it was created for until the next time step. We continue until we have reached the last time
step in our dataset. Next, we should identify the columns in our dataset (our attributes) that we
want to aggregate. For the numerical values (e.g., heart rate), we create a single column for
each variable we measure while for the categorical values we create a separate column for
each possible value.

Once we have defined the entire empty dataset, we are ready to derive the values for each
attribute at each discrete time step (i.e. each row). We can aggregate numerical values by
averaging the relevant measurements or we can sum them up (e.g. when the measurements
concern a quantity) or use other descriptive metrics from statistics such as median or variance.
For categorical values we can record whether at least one measurement of that value has been
found in the interval (binary) or we can count the number of measurements that have been
found for the value (sum).
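The aggregation described above could look roughly as follows. This is a sketch under stated assumptions: measurements are given as (time in seconds, attribute, value) tuples, numerical attributes are averaged, and categorical attributes use the binary variant. The function name aggregate, the attribute names, and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def aggregate(measurements, t_start, delta_t, numeric_attrs, categorical_values):
    """Aggregate raw measurements into one row per time step.

    numeric_attrs: attribute names that are averaged per interval.
    categorical_values: dict mapping an attribute to its possible values;
    each value gets its own binary column.
    """
    # Collect the raw values per interval index and attribute.
    buckets = defaultdict(lambda: defaultdict(list))
    for t, attr, value in measurements:
        idx = int((t - t_start) // delta_t)
        buckets[idx][attr].append(value)

    n_rows = max(buckets) + 1 if buckets else 0
    table = []
    for i in range(n_rows):
        row = {"time": t_start + i * delta_t}
        for attr in numeric_attrs:
            vals = buckets[i][attr]
            # None marks an interval without any measurement (a missing value).
            row[attr] = mean(vals) if vals else None
        for attr, values in categorical_values.items():
            seen = buckets[i][attr]
            for v in values:
                row[f"{attr}_{v}"] = 1 if v in seen else 0
        table.append(row)
    return table

# Hypothetical raw measurements.
measurements = [
    (0.0, "heart_rate", 60),
    (0.4, "heart_rate", 62),
    (0.5, "label", "walking"),
    (1.2, "heart_rate", 70),
    (1.3, "label", "running"),
]
table = aggregate(
    measurements,
    t_start=0.0,
    delta_t=1.0,
    numeric_attrs=["heart_rate"],
    categorical_values={"label": ["walking", "running"]},
)
for row in table:
    print(row)
```

Replacing mean with sum or a median, or counting occurrences instead of setting a 0/1 flag, gives the other aggregation variants mentioned in the text.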

2.4 Machine Learning Tasks
Focusing on supervised learning we define two tasks:
● A classification problem, namely predicting the label (i.e. activity) based on the sensors
● A regression problem, namely predicting the heart rate based on the other sensory
values and the activity
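To make the difference between the two tasks concrete, the sketch below splits the same aggregated rows into inputs and a target in two ways. The helper split_features_target, the column names, and the values are hypothetical and only serve as an illustration.

```python
def split_features_target(rows, target):
    """Separate each row into input attributes and the value to predict."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    targets = [row[target] for row in rows]
    return features, targets

# Hypothetical aggregated rows: two sensor attributes, heart rate, and activity label.
rows = [
    {"acc_x": 0.1, "gyr_x": 0.0, "heart_rate": 62.0, "label": "sitting"},
    {"acc_x": 1.4, "gyr_x": 0.3, "heart_rate": 95.0, "label": "walking"},
]

# Classification: the target is the categorical activity label.
X_cls, y_cls = split_features_target(rows, "label")

# Regression: the target is the numerical heart rate; the label becomes an input.
X_reg, y_reg = split_features_target(rows, "heart_rate")

print(y_cls)
print(y_reg)
```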

Chapter 3: Handling Noise and Missing Values in Sensory Data
Three approaches for handling noise:
1. Detect and remove outliers from our data
2. Impute missing values in our data (could also have been outliers that were removed)
3. Transform our data to identify most important parts

3.1 Detecting Outliers
An outlier is an observation point that is distant from other observations. There are two types:
● Those caused by a measurement error, which may be removed based on
○ domain knowledge
○ visual inspection
○ checking whether performance on our machine learning tasks improves when we remove them
● Those simply caused by variability of the phenomenon that we observe or measure

Distribution-Based Models → outlier removal is based on the probability distribution of the data
● Chauvenet's criterion: identify values for an attribute that are unlikely given a single
normal distribution N(μ, σ²) to describe the data
○ Given that we have N measurements for attribute X_j, we compute the mean μ and
standard deviation σ of our data:
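A minimal sketch of how Chauvenet's criterion could be applied, under explicit assumptions: it uses the sample standard deviation, a two-sided tail probability, and the common rejection threshold 1 / (c · N) with c = 2. The function name chauvenet_outliers, the parameter c, and the example values are hypothetical and may differ from the formulation used in the book.

```python
import math
from statistics import fmean, stdev

def chauvenet_outliers(values, c=2.0):
    """Flag values that are unlikely under a single normal distribution.

    A value is flagged when the probability of observing a deviation
    from the mean at least as large as its own is below 1 / (c * N).
    """
    n = len(values)
    if n < 2:
        return [False] * n
    mu = fmean(values)
    sigma = stdev(values)  # assumption: sample standard deviation
    if sigma == 0:
        return [False] * n
    flags = []
    for x in values:
        z = abs(x - mu) / sigma
        # Two-sided tail probability of a standard normal distribution.
        p = math.erfc(z / math.sqrt(2))
        flags.append(p < 1.0 / (c * n))
    return flags

# Hypothetical heart rate measurements with one implausible reading.
heart_rate = [60, 62, 61, 63, 59, 61, 60, 62, 61, 180]
flags = chauvenet_outliers(heart_rate)
print([x for x, is_outlier in zip(heart_rate, flags) if is_outlier])
```

Because the outlier itself inflates μ and σ, a single pass can miss less extreme values. Whether to repeat the test after removal is a judgment call that depends on the data.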



