ISYE 6501 Midterm 1 Exam Questions &
Answers
Rows - ANSWER-Data points: each row of a data table is one data point (observation)
Columns - ANSWER-Attributes (features/predictors) of each data point. The response/outcome column is
the 'answer' for each data point.
Structured Data - ANSWER-Quantitative, Categorical, Binary, Unrelated, Time Series
Unstructured Data - ANSWER-Text
Support Vector Machine (SVM) - ANSWER-Supervised machine learning algorithm used for both
classification and regression problems.
Mostly used for classification: each data item is plotted as a point in n-dimensional space (n is
the number of features you have), with the value of each feature being the value of a particular
coordinate.
Then you classify by finding the hyperplane that separates the 2 classes with the widest margin. Support
vectors are the observations closest to that hyperplane; they determine where the separating hyperplane
(a line in 2 dimensions) lies.
What do you want to find with an SVM model? - ANSWER-Find values of a0, a1, ..., am that classify
the points correctly and give the maximum gap (margin) between the two parallel lines.
What should the sum for the green points in an SVM model be? - ANSWER-For each green point j, the sum
a1*x1j + ... + am*xmj + a0 should be greater than or equal to 1
What should the sum for the red points in an SVM model be? - ANSWER-For each red point j, the sum
a1*x1j + ... + am*xmj + a0 should be less than or equal to -1
What should the combined condition for green and red points be? - ANSWER-For every point j,
(a1*x1j + ... + am*xmj + a0) * yj should be greater than or equal to 1, because yj is 1 for green
and -1 for red.
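The combined condition can be checked point by point. The following is a minimal sketch, assuming the constraint (a1*x1j + ... + am*xmj + a0) * yj >= 1 with yj = 1 for green and yj = -1 for red; the function names, toy points, and coefficients are hypothetical and only illustrate the check.

```python
def margin_value(a, a0, x, y):
    """Return (a . x + a0) * y for one point; y is +1 (green) or -1 (red)."""
    return (sum(ai * xi for ai, xi in zip(a, x)) + a0) * y


def satisfies_margin(a, a0, points, labels):
    """True if every point meets the hard-margin condition (value >= 1)."""
    return all(margin_value(a, a0, x, y) >= 1 for x, y in zip(points, labels))


# Hypothetical toy data: two green points (+1) and two red points (-1).
points = [(3.0, 3.0), (4.0, 2.0), (0.0, 0.0), (1.0, 0.0)]
labels = [1, 1, -1, -1]
a, a0 = (1.0, 1.0), -3.5  # hypothetical coefficients a1, a2 and intercept a0

values = [margin_value(a, a0, x, y) for x, y in zip(points, labels)]
ok = satisfies_margin(a, a0, points, labels)
```

Multiplying by yj flips the sign for red points, which is why one inequality covers both colors.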
First principal component - ANSWER-PCA -- a linear combination of the original predictor variables that
captures the maximum variance in the data set. It gives the direction of highest variability in the
data. The larger the variability captured by the first component, the more information it captures.
No other component can have variability higher than the first principal component.
Equivalently, it is the line that minimizes the sum of squared perpendicular distances between the data
points and the line.
Second principal component - ANSWER-PCA -- also a linear combination of the original predictors; it
captures the most of the remaining variance in the data set and is uncorrelated with the first principal
component (Z1). In other words, the correlation between the first and second components is zero.
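The two properties above (the first component has the largest variance, and the two components are uncorrelated) can be verified on a small example. This is a minimal sketch for two predictors only, using the closed-form eigenvector of the 2x2 covariance matrix; the function names and the toy data are hypothetical, and the eigenvector formula assumes the two predictors have nonzero covariance.

```python
import math


def covariance(p, q):
    """Sample covariance of two equal-length lists."""
    n = len(p)
    mp, mq = sum(p) / n, sum(q) / n
    return sum((a - mp) * (b - mq) for a, b in zip(p, q)) / (n - 1)


def pca_2d(xs, ys):
    """Return the scores (z1, z2) of the two principal components."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cx = [x - mx for x in xs]  # centered predictors
    cy = [y - my for y in ys]
    sxx, syy, sxy = covariance(xs, xs), covariance(ys, ys), covariance(xs, ys)
    # Largest eigenvalue of [[sxx, sxy], [sxy, syy]] in closed form.
    lam1 = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    # Eigenvector for lam1 (assumes sxy != 0), normalized to unit length.
    vx, vy = sxy, lam1 - sxx
    norm = math.hypot(vx, vy)
    v1 = (vx / norm, vy / norm)
    v2 = (-v1[1], v1[0])  # perpendicular direction gives the second component
    z1 = [u * v1[0] + v * v1[1] for u, v in zip(cx, cy)]
    z2 = [u * v2[0] + v * v2[1] for u, v in zip(cx, cy)]
    return z1, z2


# Hypothetical toy data with two correlated predictors.
xs = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
ys = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]
z1, z2 = pca_2d(xs, ys)

var1, var2 = covariance(z1, z1), covariance(z2, z2)
cross = covariance(z1, z2)  # should be zero up to rounding
```

Here var1 is at least var2, var1 + var2 equals the total variance of the original predictors, and cross is zero up to floating-point rounding.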
What if it's not possible to separate green and red points in an SVM model? - ANSWER-Use a soft
classifier. In a soft classification context, we might add an extra multiplier for each type of error,
with a larger penalty the less we want to accept misclassifying that type of point.
Soft Classifier - ANSWER-Accounts for errors in SVM classification by trading off minimizing the errors
we make against maximizing the margin.
To trade off between them, we pick a lambda value and minimize a combination of error and margin:
(total error) + lambda * (a1^2 + ... + am^2). As lambda gets large, the margin term gets large, so the
importance of a large margin outweighs avoiding mistakes in classifying known data points. As lambda
drops toward 0, avoiding mistakes becomes more important than a large margin.
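The trade-off can be written as a single quantity to minimize. This is a minimal sketch, assuming the error for point j is max(0, 1 - (a1*x1j + ... + am*xmj + a0) * yj) and the margin term is lambda times the sum of squared coefficients; the card does not spell out the error term, so that form is an assumption, and the function name and toy data are hypothetical.

```python
def soft_margin_objective(a, a0, points, labels, lam):
    """Total error plus lam times the sum of squared coefficients."""
    error = 0.0
    for x, y in zip(points, labels):
        value = (sum(ai * xi for ai, xi in zip(a, x)) + a0) * y
        error += max(0.0, 1.0 - value)  # zero when the point clears the margin
    margin_term = lam * sum(ai * ai for ai in a)
    return error + margin_term


# Hypothetical toy data: the second green point is on the wrong side of the line.
points = [(3.0, 3.0), (1.0, 1.0), (0.0, 0.0)]
labels = [1, 1, -1]
a, a0 = (1.0, 1.0), -3.5  # hypothetical coefficients

only_error = soft_margin_objective(a, a0, points, labels, lam=0.0)
with_margin = soft_margin_objective(a, a0, points, labels, lam=1.0)
```

With lam=0.0 only the classification error counts; raising lam makes the squared-coefficient term dominate, which pushes the minimizer toward smaller coefficients and therefore a wider margin.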
Should you scale your data in an SVM model? - ANSWER-Yes, so that the orders of magnitude of the
attributes are approximately the same.
Data must be in a bounded range.
Common scaling: data between 0 and 1
a. Scale factor by factor
b. Scale linearly
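Scaling each factor linearly to the range 0 to 1 can be sketched as follows. This is a minimal illustration, not a prescribed implementation; the function names and the toy table are hypothetical, and a constant column is mapped to the lower bound to avoid dividing by zero.

```python
def scale_column(values, lo=0.0, hi=1.0):
    """Linearly rescale one column so its minimum maps to lo and its maximum to hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:  # constant column: avoid dividing by zero
        return [lo for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]


def scale_table(rows):
    """Scale each column (factor) of a row-oriented table independently."""
    columns = list(zip(*rows))
    scaled_columns = [scale_column(col) for col in columns]
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical table: column 1 is in the tens of thousands, column 2 in single digits.
rows = [[20000.0, 1.0], [60000.0, 3.0], [100000.0, 5.0]]
scaled = scale_table(rows)
```

After scaling, both columns span the same range, so neither dominates the distance calculation just because of its units.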
How do you tell which coefficients hold value in an SVM model? - ANSWER-If there is a coefficient
whose value is very close to 0, the corresponding attribute is probably not relevant for
classification.
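Flagging near-zero coefficients can be sketched as a simple filter. The cutoff below is an arbitrary assumption (the card only says "very close to 0"), the attribute names and coefficient values are hypothetical, and the comparison is only meaningful when the attributes were scaled to similar ranges first.

```python
def negligible_attributes(coefficients, names, tol=0.05):
    """Names of attributes whose coefficient magnitude is below tol."""
    return [name for name, c in zip(names, coefficients) if abs(c) < tol]


names = ["income", "age", "zip_digit"]  # hypothetical attribute names
coefs = [0.82, -0.41, 0.003]            # hypothetical fitted coefficients
flagged = negligible_attributes(coefs, names)
```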