Exam (elaborations)

ISYE 6501 Midterm 1 Updated 2024/2025 Verified 100%

Pages: 14
Grade: A+
Uploaded on: 03-09-2024
Written in: 2024/2025

Does SVM work the same for multiple dimensions? - Yes.

Can classification questions be answered as probabilities in SVM? - Yes.

K Nearest Neighbor Algorithm - To find the class of a new point, pick the k closest points to the new one; the new point's class is the most common class among those k neighbors.
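The K Nearest Neighbor steps can be sketched in a few lines of Python. This is a minimal illustration, assuming straight-line (Euclidean) distance and a simple majority vote; the function name `knn_classify` and the sample points are hypothetical and not taken from the course material.

```python
import math
from collections import Counter

def knn_classify(new_point, points, labels, k=3):
    """Return the most common label among the k points closest to new_point."""
    # Sort the known points by Euclidean distance to the new point (assumed metric).
    nearest = sorted(zip(points, labels),
                     key=lambda pl: math.dist(pl[0], new_point))[:k]
    # Majority vote among the k nearest neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical data: two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 8.0)]
labels = ["green", "green", "green", "red", "red", "red"]
prediction = knn_classify((2.0, 2.0), points, labels, k=3)
```

Here the three nearest neighbors of (2.0, 2.0) are all green, so the vote is unanimous. With an even k or more than two classes, ties are possible and would need an explicit tie-breaking rule.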

Institution: ISYE 6501
Course: ISYE 6501

Preview of the content

ISYE 6501 Midterm 1
Does an SVM classifier need to be a straight line? - No. SVM can be generalized using kernel
methods that allow for nonlinear classifiers. Software packages provide a kernel SVM function that can
solve for both linear and nonlinear classifiers.
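One way to see why this works: points that no straight line can separate may become separable after a transformation. The sketch below uses an explicit feature map (squared distance from the origin), which is the kind of transformation kernel methods apply implicitly. The function names `lift` and `classify`, the threshold of 1.0, and the sample points are all illustrative assumptions.

```python
def lift(point):
    """Map a 2-D point (x1, x2) to one new feature: squared distance from the origin."""
    x1, x2 = point
    return x1 * x1 + x2 * x2

def classify(point, threshold=1.0):
    """A linear rule on the lifted feature: +1 inside the circle, -1 outside."""
    return 1 if lift(point) <= threshold else -1

# Hypothetical data: one class clustered near the origin, the other surrounding it.
inner = [(0.1, 0.2), (-0.3, 0.4), (0.5, -0.5)]
outer = [(1.5, 0.0), (-1.2, 1.1), (0.0, -2.0)]
labels = [classify(p) for p in inner + outer]
```

In the original two dimensions the boundary is a circle, but on the lifted feature it is a single threshold, which is a linear classifier.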

Should you scale your data in an SVM model? - Yes, so that the orders of magnitude of all factors are
approximately the same.

Scaling puts the data in a bounded range.

Common scaling: data between 0 and 1

a. Scale factor by factor

b. Scale linearly
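Scaling linearly to the range 0 to 1, factor by factor, can be sketched as follows. The helper names `min_max_scale` and `scale_rows` and the sample values are hypothetical; mapping a constant column to 0.0 is an arbitrary choice made here to avoid dividing by zero.

```python
def min_max_scale(column):
    """Linearly rescale one factor (column) so its values lie between 0 and 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant factor carries no information; map it to 0.0 (assumed convention).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def scale_rows(rows):
    """Scale a table factor by factor, where each row is one data point."""
    columns = list(zip(*rows))
    scaled_columns = [min_max_scale(list(col)) for col in columns]
    return [list(row) for row in zip(*scaled_columns)]

# Hypothetical factors with very different magnitudes: income and household size.
rows = [[20000.0, 1.0], [60000.0, 3.0], [100000.0, 5.0]]
scaled = scale_rows(rows)
```

After scaling, both factors span the same range, so neither dominates the distance or margin calculation simply because of its units.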



What if it's not possible to separate green and red points in an SVM model? - Use a soft classifier.
In a soft classification context, we might add an extra multiplier for each type of error: the less we
want to accept misclassifying a type of point, the larger the penalty for that type.

Rows - Data points; each row of a data table is one data point



Columns - Attributes (features, predictors) of each data point. The response (outcome) column holds the 'answer' for each data point



Structured Data - Data that can be stored in a structured way: quantitative, categorical, binary, unrelated, time series



Unstructured Data - Data that is not easily described and stored, e.g., text



Support Vector Machine - Supervised machine learning algorithm used for both classification and
regression problems.

Mostly used in classification problems by plotting each data item as a point in n-dimensional space (n is
the number of features you have), with the value of each feature being the value of a particular
coordinate.

Then you classify by finding the hyperplane (or line) that best separates the 2 classes. Support vectors are
the individual observations closest to that hyperplane; they determine where it lies.

What do you want to find with an SVM model? - Find values of a0, a1, ..., am that classify the
points correctly and have the maximum gap or margin between the parallel lines.



What should the sum be for the green points in an SVM model? - For each green point j, the sum
a1x1j + a2x2j + ... + amxmj + a0 should be greater than or equal to 1



What should the sum be for the red points in an SVM model? - For each red point j, the sum
a1x1j + a2x2j + ... + amxmj + a0 should be less than or equal to -1



What should the sum be for all green and red points? - For every point j, the product
(a1x1j + a2x2j + ... + amxmj + a0)yj should be greater than or equal to 1, because yj is 1 for green and -1 for red.
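These conditions can be checked directly for a candidate set of coefficients. The sketch below assumes the usual geometry, in which the gap between the two parallel lines (where the sum equals 1 and -1) is 2 divided by the length of the coefficient vector. The function names, coefficients, and points are hypothetical.

```python
import math

def score(a0, a, x):
    """Compute a0 + a1*x1 + ... + am*xm for one data point x."""
    return a0 + sum(ai * xi for ai, xi in zip(a, x))

def satisfies_margin(a0, a, points, labels):
    """True if score * y >= 1 for every point (y is 1 for green, -1 for red)."""
    return all(score(a0, a, x) * y >= 1 for x, y in zip(points, labels))

def margin_width(a):
    """Distance between the parallel lines score = 1 and score = -1."""
    return 2 / math.sqrt(sum(ai * ai for ai in a))

# Hypothetical data: two green points (y = 1) and two red points (y = -1).
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]
ok = satisfies_margin(-2.0, (1.0, 1.0), points, labels)
width = margin_width((1.0, 1.0))
```

Among all coefficient sets that satisfy the check, the SVM looks for the one with the largest width, which is the same as the smallest sum of squared coefficients.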



First principal component - PCA -- a linear combination of the original predictor variables that
captures the maximum variance in the data set. It determines the direction of highest variability in the
data. The larger the variability captured by the first component, the more information it captures.
No other component can have higher variability than the first principal component.

It also minimizes the sum of squared distances between the data points and the line along that direction.



Second principal component - PCA -- also a linear combination of the original predictors. It
captures as much of the remaining variance in the data set as possible and is uncorrelated with the first
principal component (Z¹). In other words, the correlation between the first and second components is zero.
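The two properties described above (most variance first, zero correlation between components) can be verified on a small example. This is a minimal sketch for two predictors only, using the closed-form eigenvalues of the 2x2 covariance matrix; the function names `pca_2d` and `project` and the data are hypothetical.

```python
import math

def pca_2d(points):
    """Return (variance1, direction1), (variance2, direction2), centered points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(p[0] - mean_x, p[1] - mean_y) for p in points]
    # Sample covariance matrix entries.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] are the component variances.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(max(half_trace ** 2 - (sxx * syy - sxy ** 2), 0.0))
    var1, var2 = half_trace + root, half_trace - root
    # Eigenvector for the larger eigenvalue gives the first component's direction.
    if abs(sxy) > 1e-12:
        direction1 = (var1 - syy, sxy)
    else:
        direction1 = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    length = math.hypot(direction1[0], direction1[1])
    direction1 = (direction1[0] / length, direction1[1] / length)
    direction2 = (-direction1[1], direction1[0])  # perpendicular to direction1
    return (var1, direction1), (var2, direction2), centered

def project(centered, direction):
    """Component scores: each centered point projected onto a direction."""
    return [x * direction[0] + y * direction[1] for x, y in centered]

# Hypothetical data lying close to a diagonal line.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
first, second, centered = pca_2d(data)
z1 = project(centered, first[1])
z2 = project(centered, second[1])
```

For this data the first direction points roughly along the diagonal and carries almost all of the variance, while the scores `z1` and `z2` have zero covariance up to rounding error.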



Soft Classifier - Accounts for errors in SVM classification by trading off minimizing the errors we make
against maximizing the margin.

To trade off between them, we pick a lambda value and minimize a combination of error and margin. As
lambda gets large, the margin term gets large, so the importance of a large margin outweighs avoiding
mistakes in classifying known data points.

As lambda drops toward 0, the importance of minimizing mistakes outweighs having a large margin.
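The trade-off can be written as a single quantity to minimize. The sketch below assumes the common form in which the error for a point is max(0, 1 - (sum)(yj)) and the margin term is lambda times the sum of squared coefficients; the source does not spell out the exact formula, so treat this form, the function names, and the data as assumptions.

```python
def hinge_error(a0, a, x, y):
    """Error for one point: 0 if it is on the correct side of its margin line."""
    return max(0.0, 1.0 - y * (a0 + sum(ai * xi for ai, xi in zip(a, x))))

def soft_margin_objective(a0, a, points, labels, lam):
    """Total error plus lambda times the margin term (sum of squared coefficients)."""
    total_error = sum(hinge_error(a0, a, x, y) for x, y in zip(points, labels))
    margin_term = sum(ai * ai for ai in a)  # smaller value means a wider margin
    return total_error + lam * margin_term

# Hypothetical data: the red point at (1, 1) sits exactly on the separating line.
points = [(2.0, 2.0), (0.0, 0.0), (1.0, 1.0)]
labels = [1, -1, -1]
errors_only = soft_margin_objective(-2.0, (1.0, 1.0), points, labels, lam=0.0)
balanced = soft_margin_objective(-2.0, (1.0, 1.0), points, labels, lam=0.5)
```

With lambda at 0 only the classification errors count; raising lambda makes large coefficients (a narrow margin) more expensive, so the minimizer accepts more errors in exchange for a wider margin.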




How should you find which coefficients matter in an SVM model? - If there is a coefficient
whose value is very close to 0, the corresponding attribute is probably not relevant for
classification.

Contains: Questions and answers

Seller: ACADEMICMATERIALS, City University New York