Machine Learning Final Exam with perfect answers 2024
According to the formal definition of machine learning, "A computer program is said to
learn from _______ with respect to some class of _______ and performance measure
P, if its performance at ________, as measured by P, improves with __________".
correct answers experience (E), tasks (T), tasks (T), experience (E)
Sparsity and Density correct answers Data sparsity and density describe the degree to
which data exists for each feature of all observations. So if a table is 80% dense, then
20% of the data is missing or undefined. This means it is 20% sparse.
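As a rough illustration of the density and sparsity figures above, the sketch below counts filled cells in a small table. The table contents and the `density` helper are hypothetical, and missing values are assumed to be stored as `None`.

```python
def density(table):
    """Fraction of cells in a list-of-rows table that hold a value (not None)."""
    total = sum(len(row) for row in table)
    present = sum(1 for row in table for cell in row if cell is not None)
    return present / total

# Hypothetical table: 10 cells, 2 of them missing.
table = [
    [1.0, None, 3.0, 4.0, 5.0],
    [6.0, 7.0, None, 9.0, 10.0],
]

d = density(table)   # share of cells that are filled
sparsity = 1 - d     # share of cells that are missing
print(f"density={d:.0%}, sparsity={sparsity:.0%}")
```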
Example of a regression problem correct answers Can I determine a country's currency
exchange rate based on its GDP?
Example of a classification problem correct answers Is an email message spam or not?
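Min-Max Normalization correct answers Rescales the values of a feature to a fixed range,
typically [0, 1]. It does not make the mean zero (that is z-score standardization).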
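A minimal sketch of min-max normalization, assuming the usual definition x' = (x - min) / (max - min), which maps the smallest value to 0 and the largest to 1. The function name and sample values are illustrative only.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: no spread to rescale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_normalize([10, 20, 30, 50])
mean_scaled = sum(scaled) / len(scaled)
print(scaled, mean_scaled)
```

Note that the mean of the rescaled values is generally not zero; only the range is fixed.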
Decimal Scaling correct answers Transform the data by moving the decimal points of
values of feature F. The number of decimal points moved depends on the maximum
absolute value of F.
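One common way to write decimal scaling is v' = v / 10^j, where j is the smallest integer that brings the largest absolute value below 1. The sketch below assumes that definition; the helper name and data are hypothetical.

```python
def decimal_scale(values):
    """Divide by the smallest power of 10 that makes max(|v|) fall below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values], j

# Maximum absolute value is 986, so the decimal point moves 3 places.
scaled, j = decimal_scale([-986, 120, 45])
print(j, scaled)
```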
Discretization correct answers Transformation of continuous data into discrete
counterparts, a process similar to binning. It is done because some algorithms work
with only one kind of value, continuous or discrete.
Reasons to discretize data (3) correct answers • Some algorithms require categorical or
binary features.
• Can improve visualization.
• Can reduce categories for features with many values.
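Discretization can be sketched with equal-width binning, which is only one of several possible schemes (equal-frequency binning is another). The function name, the ages, and the choice of three bins are assumptions made for illustration.

```python
def equal_width_bins(values, n_bins):
    """Assign each continuous value to one of n_bins equally wide intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

ages = [3, 17, 25, 42, 58, 90]
bins = equal_width_bins(ages, 3)  # bin index per age, 0 = lowest interval
print(bins)
```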
Dummy Variables correct answers Transformation of discrete features into a series of
continuous features (usually with binary values).
Dummy variables are useful because: correct answers • Some algorithms only work
with continuous features.
• It is a useful approach for dealing with missing data.
• It is a necessary pre-step in dimensionality reduction such as with PCA (Principal
Component Analysis).
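Dummy variables are often produced by one-hot encoding: one binary column per category. The sketch below is a plain-Python illustration with hypothetical color values, not a reference to any particular library's encoder.

```python
def one_hot(values):
    """Return the sorted category list and one binary row per input value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

categories, rows = one_hot(["red", "green", "blue", "green"])
for value_row in rows:
    print(dict(zip(categories, value_row)))
```

Each row contains exactly one 1, marking the category of that instance.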
Examples of nominal (or discrete) features correct answers Color, shape, angle and
number of edges
Examples of continuous features correct answers temperature, height, weight, age
Error Due to Bias correct answers Errors made as a result of the specified learning
algorithm.
Error Due to Variance correct answers Errors made as a result of the sampling of the
training data.
Dimensionality correct answers Represents the number of features in the dataset.
Knowledge Discovery Process (6) correct answers 1. Data Collection
2. Data Exploration
3. Data Preparation
4. Modeling
5. Validation & Interpretation
6. Knowledge
Classes with very unequal frequencies correct answers Imbalanced Data
What are Histograms good for? (5) correct answers • What kind of population
distribution does the data come from?
• Where is the data located?
• How spread out is the data?
• Is the data symmetric or skewed?
• Are there outliers in the data?
What are Box Plots good for? (4) correct answers • Is a feature significant?
• Does the location differ between subgroups?
• Does the variation differ between subgroups?
• Are there outliers in the data?
What are Odds Plots good for? correct answers • Is a feature significant?
• How do feature values affect the probability of occurrence?
• Is there a threshold for the effect?
What are scatter plots good for? correct answers • Is a feature significant?
• How do features interact?
• Are there outliers in the data?
Instance (4) correct answers • Thing to be classified, associated or clustered.
• Individual independent example of target concept.
• Described by a set of attributes or features.
• A set of instances is the input to the learning scheme.
Feature correct answers • Property or characteristic of an instance.
• Features can be discrete or continuous.
Class correct answers The attribute or feature that is described by the other features
within an instance.
Stratified Random Sampling correct answers • Sample from the data such that the original
class distribution is maintained.
• Works for imbalanced data but is often inefficient.
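A minimal sketch of stratified random sampling, assuming each class is sampled separately at the same fraction so that the class proportions carry over to the sample. The function name, the 10/90 spam/ham split, and the fixed seed are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(instances, labels, fraction, seed=0):
    """Draw the same fraction from every class, keeping at least one per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(instances, labels):
        by_class[y].append(x)
    sample = []
    for y, members in by_class.items():
        k = max(1, round(len(members) * fraction))
        sample.extend((x, y) for x in rng.sample(members, k))
    return sample

# Hypothetical imbalanced data: 10% spam, 90% ham.
labels = ["spam"] * 10 + ["ham"] * 90
instances = list(range(100))
sample = stratified_sample(instances, labels, 0.2)
n_spam = sum(1 for _, y in sample if y == "spam")
n_ham = sum(1 for _, y in sample if y == "ham")
print(n_spam, n_ham)
```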
Sampling (2) correct answers • Sampling is typically used because, sometimes, it is too
expensive or time-consuming to use all of the available data to generate a model.
• The sample subset should permit the construction of a model representative of a
model generated from the entire data set.
Systematic Sampling (2) correct answers Select instances from an ordered sampling
window. This involves systematically selecting every kth element from the window,
where k = N/n, N is the population size and n is the sample size.
It risks interacting with regularities (periodic patterns) in the data.
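The every-kth-element rule can be sketched as follows, assuming n is no larger than N, that k is rounded down to an integer, and that the starting offset is chosen at random within the first k elements (a common convention, not stated in the notes).

```python
import random

def systematic_sample(population, n, seed=0):
    """Pick every k-th element, k = N // n, from a random start in [0, k)."""
    N = len(population)
    k = N // n
    start = random.Random(seed).randrange(k)
    return [population[start + i * k] for i in range(n)]

# N = 100, n = 10, so k = 10.
sample = systematic_sample(list(range(100)), 10)
print(sample)
```

If the ordered data repeats with a period related to k, every selected element can fall at the same point of the cycle, which is the risk noted above.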
Simple Random Sampling (3) correct answers Shuffle the data and then select
examples.
• Avoids regularities in the data.
• May be problematic with imbalanced data.
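The shuffle-then-select description maps directly onto a few lines of code. This is a sketch with a hypothetical helper name and a fixed seed for repeatability.

```python
import random

def simple_random_sample(data, n, seed=0):
    """Shuffle a copy of the data, then take the first n examples."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n]

data = list(range(20))
picked = simple_random_sample(data, 3)
print(picked)
```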
sampling with replacement correct answers Once an element has been included in the
sample, it is returned to the population. A previously selected element can be selected
again and therefore may appear in the sample more than once.
sampling without replacement correct answers Once an element has been included in
the sample, it is removed from the population and cannot be selected a second time.
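The difference between the two schemes can be illustrated with Python's standard `random` module: `choices` draws with replacement and `sample` draws without. The population and sample sizes below are arbitrary.

```python
import random

rng = random.Random(0)
population = list(range(10))

# With replacement: 15 draws from 10 elements, so repeats are unavoidable.
with_replacement = rng.choices(population, k=15)

# Without replacement: every selected element is distinct.
without_replacement = rng.sample(population, k=5)

print(with_replacement)
print(without_replacement)
```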
Cluster Sampling (3) correct answers Group or segment data based on similarities,
then randomly select from each group.
• Efficient.
• Typically not optimal.
Match-based Imputation (2) correct answers • Impute based on similar instances with
non-missing values.
• Two variants: hot-deck and cold-deck.
Cold-deck Imputation correct answers Fill in the missing value using similar instances
from another dataset.
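A rough sketch of cold-deck imputation, assuming "similar" means the closest value on a single matching feature and that missing values are stored as `None`. The function name, the records, and the donor dataset are hypothetical. Drawing donors from the same dataset instead would give the hot-deck variant.

```python
def cold_deck_impute(records, donors, match_key, target_key):
    """Fill missing target values from the closest donor in another dataset."""
    filled = []
    for rec in records:
        rec = dict(rec)  # copy so the input records stay unchanged
        if rec[target_key] is None:
            donor = min(donors, key=lambda d: abs(d[match_key] - rec[match_key]))
            rec[target_key] = donor[target_key]
        filled.append(rec)
    return filled

# Hypothetical data: 'survey' has a gap, 'reference' is the external donor set.
survey = [{"age": 31, "income": None}, {"age": 58, "income": 72000}]
reference = [{"age": 30, "income": 48000}, {"age": 60, "income": 70000}]
filled = cold_deck_impute(survey, reference, "age", "income")
print(filled)
```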