ISYE 6501 Midterm 1 Questions with 100% correct answers | verified | latest update 2024

  • June 22, 2024
ISYE 6501 Midterm 1

True or false: In a regression tree, every leaf of the tree has a different regression model
that might use different attributes, have different coefficients, etc. - ANS-True
- Each leaf's individual model is tailored to the subset of data points that follow all of the
branches leading to the leaf.
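As a minimal sketch of this idea, assume a tree with a single split on one attribute and a separate least-squares line fitted in each leaf. The names `fit_line`, `fit_one_split_tree`, and `predict`, the threshold, and the data are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x on one leaf's points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_one_split_tree(xs, ys, threshold):
    """Split the data once on x, then fit a separate model in each leaf."""
    left = [(x, y) for x, y in zip(xs, ys) if x < threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x >= threshold]
    return {
        "threshold": threshold,
        "left": fit_line(*zip(*left)),
        "right": fit_line(*zip(*right)),
    }


def predict(tree, x):
    """Follow the branch for x, then apply that leaf's own coefficients."""
    intercept, slope = tree["left"] if x < tree["threshold"] else tree["right"]
    return intercept + slope * x


# Made-up data: rising trend below x = 4, falling trend from x = 4 onward.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 2, 4, 6, 20, 19, 18, 17]
tree = fit_one_split_tree(xs, ys, threshold=4)
```

Here the left leaf ends up with a positive slope and the right leaf with a negative one, so a single regression over all the data would fit neither region well.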

True or false: Tree-based approaches can be used for other models besides regression.
- ANS-True
- For example, a classification tree might have a different SVM or KNN model at each
leaf. It might even use SVM at some leaves and KNN at others (though that's probably
rare).

A common rule of thumb is to stop branching if a leaf would contain less than 5% of the
data points. Why not keep branching and allow models to find very close fits to each
very small subset of data? - ANS-Fitting to very small subsets of data will cause
overfitting.
- With too few data points, the models will fit to random patterns as well as real ones.
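The 5% rule of thumb can be sketched as a simple check before accepting a split. The function name `can_split` and the `min_fraction` parameter are hypothetical; real tree libraries expose this idea under their own parameter names.

```python
def can_split(n_left, n_right, n_total, min_fraction=0.05):
    """Allow a split only if both resulting leaves keep at least
    min_fraction of all data points."""
    min_points = min_fraction * n_total
    return n_left >= min_points and n_right >= min_points


# With 1000 data points, each leaf needs at least 50 points.
allowed = can_split(60, 140, 1000)
rejected = can_split(30, 170, 1000)
```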

True or False: When using a random forest model, it's easy to interpret how its results
are determined. - ANS-False
- Unlike a model like regression where we can show the result as a simple linear
combination of each attribute times its regression coefficient, in a random forest model
there are so many different trees used simultaneously that it's difficult to interpret
exactly how any factor or factors affect the result.

A logistic regression model can be especially useful when the response... - ANS-
- ...is a probability (a number between zero and one).
- ...is binary (either zero or one).
- Logistic regression can be useful in either situation.
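A short sketch of why both cases work: the logistic function maps any linear combination of attributes to a number strictly between zero and one, which can be reported as a probability or thresholded into a binary answer. The intercept, coefficients, and the 0.5 threshold below are made-up values for illustration.

```python
import math


def logistic(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(intercept, coefficients, attributes):
    """Logistic regression prediction: logistic of the linear combination."""
    z = intercept + sum(c * a for c, a in zip(coefficients, attributes))
    return logistic(z)


# Hypothetical fitted model; here z = -1.0 + 0.5 * 2.0 + 0.25 * 0.0 = 0.
p = predict_probability(-1.0, [0.5, 0.25], [2.0, 0.0])
label = 1 if p >= 0.5 else 0  # binary response via an assumed 0.5 threshold
```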

A model is built to determine whether data points belong to a category or not. A "true
negative" result is: - ANS-A data point that is not in the category, and the model
correctly says so.
- 'True' and 'false' refer to whether the model is correct or not, and 'positive' and
'negative' refer to whether the model says the point is in the category.
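The naming rule above can be sketched as a small counting function. The labels are assumed to be 1 (in the category) and 0 (not in the category), and the data is made up.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for 0/1 labels.

    First letter: was the model correct (T) or not (F)?
    Second letter: did the model say positive (P) or negative (N)?
    """
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        correct = "T" if a == p else "F"
        said = "P" if p == 1 else "N"
        counts[correct + said] += 1
    return counts


actual = [1, 0, 0, 1, 0, 1]
predicted = [1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual, predicted)
# counts == {"TP": 2, "FP": 1, "TN": 2, "FN": 1}
```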

True or False: The most useful classification models are the ones that correctly classify
the highest fraction of data points. - ANS-False
- Sometimes the cost of a false positive is so high that it's worth accepting more false
negatives, or vice versa.
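To make the trade-off concrete, here is a sketch comparing two hypothetical models on the same 100 data points. The confusion counts and the per-error costs (a false positive assumed to cost 100 times as much as a false negative) are invented for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of data points classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def total_cost(fp, fn, cost_fp, cost_fn):
    """Total cost of errors when the two error types are priced differently."""
    return fp * cost_fp + fn * cost_fn


# Model A: more accurate, but makes a few false positives.
acc_a = accuracy(tp=45, tn=46, fp=4, fn=5)
cost_a = total_cost(fp=4, fn=5, cost_fp=100, cost_fn=1)

# Model B: less accurate, but never makes a false positive.
acc_b = accuracy(tp=30, tn=50, fp=0, fn=20)
cost_b = total_cost(fp=0, fn=20, cost_fp=100, cost_fn=1)
```

Under these assumed costs, model B is the better choice even though model A classifies more points correctly.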

Adjusted R-squared/Adjusted R2 - ANS-Variant of R2 that encourages simpler models
by penalizing the use of too many variables.
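A brief sketch using the standard formula, adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the number of data points and k the number of predictors. The R2 values and model sizes below are made up.

```python
def adjusted_r2(r2, n_points, n_predictors):
    """Adjusted R-squared: shrinks R-squared as more predictors are used."""
    return 1.0 - (1.0 - r2) * (n_points - 1) / (n_points - n_predictors - 1)


# Hypothetical comparison on 50 data points:
small_model = adjusted_r2(0.90, n_points=50, n_predictors=5)
large_model = adjusted_r2(0.91, n_points=50, n_predictors=20)
```

The larger model has the higher raw R2 (0.91 versus 0.90) but the lower adjusted R2, so the simpler model is preferred by this measure.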

Akaike information criterion (AIC) - ANS-Model selection technique that trades off
model fit against model complexity. When comparing models, the model with lower
AIC is preferred. Generally penalizes complexity less than BIC.

Algorithm - ANS-Step-by-step procedure designed to carry out a task.

Area under curve/AUC - ANS-Area under the ROC curve; an estimate of the
classification model's accuracy. Also called concordance index
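The "concordance index" name reflects one way to compute AUC without drawing the curve: it is the fraction of (positive, negative) pairs in which the positive point gets the higher score, with ties counted as half. A minimal sketch with made-up scores:

```python
def auc_by_concordance(positive_scores, negative_scores):
    """AUC as the share of positive/negative pairs ranked correctly."""
    concordant = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5  # ties count as half
    return concordant / (len(positive_scores) * len(negative_scores))


# Model scores for points that truly are / are not in the category.
auc = auc_by_concordance([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
# 8 of the 9 pairs are ranked correctly.
```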

ARIMA - ANS-Autoregressive integrated moving average.

Attribute - ANS-A characteristic or measurement - for example, a person's height or the
color of a car. Generally interchangeable with "feature", and often with "covariate" or
"predictor". In the standard tabular format, a column of data.

Autoregression - ANS-Regression technique using past values of time series data as
predictors of future values.
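As a sketch, the simplest case is an order-1 autoregression, x_t = c + phi * x_(t-1), fitted by least squares with each value regressed on the one before it. The function name `fit_ar1` and the series are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_(t-1)."""
    xs = series[:-1]  # past values (predictors)
    ys = series[1:]   # the values that followed them (responses)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


# Made-up series in which each value is exactly double the previous one.
series = [1.0, 2.0, 4.0, 8.0, 16.0]
c, phi = fit_ar1(series)
forecast = c + phi * series[-1]  # one-step-ahead prediction
```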

Autoregressive integrated moving average (ARIMA) - ANS-Time series model that uses
differences between observations when data is nonstationary. Also called Box-Jenkins.
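The "differences between observations" step can be sketched directly. In this made-up series the values trend upward, but differencing twice leaves a constant sequence.

```python
def difference(series):
    """Replace each observation with its change from the previous one."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


series = [3, 5, 8, 12, 17]
first_diff = difference(series)       # [2, 3, 4, 5]
second_diff = difference(first_diff)  # [1, 1, 1]
```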

Bayes' theorem/Bayes' rule - ANS-Fundamental rule of conditional probability:
P(A|B) = P(B|A) P(A) / P(B)
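A small worked example, dividing P(B|A) P(A) by P(B), where P(B) is expanded with the law of total probability. The scenario and all three input probabilities are made up: A is "has the condition" and B is "test is positive".

```python
def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """P(A|B) = P(B|A) P(A) / P(B), with
    P(B) = P(B|A) P(A) + P(B|not A) P(not A)."""
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    return p_b_given_a * prior_a / p_b


# Assumed inputs: 1% prior, 95% true positive rate, 5% false positive rate.
posterior = bayes_posterior(prior_a=0.01, p_b_given_a=0.95, p_b_given_not_a=0.05)
# posterior is about 0.161
```

Even with a fairly accurate test, the posterior stays low because the prior P(A) is small.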

Bayesian information criterion (BIC) - ANS-Model selection technique that trades off
model fit and model complexity. When comparing models, the model with lower BIC is
preferred. Generally penalizes complexity more than AIC.
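A sketch of the two criteria using their standard definitions, AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where k is the number of fitted parameters, n the number of data points, and ln(L) the maximized log-likelihood. The values plugged in below are made up.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_points):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_points) - 2 * log_likelihood


# Hypothetical fitted model: 3 parameters, 100 data points, ln(L) = -120.
model_aic = aic(-120.0, n_params=3)
model_bic = bic(-120.0, n_params=3, n_points=100)
```

The per-parameter penalty is 2 for AIC and ln(n) for BIC, so BIC penalizes complexity more whenever n is at least 8.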

Bayesian regression - ANS-Regression model that incorporates estimates of how
coefficients and error are distributed.
