Exam

BIDA 630 Data Analytics Final Questions With Answers Graded A+ Assured Success

Grade

A+

Published on

15-09-2024

Written in

2024/2025


Institution

BIDA 630 Data Analytics

Course

BIDA 630 Data Analytics

Content preview

BIDA 630 Data Analytics Final

True or false: Bar charts are useful for comparing a single statistic (e.g. average, count,
percentage) across groups. The height of the bar represents the value of the statistic, and
different bars correspond to different groups. - ✔️✔️True

Assume that you are running the Neural platform in JMP Pro. Which penalty method should
be chosen if your data set has a large number of X variables, and you think that a few of
them contribute more than others to the predictive ability of the model? [ No penalty ;
Absolute ; Logarithmic ; Squared ] - ✔️✔️Absolute

To obtain an honest estimate of future classification error, we use the classification
matrix that is computed from ________. - ✔️✔️Validation data

Identify whether the task required is supervised or unsupervised learning: Predicting
whether a company will go bankrupt based on comparing its financial data to those of
similar bankrupt and nonbankrupt firms. - ✔️✔️Supervised learning; the outcome
(bankrupt or not) is known for the firms used in the comparison

Identify whether the task required is supervised or unsupervised learning: Printing of
custom discount coupons at the conclusion of a grocery store checkout based on what
you just bought and what others have bought previously. - ✔️✔️Unsupervised learning;
outcomes are unknown

True or false: The test data are used to build models, or to further tweak the model or
improve its fit. - ✔️✔️False

_____________ is used for assessing the performance of the final chosen model on
new data - ✔️✔️The test data partition
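
The roles of the partitions in the questions above can be illustrated with a minimal sketch. The function name `partition`, the 60/20/20 split, and the fixed seed are assumptions made for this example, not values taken from the course or from JMP Pro.

```python
import random

def partition(rows, train=0.6, valid=0.2, seed=0):
    """Shuffle rows and split them into training, validation, and test sets.

    The proportions and the seed are illustrative assumptions.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    training = shuffled[:n_train]                     # used to fit models
    validation = shuffled[n_train:n_train + n_valid]  # used to compare and tune models
    test = shuffled[n_train + n_valid:]               # used once, on the final chosen model
    return training, validation, test

train_set, valid_set, test_set = partition(range(100))
print(len(train_set), len(valid_set), len(test_set))  # 60 20 20
```

Only the training and validation sets feed back into model building; the test set is held out so that its error estimate is not influenced by any tuning.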

When a model is fit to training data, zero error with those data is not necessarily good.
This special case is called ______. - ✔️✔️Overfitting

Which of the following are the most popular visualization tools in JMP Pro? -
✔️✔️Graph Builder, Fit Y by X, Distribution

Scatter plots play an important role in prediction. The next step can be developing a model.
Scatter plots provide information about relationships (linear or non-linear) between
variables. The variables in a scatter plot are ________. - ✔️✔️Numerical

In a box plot, the box includes 50% of the data, the horizontal line represents
(i)____________, the top and bottom of the box represent (ii)________, respectively. -
✔️✔️(i) the Median (50th percentile); (ii) the 75th and 25th percentiles

In JMP a diamond is displayed in the box, where the center of the diamond is
_________. - ✔️✔️The mean
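
As a rough illustration of the box plot quantities above, the following sketch computes the quartiles, median, and mean for a small made-up sample. The data are hypothetical, and the `"inclusive"` interpolation method is an assumption; JMP may compute percentiles with a different rule, so values can differ slightly on other samples.

```python
import statistics

# Hypothetical sample with one large outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 100]

# Quartiles: bottom of the box, line inside the box, top of the box.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Center of the diamond in a JMP box plot.
mean = statistics.mean(data)

print("25th percentile:", q1)
print("median:", median)
print("75th percentile:", q3)
print("mean:", round(mean, 2))
```

In this sample the outlier pulls the mean above the 75th percentile while the median stays at the middle value, which is why the diamond and the median line can sit far apart.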

The density ellipsoid in a scatterplot matrix is a good graphical indicator of the correlation
between two variables. The ellipsoid collapses diagonally as the correlation between the
two variables approaches either 1 or -1. The ellipsoid is more circular if the two variables
are more correlated. (TRUE or FALSE?) - ✔️✔️False; The ellipsoid is more circular
(less diagonally oriented) if the two variables are less correlated

True or False: Sensitivity and Specificity are plotted on an ROC Curve. - ✔️✔️True
(the curve plots sensitivity against 1 - specificity)
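
A minimal sketch of how points on an ROC curve can be obtained: for each cutoff, classify records with a score at or above the cutoff as positive, then compute sensitivity and specificity. The function name `roc_points`, the labels, the scores, and the cutoffs are all hypothetical.

```python
def roc_points(actual, scores, thresholds):
    """Return (1 - specificity, sensitivity) pairs, one per threshold."""
    points = []
    for t in thresholds:
        tp = sum(1 for a, s in zip(actual, scores) if a == 1 and s >= t)
        fn = sum(1 for a, s in zip(actual, scores) if a == 1 and s < t)
        tn = sum(1 for a, s in zip(actual, scores) if a == 0 and s < t)
        fp = sum(1 for a, s in zip(actual, scores) if a == 0 and s >= t)
        sensitivity = tp / (tp + fn)   # share of actual positives caught
        specificity = tn / (tn + fp)   # share of actual negatives rejected
        points.append((1 - specificity, sensitivity))
    return points

# Hypothetical actual classes (1 = positive) and predicted probabilities.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

points = roc_points(actual, scores, thresholds=[0.0, 0.5, 1.0])
for x, y in points:
    print(round(x, 3), round(y, 3))
```

Lowering the cutoff moves along the curve from the lower-left corner (nothing classified as positive) toward the upper-right corner (everything classified as positive).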

How do you calculate the error rate on a classification matrix (Confusion Chart)? -
✔️✔️Total incorrect predictions / total predictions
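
The error-rate formula above can be sketched as follows, assuming a square classification matrix whose rows are actual classes and whose columns are predicted classes, so that correct predictions lie on the diagonal. The counts are made up for illustration.

```python
def error_rate(matrix):
    """Total incorrect predictions divided by total predictions."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total

# Hypothetical counts: rows = actual class, columns = predicted class.
confusion = [
    [50, 10],  # actual class 0: 50 correct, 10 misclassified
    [5, 35],   # actual class 1: 5 misclassified, 35 correct
]

rate = error_rate(confusion)
print(rate)  # 0.15
```

Accuracy is the complement of this value, here 1 - 0.15 = 0.85.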

The 'portion' of a lift curve represents what percent of the data, and how is this portion
sorted? - ✔️✔️The portion (e.g. portion = 0.2) represents the top 20% of the data,
sorted in descending order of predicted probability of the class of interest

The lift of a lift curve represents what? - ✔️✔️The lift value (e.g. lift = 2.2) is the
ratio of the rate of the class of interest within the selected top portion to its rate in
the data overall (lift = 2.2 means you are 2.2 times more likely to find that class in the
top portion than in a random sample of the same size)
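
A small sketch of the lift calculation described above: rank records by predicted probability, take the top portion, and divide the rate of the class of interest in that portion by its overall rate. The function name `lift_at` and the data are hypothetical.

```python
def lift_at(actual, scores, portion):
    """Lift of the top `portion` of records, ranked by descending score."""
    ranked = [a for _, a in sorted(zip(scores, actual),
                                   key=lambda pair: pair[0], reverse=True)]
    n_top = max(1, round(len(ranked) * portion))
    top_rate = sum(ranked[:n_top]) / n_top      # rate within the top portion
    overall_rate = sum(ranked) / len(ranked)    # rate across all records
    return top_rate / overall_rate

# Hypothetical data: 1 marks the class of interest.
actual = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]

lift_20 = lift_at(actual, scores, portion=0.2)
print(lift_20)  # 2.5
```

Here both records in the top 20% belong to the class of interest (rate 1.0) against an overall rate of 0.4, giving a lift of 2.5. At portion = 1.0 the lift is always 1, since the "top portion" is the whole data set.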

True or false: Principal Component Analysis (PCA) is intended for use with quantitative
values - ✔️✔️True

True or false: The idea of PCA is to find a linear combination of the two variables that
contains most, even if not all, of the information, so that this new variable can replace
the two original variables. - ✔️✔️True
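
For the two-variable case described above, the first principal component can be computed in closed form from the 2x2 covariance matrix. The sketch below is an illustration under assumptions: the function name `first_component` and the data are made up, and it works on the covariance matrix of the raw values rather than on standardized values.

```python
import math
import statistics

def first_component(x, y):
    """Weights, variance share, and scores of the first principal component."""
    mx, my = statistics.mean(x), statistics.mean(y)
    var_x = statistics.variance(x)
    var_y = statistics.variance(y)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

    # Largest eigenvalue of [[var_x, cov_xy], [cov_xy, var_y]].
    half_gap = math.hypot((var_x - var_y) / 2, cov_xy)
    largest = (var_x + var_y) / 2 + half_gap
    share = largest / (var_x + var_y)  # fraction of total variance retained

    # Direction of the major axis gives the linear-combination weights.
    theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    weights = (math.cos(theta), math.sin(theta))
    scores = [weights[0] * (a - mx) + weights[1] * (b - my)
              for a, b in zip(x, y)]
    return weights, share, scores

# Hypothetical, strongly correlated variables.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

weights, share, scores = first_component(x, y)
print("weights:", tuple(round(w, 3) for w in weights))
print("share of variance:", round(share, 4))
```

Because the two variables are almost perfectly correlated, the single new variable `scores` retains nearly all of the total variance and could stand in for both originals.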

How would the correlations change if we normalized the data first? - ✔️✔️Correlations
will not change, since computing a correlation already normalizes the data
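
The claim that correlations do not change under normalization can be checked with a short sketch. The helper names `pearson` and `standardize` and the data are assumptions for this example; normalization here means subtracting the mean and dividing by the standard deviation.

```python
import math
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m, s = statistics.mean(values), statistics.stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical variables measured on very different scales.
income = [32000, 45000, 51000, 60000, 75000, 120000]
age = [23, 31, 35, 42, 50, 61]

r_raw = pearson(income, age)
r_std = pearson(standardize(income), standardize(age))
print(round(r_raw, 6), round(r_std, 6))
```

The two printed values agree up to floating-point rounding, because the correlation formula already centers each variable and divides by its spread.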

True or false: Pairs of variables that have a very strong (positive or negative) correlation
contain duplicative information. Therefore, we want to omit the variables that are
strongly correlated to others to avoid multicollinearity (when fitting models). - ✔️✔️True

??? Which of the following are the methods that we use for dimension reduction? (4
correct answers) - ✔️✔️Removing independent variables from the model ; random
selection of variables for model development ; logistic regression ; removing one of the

Document information

Published on: September 15, 2024
Number of pages: 5
Written in: 2024/2025
Type: Exam
Contains: Questions and answers

Seller

Brainarium, Delaware State University
