Data Mining Final Exam 2024 Questions with 100% correct Answers
In a data set with 22 variables, if 13% of the values, randomly spread across observations, are missing
(blank), what is the probable percent of complete and usable observations?
4.67
(1 − 0.13)^22 = 0.0467 or 4.67%.
In a data set with 20 variables, if 8% of the values, randomly spread across observations, are missing
(blank), what is the probable percent of complete and usable observations?
(1 − 0.08)^20 = 0.1887 or 18.87%.
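Both calculations follow one pattern: if each value is missing independently with probability p, a row with k variables is complete with probability (1 − p)^k. A minimal R sketch of that arithmetic, where the helper name complete_share is made up for illustration:

```r
# Share of rows expected to have no missing values, assuming each of the
# k values in a row is missing independently with probability p.
# `complete_share` is a hypothetical helper name, not a base R function.
complete_share <- function(p, k) (1 - p)^k

share_22 <- complete_share(0.13, 22)  # 22 variables, 13% missing
share_20 <- complete_share(0.08, 20)  # 20 variables, 8% missing

# Express both shares as percentages rounded to two decimals
round(100 * c(share_22, share_20), 2)
```

The independence assumption matters: if missing values cluster in the same rows, more rows stay complete than this formula suggests.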
When performing an analysis, one technique is called RFM. Which of the following is not reflective of
RFM?
Relevancy;
RFM is the acronym for recency, frequency, and monetary value.
Mark wants to have a better understanding of his client base at the credit union. To do so, he is
running a report to show loan amount approval with corresponding credit scores. He realized the data
set is quite large and wants to create categories by grouping. To do this, he needs to do all the
following except
Remove 20% of the data to create a training set;
Binning takes the entire data set, identifies the variable to be grouped, splits its values into smaller
non-overlapping groups, and labels each bin accordingly.
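As a rough illustration of binning in R, the sketch below groups credit scores with cut(). The scores, bin edges, and labels are invented for the example and are not taken from the exam.

```r
# Hypothetical credit scores; the values and bin edges are illustrative only.
credit_score <- c(580, 640, 700, 720, 790, 810)

# cut() assigns every value to exactly one labeled, non-overlapping interval.
# right = FALSE makes intervals closed on the left: [300, 600), [600, 700), ...
score_bin <- cut(credit_score,
                 breaks = c(300, 600, 700, 800, 851),
                 labels = c("Poor", "Fair", "Good", "Excellent"),
                 right = FALSE)

table(score_bin)  # count of observations per bin
```

Note that the whole data set is binned; nothing is held out, which is why creating a training set is the odd one out.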
In R, Mary wants to understand the number of days between rain events in Chicago, IL. What function
is used to find the number of days between today and January 1, 2026?
difftime
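A short sketch of the base R function difftime() applied to two dates. The start date here is an arbitrary fixed example so the result is reproducible; for "today" one would pass Sys.Date() instead.

```r
# Days between two fixed dates; the start date is an arbitrary example
# (the question uses "today", which would be Sys.Date()).
start <- as.Date("2025-01-01")
end   <- as.Date("2026-01-01")

# difftime(later, earlier) returns a "difftime" object in the requested units
gap <- difftime(end, start, units = "days")
gap_days <- as.numeric(gap)  # plain number of days
gap_days
```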
Using R, what is the formula that will allow for the weekday function to display the day of the week
for November 15, 2020?
>weekdays(as.Date("2020-11-15"))
Using R, what function is used to evaluate the categories of a variable in order to create the dummy
variables?
ifelse
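For a two-category variable, one possible use of ifelse() looks like this. The data are hypothetical, and the choice of "F" as the category coded 1 is arbitrary.

```r
# Hypothetical employee data for illustration.
sex <- c("F", "M", "M", "F")

# ifelse(test, yes, no) is vectorized: 1 where the test is TRUE, else 0.
female <- ifelse(sex == "F", 1, 0)
female
```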
Michael is examining a data set and trying to determine which category he can transform into a
dummy variable. Of the four variables, Employee Number, Pay Rate, Hire Date, and Sex, which is the
best fit to use a dummy variable?
Sex
Marcus wants to include the month of the year in the analysis as categories. How many dummy
variables will be needed?
11;
With k = 12 categories, k − 1 = 12 − 1 = 11 dummy variables are needed.
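The k − 1 rule can be checked in R with model.matrix(), which builds indicator columns from a factor. This is only a sketch: it uses the built-in month.abb constant and R's default treatment coding, with January as the reference category.

```r
# One observation per calendar month; month.abb is a built-in constant.
month <- factor(month.abb, levels = month.abb)

# model.matrix() adds an intercept plus k - 1 indicator columns, using the
# first level (Jan) as the reference category.
X <- model.matrix(~ month)
dummies <- X[, -1]  # drop the intercept column

n_dummies <- ncol(dummies)
n_dummies
```

The reference month has no column of its own; it is the row where all 11 dummies equal 0.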
Kara is reviewing categories where a series of numbers represent the type of loan. She would prefer
the actual name of the loan be retained when running her analysis. Using Microsoft Excel, what
function will allow Kara to retain the category name instead of recording them in numbers?
IF function;
An IF function allows for statements to be crafted to transform numbers into category names.
What data preparation technique is Maeve using when she extracts a payroll data set into two
separate files, one for hourly employees and one for salaried employees?
Subsetting
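A minimal sketch of subsetting in R, assuming a small made-up payroll table with a pay_type column (both the data and the column names are hypothetical):

```r
# Hypothetical payroll data for illustration.
payroll <- data.frame(
  employee = c("A", "B", "C", "D"),
  pay_type = c("hourly", "salary", "hourly", "salary"),
  stringsAsFactors = FALSE
)

# subset() keeps only the rows that satisfy the condition
hourly <- subset(payroll, pay_type == "hourly")
salary <- subset(payroll, pay_type == "salary")

# Every row lands in exactly one of the two subsets
nrow(hourly) + nrow(salary) == nrow(payroll)
```

Each subset could then be written to its own file, for example with write.csv().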
Regression analysis captures the relationship between only two distinct variables.
False;
Regression analysis captures the relationship between two or more variables.
The response variable is the outcome variable, whereas the predictor variables are the inputs.
True
R^2 in linear regression is the correlation coefficient.
False;
R^2 in linear regression is the coefficient of determination, which is the proportion of the sample
variation in the response variable that is explained by the sample regression equation. The correlation
coefficient measures the strength and direction of the linear relationship between two variables.
R^2, also known as the coefficient of determination, quantifies the proportion of the sample variation
in the predictor variables (x_i) that is explained by the sample regression equation.
False;
R^2 quantifies the proportion of the sample variation in the response variable y that is explained by
the sample regression equation, not the variation in the predictor variables.
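To see the difference between the two quantities, the sketch below fits a simple regression on a small made-up data set and computes R^2 three ways. In the special case of a single predictor, R^2 equals the squared correlation coefficient, but the two are still different quantities.

```r
# Small made-up data set for illustration.
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)

fit <- lm(y ~ x)
r_squared <- summary(fit)$r.squared  # coefficient of determination
r <- cor(x, y)                       # correlation coefficient

# Same quantity from its definition: 1 - SSE / SST, both computed on y.
sse <- sum(residuals(fit)^2)
sst <- sum((y - mean(y))^2)

c(r_squared = r_squared, r = r, from_definition = 1 - sse / sst)
```

Both sums of squares are computed on the response y, which is why R^2 describes variation in y rather than in the predictors.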