MINE 272 final exam review (updated): questions & answers
Data analysis process - correct answer ✔✔1. Define problem
2. Data extraction
3. Data preparation
4. Data exploration and visualization
5. Predictive modelling, validation, and testing
Data preparation - correct answer ✔✔Step 3, cleaning and transformation
Transform input to new space to make problem easier to solve
Ex. PCA, scaling, log values
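Two of the transformations named above (scaling and log values) can be sketched as follows. This is only a minimal illustration: the function names and the sample values are made up for the example and are not from the course material.

```python
import math

def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def log_transform(values):
    """Take base-10 logs of strictly positive values to compress a skewed range."""
    return [math.log10(v) for v in values]

# Hypothetical, made-up measurements spanning several orders of magnitude.
raw = [1.0, 10.0, 100.0, 1000.0]
scaled = min_max_scale(raw)
logged = log_transform(raw)
```

After the log transform the values are evenly spaced, which is the sense in which a transformation can make a problem easier to solve.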
Data exploration and visualization - correct answer ✔✔Step 4, pattern recognition
Present data graphically and statistically to find patterns and connections
Ex. summarizing, grouping, relations between attributes
Predictive modelling, validation, and testing - correct answer ✔✔Step 5, selecting and evaluating model
Choose suitable statistical model for encoding relations, model then validated and tested (partition data)
Ex. regression, classification, clustering
Dataset - correct answer ✔✔Collection of data composed of attributes and a response variable
Should be divided into testing and training datasets
Training set - correct answer ✔✔Tune parameters of adaptive model, inputs can be limited
Testing set - correct answer ✔✔Use trained model to predict target values of other points
A good predictor is well-generalized through pattern recognition
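Dividing a dataset into training and testing sets can be sketched as below. The 25% test fraction, the fixed seed, and the toy rows are assumptions chosen for illustration, not values from the course.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = rows[:]             # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset: (attribute, response) pairs.
rows = [(x, 2 * x + 1) for x in range(20)]
train, test = train_test_split(rows)
```

The model's parameters would be tuned on `train` only; `test` is kept aside to check how well the trained model generalizes to points it has not seen.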
Supervised learning - correct answer ✔✔Training data consists of inputs, each with a corresponding
target vector as output
Ex. classification, regression (linear, logistic)
Classification - correct answer ✔✔Assigning input vectors to a finite number of discrete categories
(supervised learning)
Ex. rock type from an image, digit recognition
Regression - correct answer ✔✔Explains influence of variables on outcome of another variable
Output of a regression model is one or more continuous variables (supervised learning)
Ex. predicting real CO concentration, yield of chemical production plant (inputs T and P)
Linear regression - correct answer ✔✔Predict function returns prediction y (output) of a given x value
(input), based on linear regression model, analytical technique
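A predict function of this kind can be sketched for the single-input case, assuming the line is fitted by ordinary least squares (the notes do not say which fitting method the course uses). The names `fit_line` and `predict` and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Return the prediction y (output) for a given x value (input)."""
    return slope * x + intercept

# Made-up training data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
y_new = predict(4.0, slope, intercept)
```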
Logistic regression - correct answer ✔✔Output is not a continuous variable but a binary outcome
Based on the logistic function f(y) = e^y / (1 + e^y) for -infinity < y < +infinity
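The logistic function f(y) = e^y / (1 + e^y) maps any real y to a value between 0 and 1, which can then be thresholded into a binary outcome. A minimal sketch follows; the 0.5 threshold and the name `classify` are assumptions for illustration.

```python
import math

def logistic(y):
    """f(y) = e^y / (1 + e^y), evaluated in a form that avoids overflow."""
    if y >= 0:
        # Algebraically identical: divide numerator and denominator by e^y.
        return 1.0 / (1.0 + math.exp(-y))
    ey = math.exp(y)
    return ey / (1.0 + ey)

def classify(y, threshold=0.5):
    """Turn the logistic output into a binary outcome (1 or 0)."""
    return 1 if logistic(y) >= threshold else 0

p_mid = logistic(0.0)   # the midpoint of the curve
```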
Unsupervised learning - correct answer ✔✔Training data consists of inputs only, with no corresponding
target vectors as output
Ex. clustering (K-means clustering), density estimation
Clustering - correct answer ✔✔Discovers groups of similar examples within data, any number of
attributes (unsupervised learning)
Ex. ore grades of orebody
Density estimation - correct answer ✔✔Determine distribution of data within given input space
(unsupervised learning)
Converting high dimension data to 1D or 2D is a related unsupervised task (dimensionality reduction, used for visualization)
K-means clustering - correct answer ✔✔Finds k clusters in dataset with any number of attributes
(unsupervised learning)
Centroid = cluster mean
1. Define k, guess initial locations of the k centroids
2. Compute the distance between each point and each centroid (assign each point to its closest centroid)
3. Recompute the centroid of each cluster
4. Repeat 2 and 3 until convergence (centroids stationary or oscillating)
WSS (within-cluster sum of squares) - if going from k to k+1 clusters does not noticeably reduce WSS, k clusters is suitable
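The four steps and the WSS measure can be sketched in plain Python as below. This is an illustrative sketch, not the course's implementation: the helper names, the squared Euclidean distance, the iteration cap, and the sample points are all assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Centroid = cluster mean, computed per attribute."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def assign(points, centroids):
    """Step 2: put each point in the cluster of its closest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def k_means(points, centroids, max_iter=100):
    """Steps 2-4: alternate assignment and centroid update until stationary."""
    centroids = list(centroids)          # step 1: initial guesses from caller
    for _ in range(max_iter):            # cap guards against oscillation
        clusters = assign(points, centroids)
        # Step 3: recompute centroids; an empty cluster keeps its old centroid.
        updated = [mean_point(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:         # step 4: centroids stationary
            break
        centroids = updated
    return centroids, assign(points, centroids)

def wss(centroids, clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(squared_distance(p, c)
               for c, cluster in zip(centroids, clusters) for p in cluster)

# Made-up 2-attribute data with two visibly separate groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
total_wss = wss(centroids, clusters)
```

Running `k_means` for several values of k and comparing the resulting WSS values is one way to apply the rule above: once adding a cluster barely lowers WSS, the smaller k is usually enough.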
Time series - correct answer ✔✔Observations taken over time