Data Science Final Exam | 246 Questions with 100% Verified Answers
ICLR - ✔ ✔ Stands for the International Conference on
Learning Representations, ...
AI or machine learning - ✔ ✔ The main conferences are NIPS (now NeurIPS) and ICML,
along with conferences like AISTATS, UAI, and KDD, which is more data
science-oriented, ...
Stochastic gradient descent - ✔ ✔ ...
Algorithm - ✔ ✔ A series of repeatable steps for carrying out a certain type of
task with data
AngularJS - ✔ ✔ An open-source JavaScript framework maintained by Google and
the community. Lets you create single-page web applications to display results
Artificial intelligence - ✔ ✔ The ability to have machines act with apparent
intelligence. Can be through symbolic logic or statistical analysis
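Backpropagation - ✔ ✔ An algorithm that computes how much each weight in a
neural network contributes to the error by propagating the error backward
through the layers. The resulting gradients are used by gradient descent to
iteratively adjust the weights.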
Bayes' Theorem - ✔ ✔ An equation for calculating the probability that something
is true given that something potentially related is true:
P(A|B) = P(B|A) * P(A) / P(B)
Good for situations where you need to account for false positives
(e.g., disease testing).
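A minimal sketch of the disease-testing case, applying the formula above. The
function name and the numbers (1% prevalence, 99% sensitivity, 5% false positive
rate) are made up for illustration and are not from the exam material.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' theorem."""
    # P(B): total probability of a positive test, from sick and healthy people.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    # P(A|B) = P(B|A) * P(A) / P(B)
    return sensitivity * prior / p_positive


# Hypothetical numbers: 1% prevalence, 99% sensitivity, 5% false positives.
p = posterior(prior=0.01, sensitivity=0.99, false_positive_rate=0.05)
print(round(p, 3))  # 0.167
```

Even with an accurate test, most positives here are false positives because the
condition is rare.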
Bayesian network - ✔ ✔ Directed acyclic graphs that compactly represent the
conditional dependencies between random variables for a given problem
Bias - ✔ ✔ In machine learning, when a learner consistently learns the same
thing wrong
Big data - ✔ ✔ Working with large datasets that usually require distributed storage
Binomial distribution - ✔ ✔ The distribution of the number of successes in a
fixed number of independent trials, each with two mutually exclusive possible
outcomes and a constant probability of success. A discrete probability
distribution. Graphed using histograms.
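As a rough illustration of the definition above, the probability of exactly k
successes in n trials can be computed directly. The function name and the
coin-flip numbers are assumptions chosen for the example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    # Number of ways to choose which k trials succeed, times the
    # probability of any one such arrangement.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example: 10 fair coin flips.
probs = [binomial_pmf(k, 10, 0.5) for k in range(11)]
print(round(binomial_pmf(5, 10, 0.5), 4))  # 0.2461
```

The values in `probs` are the bar heights you would plot for this distribution,
and they sum to 1.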
Centroid - ✔ ✔ Center of a cluster
Chi-square test - ✔ ✔ Statistical test of whether two categorical variables
are independent
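A sketch of how the test statistic might be computed for a contingency table of
two categorical variables. The function name and the tables are hypothetical;
a full test would also compare the statistic against a chi-square critical
value (3.841 for one degree of freedom at the 0.05 level).

```python
def chi_square_statistic(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical 2x2 tables: no association vs. perfect association.
print(chi_square_statistic([[10, 10], [10, 10]]))  # 0.0
print(chi_square_statistic([[20, 0], [0, 20]]))    # 40.0
```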
Classification - ✔ ✔ The assignment of items to one of two or more discrete
categories. A classic machine learning task, e.g., spam or ham, movie genres.
Supervised learning.
Clustering - ✔ ✔ Unsupervised learning technique for dividing data into
groups of similar items, as determined by an algorithm
Coefficient - ✔ ✔ A number or algebraic symbol prefixed as a multiplier to
a variable or unknown quantity (slope in line equation)
Computational linguistics - ✔ ✔ Also called natural language processing
(NLP). Converting the text of spoken languages into structured data to extract
valuable information
Confidence interval - ✔ ✔ A range specified around an estimate to indicate the
margin of error, combined with a probability that a value will fall in that range
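A minimal sketch of an approximate 95% confidence interval for a mean, assuming
a normal approximation (z = 1.96). The function name and sample values are made
up; for small samples like this one, a t-distribution would usually be more
appropriate.

```python
from math import sqrt
from statistics import mean, stdev


def mean_confidence_interval(sample, z=1.96):
    """Approximate confidence interval for the mean of a sample."""
    m = mean(sample)
    # Margin of error: z times the standard error of the mean.
    margin = z * stdev(sample) / sqrt(len(sample))
    return m - margin, m + margin


# Hypothetical measurements.
low, high = mean_confidence_interval([4.8, 5.1, 5.0, 4.9, 5.2])
print(low, high)
```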
Continuous variable - ✔ ✔ A variable whose value can be any of infinitely many
values within a range
Correlation coefficient - ✔ ✔ Measure of how closely two variables
correlate. Ranges from -1 to 1
Correlation - ✔ ✔ The degree of relative correspondence between two variables
Covariance - ✔ ✔ A measure of how two variables, whose values are observed at
the same time, vary together
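The correlation and covariance cards above can be tied together with a short
sketch: the (Pearson) correlation coefficient is the covariance scaled by both
variables' standard deviations, which is why it stays between -1 and 1. The
function names and data are illustrative assumptions.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation coefficient: covariance rescaled to [-1, 1]."""
    # covariance(xs, xs) is the sample variance of xs.
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical data: ys rises exactly in step with xs.
xs, ys = [1, 2, 3, 4], [2, 4, 6, 8]
print(covariance(xs, ys))
print(correlation(xs, ys))
```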
Cross-validation - ✔ ✔ A set of techniques that divide data into training sets
and test sets, usually 80-20. The training sets include the correct
categorization and are used to fit a model, which is then evaluated on the
test sets
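A sketch of the 80-20 split described above, plus one common way to rotate the
test set (k-fold). Both helper names are hypothetical, and the striped fold
layout is just one reasonable choice.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle items and split them into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) once for each of k folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        # Every fold except the current one goes into the training set.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


train, test = train_test_split(range(10))
print(len(train), len(test))  # 8 2
```

With k = 5, each fold holds out 20% of the data, so every item is used for
testing exactly once.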
CSV - ✔ ✔ Comma-separated values, a common data file type
D3 - ✔ ✔ Data-Driven Documents, a JavaScript library that eases the creation
of interactive visualizations embedded in web pages
Data engineer - ✔ ✔ A specialist in data wrangling who builds infrastructure for
real, tangible analysis. Runs ETL (extract, transform, load) processes
Data Mining - ✔ ✔ The use of computers to analyze large data sets to look
for patterns that let people make business decisions
Data science - ✔ ✔ The ability to extract knowledge and insights from large
and complex data sets
Data structure - ✔ ✔ A particular arrangement of units of data such as an array
or a tree