D204: THE DATA ANALYTICS JOURNEY COMPREHENSIVE
BASIC ANALYTICAL QUESTIONS AND DETAILED
CORRECT ANSWERS 2024 UPDATE
Data scientists are able to find ______, _________, and _____ in unstructured data. -
(correct answer) order, meaning, and value
What is involved in the planning phase? - (correct answer) 1. Define goals
2. Organize resources
3. Coordinate people
4. Schedule project
What is involved in the wrangling phase? - (correct answer) 5. Get data
6. Clean data
7. Explore data
8. Refine data
What is involved in the Modeling phase? - (correct answer) 9. Create model
10. Validate model
11. Evaluate model
12. Refine model
What is involved in the Applying phase? - (correct answer) 13. Present model
14. Deploy model
15. Revisit model
16. Archive assets
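The sixteen steps above can be kept as a small lookup table. This is only a study-aid sketch: the names FRAMEWORK and phase_of are hypothetical, and the code assumes each phase holds exactly four steps, numbered 1 to 16 as in the cards above.

```python
# Phase -> ordered steps, as listed in the cards above.
FRAMEWORK = {
    "Planning": ["Define goals", "Organize resources",
                 "Coordinate people", "Schedule project"],
    "Wrangling": ["Get data", "Clean data", "Explore data", "Refine data"],
    "Modeling": ["Create model", "Validate model",
                 "Evaluate model", "Refine model"],
    "Applying": ["Present model", "Deploy model",
                 "Revisit model", "Archive assets"],
}


def phase_of(step_number):
    """Return (phase, step name) for a 1-based step number from 1 to 16."""
    if not 1 <= step_number <= 16:
        raise ValueError("step_number must be between 1 and 16")
    phases = list(FRAMEWORK)  # dicts keep insertion order
    phase = phases[(step_number - 1) // 4]
    return phase, FRAMEWORK[phase][(step_number - 1) % 4]


print(phase_of(6))
```

For example, phase_of(6) looks up step 6, which falls in the Wrangling phase.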
____________________ are programming languages that are very frequently used for
data manipulation and modeling. - (correct answer) Python and R
___________ are general-purpose languages that are used for the back end, the
foundational elements of data science, and they provide maximum speed. -
(correct answer) C, C++, and Java
____________________ is a language for working with relational databases to do
queries and data manipulation. - (correct answer) SQL
What does SQL stand for? - (correct answer) structured query language
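As a minimal sketch of a SQL query against a relational database, the example below uses Python's built-in sqlite3 module with an in-memory database. The shirt_sales table and its rows are hypothetical and made up for illustration.

```python
import sqlite3

# In-memory database; the table name and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shirt_sales (color TEXT, size TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO shirt_sales VALUES (?, ?, ?)",
    [("red", "M", 3), ("red", "L", 4), ("blue", "M", 5)],
)

# Query: total quantity sold per color, largest total first.
rows = conn.execute(
    "SELECT color, SUM(qty) AS total FROM shirt_sales "
    "GROUP BY color ORDER BY total DESC"
).fetchall()
conn.close()

print(rows)
```

The SELECT ... GROUP BY statement is the query part; the INSERT statements are the data manipulation part.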
This is where you actually create the statistical model: you run the linear regression,
build the decision tree, or train the deep learning neural network. - (correct answer)
Modeling
These are the developers and the system architects, the people who focus on the
hardware and the software that make data science possible - (correct answer) Data
engineers
This is the phase of collecting data. - (correct answer) Data acquisition
Which phase? - Working with stakeholders to help them ask better questions so that
both they and you understand the outcome. - (correct answer) Discovery
What are the 4 parts of the data analytics cycle? - (correct answer) Planning, Wrangling,
Modeling, and Applying
This phase is also known as the discovery phase. During this phase, an analyst defines
the major questions of interest that need to be answered, understands the needs of the
stakeholders, and assesses the resource constraints in the project. - (correct answer)
Business understanding
____________________ is the person who champions the vision of the project and has
the authority to allocate resources. - (correct answer) The project sponsor
__________________ is responsible for making sure things get done on time and
within budget and removes roadblocks. - (correct answer) Project manager
___________ is when new requirements are added to the project that increase the
time/resources needed to complete it. - (correct answer) Scope creep
What are the 3 types of analysis? - (correct answer) Descriptive, Predictive,
Prescriptive
___________________________ describes the data that is present. Mean, Median,
Mode, counting things. How many of each size and color of shirt were sold in the last
month? Do we sell more shirts in the summer vs winter? - (correct answer)
Descriptive analysis
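A minimal sketch of descriptive analysis using only Python's standard library. The sales figures and shirt colors are hypothetical and made up for illustration.

```python
import statistics
from collections import Counter

# Hypothetical units sold per day and colors of the shirts sold.
daily_units = [4, 7, 7, 2, 10]
colors_sold = ["red", "blue", "red", "green", "red", "blue"]

mean_units = statistics.mean(daily_units)      # arithmetic average
median_units = statistics.median(daily_units)  # middle value when sorted
mode_units = statistics.mode(daily_units)      # most frequent value
color_counts = Counter(colors_sold)            # counting things

print(mean_units, median_units, mode_units)
print(color_counts.most_common())
```

Every value here describes data that is already present; nothing is predicted or recommended.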
____________________ makes predictions about the future state of the business, such as
forecasting volumes. Based on last summer and winter, what will we sell next year? -
(correct answer) Predictive analytics
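One simple way to forecast is to fit a straight line to past values and extend it one period forward. The sketch below does this with an ordinary least-squares line written out by hand; the yearly sales numbers are hypothetical, and a straight-line trend is an assumption chosen for illustration, not a method the cards prescribe.

```python
# Hypothetical yearly shirt sales.
years = [2020, 2021, 2022, 2023]
sales = [100, 120, 140, 160]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(sales) / n

# Ordinary least-squares slope and intercept for y = slope * x + intercept.
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(years, sales))
    / sum((x - mean_x) ** 2 for x in years)
)
intercept = mean_y - slope * mean_x

# Extend the fitted line one year past the observed data.
forecast_2024 = slope * 2024 + intercept
print(forecast_2024)
```

Real sales data is rarely this regular, so a forecast like this would normally be validated against held-out data before anyone relies on it.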
_______________________ is analysis with an end goal of making a recommendation.
What colors and sizes of shirts should we sell to maximize profits? - (correct answer)
Prescriptive analytics
______________________ is just looking at any variable over time - (correct answer)
Time series analysis
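A common first step when looking at a variable over time is to smooth it with a moving average. The sketch below is one illustrative technique, not the only form of time series analysis; the function name moving_average and the monthly figures are hypothetical.

```python
def moving_average(values, window):
    """Return the mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical monthly sales, in time order.
monthly_sales = [10, 12, 14, 20, 18, 22]
smoothed = moving_average(monthly_sales, 3)
print(smoothed)
```

The smoothed series is shorter than the input by window - 1 values, because each average needs a full window of observations.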
____________________ is a programming language that is specific to statistics. It also
has capabilities to visualize data. - (correct answer) R
_______________ is a multipurpose programming language that has libraries that extend
its capabilities to do statistical analysis. - (correct answer) Python
______________________ are platforms that specialize in visualization. This is where
you can make graphs and charts for presentations and data storytelling to executive
leaders. - (correct answer) Tableau and Power BI
_______________________ are instant messaging platforms that facilitate communication in a
faster, but less formal, way than email. - (correct answer) Teams, Slack
A European Union law requiring that its citizens give informed consent and be able
to request or delete their own data that you collect. - (correct answer) GDPR
When the researching organization consciously ignores data that calls their results into
question or only presents one side of the results that puts them in a positive light. -
(correct answer) Conflict of interest
Sometimes data might not be available and the analyst will use tools such as web
scraping or surveys to acquire it during which phase? - (correct answer) Data
acquisition
The ____________ states that the sampling distribution of the sample means
approaches a normal distribution as the sample size gets larger. For example, if you
took 50 random people from a population and computed their mean age, then took
another 50 random people and computed their mean age, and so forth, all of those
means would approximately follow the normal distribution (bell curve). - (correct answer) Central Limit Theorem
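The sampling idea in this card can be simulated directly. The sketch below is an illustration under stated assumptions: the population of ages is hypothetical and deliberately uniform (not bell-shaped), the sample size of 50 comes from the card, and the number of repeated samples (2,000) is an arbitrary choice.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population of ages, uniform between 18 and 90 (not normal).
population = [random.randint(18, 90) for _ in range(10_000)]

# Draw many samples of 50 people and record each sample's mean age.
sample_means = [
    statistics.mean(random.sample(population, 50)) for _ in range(2_000)
]

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)
means_mean = statistics.mean(sample_means)
means_sd = statistics.stdev(sample_means)

# Theory: sample means center on the population mean, with a spread of
# roughly pop_sd / sqrt(sample size).
expected_sd = pop_sd / 50 ** 0.5

# For a bell curve, roughly 68% of values lie within one standard deviation.
within_one_sd = sum(
    abs(m - means_mean) <= means_sd for m in sample_means
) / len(sample_means)

print(round(pop_mean, 2), round(means_mean, 2))
print(round(expected_sd, 2), round(means_sd, 2))
print(round(within_one_sd, 3))
```

Even though individual ages are spread evenly across the range, the sample means cluster tightly around the population mean, which is the behavior the theorem describes.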
In this phase, the analyst begins to understand the basic nature of data and the
relationships within it. This phase often relies on the use of data visualization tools and
numerical summaries, such as measures of central tendency and variability. - (correct
answer) Data Exploration
__________________ enables an analyst to move beyond describing the data to
creating models that enable predicting outcomes of interest. - (correct answer)
Predictive Modeling
Tools such as _______________ play an important role in automating the training and
use of models. - (correct answer) Python and R
In this phase, an analyst tells the story of the data and uses graphs or interactive
dashboards to inform others of the findings from the analyses. - (correct answer)
Reporting and Visualization
Even if you have a wide spread of a variable, let's say, age in a population, and you
take lots of sample groups, the mean age of those sample groups would tend to have a
normal distribution. - (correct answer) Central Limit Theorem
This is the phase of collecting data. Frequently, data will be retrieved from a database,
perhaps a component of a data warehouse, by using a language like SQL. - (correct
answer) Data Acquisition
"Collect the data" is synonymous with ____________________ - (correct answer)
data acquisition
Exploring the data could be seen either in "________________" or "_____________" -
(correct answer) "Prepare the data" or "Create a model"
Predictive or data mining models could be considered in the
"_________________________" grouping. - (correct answer) Create a model
____________________ examines the distances between each point and the closest
point to it, and then compares these to expected values for a random sample of points
from a CSR (complete spatial randomness) pattern. - (correct answer) Nearest
Neighbor
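One common way to make this comparison is the ratio of the observed mean nearest-neighbor distance to the mean expected under CSR, which is 1 / (2 * sqrt(density)). The sketch below assumes that ratio form, ignores edge effects, and uses hypothetical point sets in a unit square; the function names are made up for illustration.

```python
import math
import random


def mean_nearest_neighbor_distance(points):
    """Average distance from each point to its closest other point."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


def nearest_neighbor_ratio(points, area):
    """Observed mean distance divided by the mean expected under CSR."""
    observed = mean_nearest_neighbor_distance(points)
    density = len(points) / area
    expected = 0.5 / math.sqrt(density)  # CSR expectation, no edge correction
    return observed / expected


random.seed(1)
# Hypothetical patterns in a unit square: one random, one on a regular grid.
random_points = [(random.random(), random.random()) for _ in range(400)]
grid_points = [(x / 20, y / 20) for x in range(20) for y in range(20)]

random_ratio = nearest_neighbor_ratio(random_points, area=1.0)
grid_ratio = nearest_neighbor_ratio(grid_points, area=1.0)
print(round(random_ratio, 2), round(grid_ratio, 2))
```

A ratio near 1 is consistent with a random pattern, a ratio well above 1 suggests evenly dispersed points, and a ratio well below 1 suggests clustering.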
______________ is a simple mathematical formula used for calculating conditional
probabilities. - (correct answer) Bayes' Theorem
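Bayes' Theorem states that P(A|B) = P(B|A) * P(A) / P(B). The sketch below applies it to a hypothetical screening test; the function name and all three input probabilities are assumptions chosen for illustration.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A|B) given P(A), P(B|A), and P(B|not A)."""
    # Total probability of observing B, whether or not A holds.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of people have the condition, the test detects
# 90% of true cases, and 5% of healthy people test positive anyway.
posterior = bayes_posterior(prior=0.01, likelihood=0.90,
                            false_positive_rate=0.05)
print(round(posterior, 3))
```

With these inputs the posterior is about 0.154, so most positive results still come from people without the condition, because the condition itself is rare.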
Interactive dashboard tools, such as _____________, allow even the novice user the
ability to interact with the data and spot trends and patterns. - (correct answer)
Tableau
Data Acquisition (Step 5), Data Cleaning (Step 6), and Data Exploration (Step 7) in this
framework all fall under the "____________" domain. - (correct answer) "Wrangling"
domain.
The ______________ section would contain the ideas of predictive modeling as well as
data mining/machine learning. - (correct answer) "Modeling"
These are people who have extensive work in computer science and in mathematics.
They work in deep learning. They work in artificial intelligence. And they're the ones who
have the intimate understanding of the algorithms and understand exactly how they're
working with the data to produce the results that you're looking for. - (correct answer)
Machine learning specialists
They focus on domain-specific research like, for instance, physics and genetics are
common, so is astrophysics, so is medicine, so is psychology, and these kinds of
researchers, while they connect with data science, they are usually better versed in the