WGU D206 Data Processing (Data Cleaning)
Comprehensive Exam || With Questions &
Answers (100% Accurate)
Conceptial Researchers
conceptialresearch@gmail.com
2024
What are the 7 Phases of the Data Analytics Life Cycle? - ANSWER - 1. Business
Understanding/Discovery Phase
2. Data Acquisition
3. Data Cleaning
4. Data Exploration
5. Predictive Modeling
6. Data Mining
7. Data Reporting
Explain the Business Understanding & Discovery Phase - ANSWER - The analyst
defines the question(s) of interest that need to be answered. The analyst will
determine the needs of the stakeholders, assess resource constraints, and define
project outcomes.
Explain the Data Acquisition Phase - ANSWER - The process of collecting and
storing data for easy retrieval from a database or even a component of a data
warehouse. Web scraping and surveys can be used to acquire data.
Explain the Data Cleaning Phase - ANSWER - Also known as data cleansing, data
wrangling, data munging, & feature engineering. Analysts use SQL, Python, R, or
Excel to transform and modify data.
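A minimal Python sketch of what transforming and modifying data can look like in this phase. The records, field names, and the `clean_rows` helper are hypothetical and chosen only for illustration; real projects often use a library such as pandas instead.

```python
# Hypothetical raw records; the field names and values are made up for illustration.
raw_rows = [
    {"customer_id": "101", "city": " Austin ", "age": "34"},
    {"customer_id": "102", "city": "dallas", "age": ""},
    {"customer_id": "101", "city": " Austin ", "age": "34"},  # exact duplicate
    {"customer_id": "103", "city": "HOUSTON", "age": "41"},
]


def clean_rows(rows):
    """Standardize text, convert types, flag missing values, drop duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        record = {
            "customer_id": int(row["customer_id"]),
            # Trim stray whitespace and use consistent capitalization.
            "city": row["city"].strip().title(),
            # Empty strings become None so missing values are explicit.
            "age": int(row["age"]) if row["age"].strip() else None,
        }
        key = tuple(record.items())
        if key not in seen:  # keep only the first copy of each record
            seen.add(key)
            cleaned.append(record)
    return cleaned


cleaned_rows = clean_rows(raw_rows)
```

Here the duplicate record is dropped, city names are standardized, and the missing age is stored as `None` rather than as an empty string.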
Explain the Data Exploration Phase - ANSWER - In this phase the analyst begins to
understand the basic nature of the data, the relationships within it (between data
variables), the structure of the dataset, the presence of outliers, and the distribution
of the data. This phase utilizes data visualization tools and numerical
summaries (measures of central tendency and variability).
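The numerical summaries mentioned above can be sketched with Python's standard `statistics` module. The sample values are hypothetical and serve only as an illustration.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency
mean_value = statistics.mean(values)      # arithmetic average
median_value = statistics.median(values)  # middle value of the sorted data
mode_value = statistics.mode(values)      # most frequent value

# Measures of variability
std_dev = statistics.stdev(values)        # sample standard deviation
value_range = max(values) - min(values)   # spread between the extremes

print(mean_value, median_value, mode_value, round(std_dev, 2), value_range)
```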
Explain the Predictive Modeling Phase - ANSWER - This phase allows the analyst to
move beyond describing the data by creating models that enable predictions of
outcomes. Python and R are used to automate the training and use of models.
Explain the Data Mining Phase - ANSWER - This phase looks for patterns in large
sets of data. It is also called Machine Learning, a specialized segment of data mining
techniques that continually update to improve modeling over time.
What is a BIG difference between Data Exploration Phase and Data Mining Phase? -
ANSWER - Both phases uncover patterns; however, the main difference is:
(a) Data Exploration is the initial step to uncovering patterns using both manual and
automated methods.
(b) Data Mining is an in-depth step to discover patterns using only automated
methods such as Machine Learning.
Name Tools & Techniques used in the Business Understanding & Discovery Phase -
ANSWER - Tools: Scope Statement, Stakeholder Register, Gantt Chart, Network
Diagram
Techniques: Critical Path Method, KPI, Budget Estimation, Schedule Estimation,
SWOT Analysis
Name Tools & Techniques used in the Data Acquisition Phase - ANSWER - Tools:
SQL, Web Scraping Software, Survey, Input Data (self-generated data), NoSQL
(used to collect unstructured data).
Techniques: ETL (Extract, Transform, Load), API (Application Programming
Interface), Web Scraping
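The ETL technique can be sketched as follows, assuming a small CSV text as the source and an in-memory SQLite database as the destination. The `orders` table, its columns, and the sample values are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for a CSV export; names are illustrative.
source_csv = "order_id,amount\n1, 19.99\n2,5.00\n3, 42.50\n"

# Extract: read the rows from the source.
extracted = list(csv.DictReader(io.StringIO(source_csv)))

# Transform: convert text fields to proper numeric types.
transformed = [(int(r["order_id"]), float(r["amount"])) for r in extracted]

# Load: store the rows in a database table for easy retrieval.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", transformed)
conn.commit()

# Retrieve summary figures with SQL once the data is loaded.
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
total_amount = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```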
Name Tools & Techniques used in the Data Cleaning Phase - ANSWER - Tools:
Python, R, SQL, Excel.
Techniques: Data Reduction (to optimize storage capacity), Modification,
Transformation, Anomaly Detection.
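One common way to approach anomaly detection is the 1.5 x IQR rule, sketched below. The guide does not prescribe a specific method, so this rule, the `find_outliers` helper, and the sample readings are assumptions made for illustration.

```python
import statistics


def find_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    return [v for v in values if v < lower_bound or v > upper_bound]


# Hypothetical readings with one obviously unusual value.
readings = [10, 12, 11, 13, 12, 14, 11, 95]
outliers = find_outliers(readings)
```

With these readings, only the value 95 falls outside the bounds. Whether a flagged value is removed, corrected, or kept is a judgment call for the analyst.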
Name Tools & Techniques used in the Data Exploration Phase - ANSWER - Tools:
Distributions (normal or skewed curve), visualization tools (Tableau, R, Python,
RStudio, and histograms), statistical tools such as mean, median, and mode.
Techniques: Correlation Discovery, Pattern Discovery, Visualization (histograms,
charts, tables, boxplots, etc.), Variability (standard deviation, quartiles)
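Correlation discovery can be illustrated with a Pearson correlation coefficient computed by hand. The `pearson_correlation` helper and the two sample variables are hypothetical.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical variables: hours studied and the matching exam scores.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [52, 58, 65, 70, 78]
r = pearson_correlation(hours_studied, exam_scores)
```

A value of `r` close to 1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little or no linear relationship.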
Name Tools & Techniques in the Predictive Modeling Phase - ANSWER - Tools:
Python and R
Techniques: Data Modeling, Correlation Modeling, Regression Modeling, Time
Series Modeling, Cross Validation, Classification Models, & Training Models.
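As a small illustration of regression modeling, the sketch below fits a straight line by ordinary least squares and uses it to predict a new value. The `fit_line` helper and the data are hypothetical; in practice a library such as scikit-learn would usually handle this.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data that follows y = 2x + 1 exactly.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)

# Use the fitted model to predict the outcome for an unseen input.
predicted = slope * 6 + intercept
```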
Name Tools & Techniques in the Data Mining Phase - ANSWER - Tools: Python and
R
Techniques: Training dataset to build models, testing dataset for model evaluation,
classification, clustering, AI, Machine Learning, Deep Learning.
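Splitting data into a training dataset and a testing dataset can be sketched as below. The `train_test_split` helper, the 80/20 ratio, and the fixed seed are assumptions made for illustration.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and split it into train and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical dataset of ten records.
records = list(range(10))
train_set, test_set = train_test_split(records)
```

The model is built on `train_set` only, and `test_set` is held back so the model can be evaluated on data it has not seen.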
Name Tools & Techniques in the Data Reporting Phase - ANSWER - Tools:
Dashboards, Tableau, Storytelling (a feature of Tableau), graphs, charts, images, etc.
Techniques: Visualization and Stakeholder Communication.
Dealing with data types (unstructured, semi-structured, quantitative, and
qualitative) AND data quality issues (uniqueness, relevance, reliability, validity, and
accuracy) that make access difficult are POTENTIAL PROBLEMS in what phase? -
ANSWER - Data Acquisition Phase
What DA Phase includes the following POTENTIAL PROBLEMS? With large
audience consumption, mistakes can cause bad business decisions and loss of
revenue. Using improper scales for graphs could lead to inaccurate
interpretations of the story. - ANSWER - Data Reporting Phase
A lack of clear focus on the stakeholders, timeline, limitations, and budget, which
could derail the analysis, is a POTENTIAL PROBLEM in what phase? - ANSWER -
Business Understanding and Discovery Phase