Data Science Foundation Fundamentals
with Complete Solutions
According to the example calculation in the video, what information do you have to have
in order to use calculus? - ANSWER-a function that describes the relationship between
price and sales
Feedback
In order to use calculus to find the best price for maximizing revenue, you must first
have a formula that says how sales are related to price.
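As a rough sketch of that calculation, assume a hypothetical linear demand function, sales = a - b * price. The constants a and b below are made up for illustration and do not come from the video. Revenue is then R(p) = p * (a - b * p); setting the derivative R'(p) = a - 2 * b * p to zero gives the price p = a / (2 * b).

```python
def revenue(price: float, a: float, b: float) -> float:
    """Revenue under the assumed linear demand: sales = a - b * price."""
    return price * (a - b * price)


def best_price(a: float, b: float) -> float:
    """Price where the derivative of revenue, a - 2*b*price, equals zero."""
    if b <= 0:
        raise ValueError("b must be positive so that sales fall as price rises")
    return a / (2 * b)


# Hypothetical demand: 1000 units at a price of zero, 20 fewer per unit of price.
A, B = 1000.0, 20.0
p_star = best_price(A, B)
print(p_star, revenue(p_star, A, B))
```

Without the demand function there is nothing to differentiate, which is why the formula relating sales to price has to come first.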
Actionable Insights: - ANSWER-Data and data science are for doing. Focus on things that
are controllable (specific) and practical (ROI): the impact must be large enough to
justify the effort. Build up through sequential steps.
Agency of Algorithms and Decision Makers - ANSWER-Recommendations: the algorithm
makes suggestions that you can accept or reject. "Based on your shopping patterns, may we
suggest XYZ"; "based on what you've read, you may like this." Your own past behavior is
used to give you recommendations.
Human in the Loop: the algorithm makes and implements decisions, but a human is there
to intervene or make the final decision when needed. Self-driving cars.
Human Accessible: the algorithm makes the decision, but you need to be able to understand
how it reached that decision. Online mortgage applications.
Machine-Centric: machines talk to other machines. A smart watch talks to a phone. It's the
Internet of Things.
Aggregating Models - ANSWER-Any one estimate may be too high or too low. When you
combine several different models (central limit theorem), the errors tend to cancel out
and you end up with a composite estimate that is generally closer to the true value.
It takes extra time and effort but gives you multiple perspectives, compensating for
weaknesses and building on strengths. You can find the signal amid the noise. More stable.
Many eyes on the same problem.
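The error-cancelling effect can be illustrated with a small simulation. This is only a sketch under strong assumptions: each "model" is a hypothetical unbiased guess with independent Gaussian noise, and the true value, noise level, and number of models are invented for the example.

```python
import random
import statistics


def simulate(true_value=100.0, n_models=25, noise_sd=10.0, n_trials=2000, seed=0):
    """Compare the average error of one model with that of the averaged ensemble."""
    rng = random.Random(seed)
    single_errors, ensemble_errors = [], []
    for _ in range(n_trials):
        # Each hypothetical model guesses the true value plus independent noise.
        guesses = [rng.gauss(true_value, noise_sd) for _ in range(n_models)]
        single_errors.append(abs(guesses[0] - true_value))
        ensemble_errors.append(abs(statistics.fmean(guesses) - true_value))
    return statistics.fmean(single_errors), statistics.fmean(ensemble_errors)


single_err, ensemble_err = simulate()
print(f"single model: {single_err:.2f}, average of models: {ensemble_err:.2f}")
```

Real models usually share data and assumptions, so their errors are correlated and the improvement is smaller than in this idealized setup.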
Analyst - ANSWER-Day to day data tasks
Web analytics, SQL, visualizations.
Good for business decision-making.
Anomaly detection - ANSWER-The process of identifying rare or unexpected items or
events in a data set that do not conform to the other items in the data set. This can lead to
serendipity: unexpected insights and untapped potential or value.
Finding anomalies: an anomaly can be fraud, a process failure, or potential value. All have
one thing in common: they are outliers that don't follow expected patterns.
Regression
Bayesian Analysis
Hierarchical Clustering
Neural Networks
Dealing with rare events leads to unbalanced models.
Difficult data (biometrics, multimedia)
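To make the idea of "not following expected patterns" concrete, here is a minimal sketch that flags outliers by z-score. This is a simpler technique than the methods listed above and is swapped in purely for illustration; the data values and the threshold of 2.5 standard deviations are assumptions.

```python
import statistics


def find_outliers(values, threshold=2.5):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / sd > threshold]


# Hypothetical daily transaction amounts with one unusual entry.
amounts = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
print(find_outliers(amounts))
```

Because the outlier itself inflates the mean and standard deviation, z-scores can miss anomalies in small samples, which is one reason rare events are hard to model.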
API: Application Programming Interface - ANSWER-An API isn't a source of data; rather, it's
a way of sharing data. It can take data from one application to another or from a server
to your computer. It's the thing that routes the data, translates it, and gets it ready for
use. It allows you to access data and include it in your data science programming. JSON
(JavaScript Object Notation) is commonly used here and can be read from languages such as
Python and Java.
Social APIs (Twitter, Facebook)
Utilities (Dropbox, Google)
Commerce (Stripe, Mailchimp, Slack)
It can become a process or an app.
What kind of data can be accessed with APIs? Both proprietary and open data.
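As a small illustration of how JSON from an API ends up in a program, the sketch below parses a response body with Python's standard `json` module. The payload and its field names (`city`, `readings`, `temp_c`) are hypothetical and stand in for whatever a real API would return; no network call is made.

```python
import json

# Hypothetical response body, as an API might send it over the network.
response_body = """
{"city": "Oslo",
 "readings": [{"hour": 9, "temp_c": 4.5},
              {"hour": 12, "temp_c": 7.0}]}
"""

# json.loads turns the JSON text into ordinary Python dicts and lists.
data = json.loads(response_body)
temps = [reading["temp_c"] for reading in data["readings"]]
average_temp = sum(temps) / len(temps)
print(data["city"], average_temp)
```

Once parsed, the data behaves like any other Python structure and can be passed on to analysis code.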
Area Chart: - ANSWER-Similar to line charts, except the areas under the lines are filled
in.
Artificial Intelligence - ANSWER-Algorithms that learn from data; broadly: machine
learning.
Strong or General AI: a replica of the human brain that can solve any cognitive task.
Weak or Narrow AI: algorithms that focus on specific well-defined tasks.
You can't do AI without data science.
Bar chart: - ANSWER-Compares values for different items or over time. Can also show a distribution.
Bayes' Theorem - ANSWER-Gives the posterior probability as a function of the likelihood, the
prior probability, and the probability of getting the data you found. Used for medical
diagnosis: if a person tests positive on a test that is 90% effective, what is the
probability that the person has the disease?
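The diagnosis question can be worked through with P(disease | positive) = P(positive | disease) * P(disease) / P(positive). The sketch below reads "90% effective" as 90% sensitivity and 90% specificity, and assumes that 1% of people have the disease; both readings are assumptions for illustration, not figures from the course.

```python
def posterior(prior: float, sensitivity: float, specificity: float) -> float:
    """P(disease | positive test) by Bayes' theorem."""
    true_positive = sensitivity * prior               # likelihood * prior
    false_positive = (1 - specificity) * (1 - prior)  # healthy people who test positive
    evidence = true_positive + false_positive         # P(positive test)
    return true_positive / evidence


# Assumed values: 1% prevalence, 90% sensitivity, 90% specificity.
p = posterior(prior=0.01, sensitivity=0.90, specificity=0.90)
print(f"{p:.1%}")
```

Under these assumptions the posterior is only about 8%, because false positives among the many healthy people outnumber true positives among the few who are ill.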
Big Data: - ANSWER-Definition: data with unusual volume, velocity, and variety.
You can do big data without the full toolkit of data science.
Bubble Charts - ANSWER-A type of scatter plot with circular symbols (bubbles) used to rank
your data. Can be overlaid on a map.
Business Intelligence - ANSWER-Getting insights to do something better in your
business. Emphasizes speed, accessibility, and insight. Often relies on structured
dashboards.
Data science helps set up BI, makes it possible. Business intelligence gives purpose to
data science. Collect and clean data; build model outcomes; find trends and anomalies.
C, C++, Java - ANSWER-General purpose languages for back end and maximum
speed. (JSON)