Fundamentals of Data Analytics / Science
Introduction
Data analysis is the process of inspecting, cleansing,
transforming, and modelling data with the goal of discovering
useful information, informing conclusions, and supporting
decision-making.
Data science is an interdisciplinary field that uses scientific
methods, processes, algorithms, and systems to extract
knowledge and insights from structured and unstructured
data. In this sense, data science becomes a specific part of
data analysis.
In this course, data analytics (processing data) covers the
full range of data-related tasks, from collection, preparation,
analysis, and visualisation to curation and storage. Example
applications include movie recommendation systems, customer
segmentation, sentiment analysis models, and credit card fraud
detection.
Data analysis can be divided into four types:
1. Descriptive – What happened? Examine the data to
understand what it contains.
2. Diagnostic – Why did it happen? Use what we understand
from the data to gain insight into the causes.
3. Predictive – What will happen next? Use the data to make
predictions about the future.
4. Prescriptive – How can we make it happen?
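As a rough illustration of two of these types, the sketch below
summarises a small series (descriptive) and then extrapolates it
one step ahead with a least-squares straight line (predictive).
The sales figures and the helper name fit_line are hypothetical
and chosen only for illustration; a straight-line fit is just one
simple way to make a prediction, not a recommended method.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative only, not real data).
sales = [10.0, 12.0, 14.0, 16.0, 18.0]
months = list(range(1, len(sales) + 1))

# Descriptive: summarise what happened.
summary = {"min": min(sales), "max": max(sales), "mean": mean(sales)}


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


# Predictive: extend the fitted trend to the next month.
slope, intercept = fit_line(months, sales)
forecast = slope * (len(sales) + 1) + intercept

print(summary)
print(f"Forecast for month {len(sales) + 1}: {forecast:.1f}")
```

Diagnostic and prescriptive analysis usually need domain
knowledge as well as data, so they are not covered by this
sketch.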
How do we analyse data?
Clustering – We start with data about which little is known
in advance, for example, a list of clients. The goal of
clustering is to group similar items together, for example,
grouping clients with similar behaviours.
Classification – This applies when we already know
something about the data, for example, which people tend
to spend more and which tend to spend less. Given these two
groups, each new data point is then assigned to one of them.
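The difference between the two can be sketched in a few lines of
Python. This is a minimal illustration under stated assumptions:
the spend figures, the group labels, and the function names
kmeans_1d and classify are all hypothetical, and the two
techniques used (a simple one-dimensional k-means for clustering
and a nearest-mean rule for classification) are only one possible
choice for each task.

```python
from statistics import mean


def kmeans_1d(values, centroids, iterations=10):
    """Group 1-D values around the given starting centroids (simple k-means)."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # Assign each value to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (unchanged if empty).
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


def classify(value, labelled_groups):
    """Return the label whose group mean is closest to the value."""
    return min(labelled_groups, key=lambda label: abs(value - mean(labelled_groups[label])))


# Clustering: no labels are known, so the groups are discovered from the data.
spend = [20, 25, 30, 200, 220, 240]  # hypothetical monthly spend per client
centroids, clusters = kmeans_1d(spend, centroids=[min(spend), max(spend)])
print(clusters)

# Classification: the groups are already known, so new data is assigned to one.
labelled = {"low spender": [20, 25, 30], "high spender": [200, 220, 240]}
print(classify(180, labelled))
```

In the clustering step the groups come out of the data, while in
the classification step the groups are supplied up front and only
the new client's membership is decided.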