CFA LEVEL I (Vol.1) R2
Data - answer a collection of numbers, characters, words, and text, as well as
images, audio, and video, in a raw or organized format to represent facts or
information. To choose the appropriate statistical methods for summarizing and
analyzing data and to select suitable charts for visualizing data, we need to distinguish
among different data types. We will discuss data types from three classification
perspectives: numerical vs. categorical data; cross-sectional vs. time-series vs.
panel data; and structured vs. unstructured data.
Continuous data - answer are data that can be measured and can take on any
numerical value in a specified range of values. For example, the future value of a
lump-sum investment measures the amount of money to be received after a certain
period of time bearing an interest rate. The future value could take on a range of
values depending on the time period and interest rate. Another common example of
continuous data is the price return of a stock, which measures price change over a
given period in percentage terms.
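As a minimal sketch of the price return mentioned above, the snippet below computes the percentage price change between two prices. The function name `price_return` and the two prices are hypothetical and chosen only for illustration.

```python
def price_return(start_price: float, end_price: float) -> float:
    """Price change over the period, expressed as a fraction of the starting price."""
    return (end_price - start_price) / start_price

# Hypothetical prices: the result is continuous because it can take on
# any numerical value within a range, not just counted values.
r = price_return(50.0, 51.0)
print(f"{r:.2%}")
```

Because prices can move by arbitrarily small amounts, the return is not restricted to a fixed set of values, which is what makes it continuous rather than discrete.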
Numerical Data - answer Numerical data are values that represent measured or
counted quantities as a number and are also called quantitative data. Numerical
(quantitative) data can be split into two types: continuous data and discrete data.
Discrete data - answer are numerical values that result from a counting process. So,
practically speaking, the data are limited to a finite number of values. For example, the
frequency of discrete compounding, m, counts the number of times that interest is
accrued and paid out in a given year. The frequency could be monthly (m = 12),
quarterly (m = 4), semi-yearly (m = 2), or yearly (m = 1).
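To illustrate how the compounding frequency m enters a calculation, here is a minimal sketch using the standard future value relation FV = PV(1 + r/m)^(mN), which is not stated in the card above. The function name `future_value` and the inputs (a present value of 100, a 6% stated annual rate, and one year) are assumptions made for the example.

```python
def future_value(pv: float, annual_rate: float, m: int, years: float) -> float:
    """Future value of a lump sum with interest compounded m times per year."""
    return pv * (1 + annual_rate / m) ** (m * years)

# m is discrete: it comes from counting compounding periods per year.
FREQUENCIES = {"yearly": 1, "semi-yearly": 2, "quarterly": 4, "monthly": 12}

for label, m in FREQUENCIES.items():
    # The resulting future value is continuous: it can fall anywhere in a range.
    fv = future_value(100.0, 0.06, m, 1)
    print(f"{label:12s} m={m:2d} FV={fv:.4f}")
```

The input m is a counted quantity, while the output is a measured one, so the same calculation shows both kinds of numerical data side by side.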
Categorical data (also called qualitative data) - answer are values that describe a quality
or characteristic of a group of observations and therefore can be used as labels to
divide a dataset into groups to summarize and visualize. Usually they can take only a
limited number of values that are mutually exclusive. Examples of categorical data for
classifying companies include bankrupt vs. not bankrupt and dividends increased vs. no
dividend action.
Nominal data - answer are categorical values that are not amenable to being organized
in a logical order. An example of nominal data is the classification of publicly listed
stocks into 11 sectors, as shown in Exhibit 1, that are defined by the Global Industry
Classification Standard (GICS). GICS, developed by Morgan Stanley Capital
International (MSCI) and Standard & Poor's (S&P), is a four-tiered, hierarchical
industry classification system consisting of 11 sectors, 24 industry groups, 69
industries, and 158 sub-industries. Each sector is defined by a unique text label, as
shown in the column named "Sector."
Text labels - answer are a common format to represent nominal data, but nominal data can
also be coded with numerical labels. As shown below, the column named "Code"
contains a corresponding GICS code of each sector as a numerical value. However, the
nominal data in numerical format do not indicate ranking, and any arithmetic operations
on nominal data are not meaningful. In this example, the energy sector with the code 10
does not represent a lower or higher rank than the real estate sector with the code 60.
Often, financial models, such as regression models, require input data to be numerical;
so, nominal data in the input dataset must be coded numerically before applying an
algorithm (that is, a process for problem solving) for performing the analysis. This would
be mainly to identify the category (here, sector) in the model.
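One common way to code nominal data numerically for a model is dummy (one-hot) coding, where each category gets its own 0/1 indicator. The card above does not name a specific coding scheme, so treat the sketch below as one possible approach. The helper `one_hot` and the list of stocks are hypothetical; the Energy (10) and Real Estate (60) codes come from the text, and Financials (40) is added for illustration.

```python
# GICS sector codes used as labels only; their numeric size carries no rank.
SECTOR_CODES = {"Energy": 10, "Financials": 40, "Real Estate": 60}

def one_hot(sector, categories):
    """Return a 0/1 indicator list marking which category the sector belongs to."""
    return [1 if sector == category else 0 for category in categories]

# Fix the column order once so every row is encoded consistently.
categories = sorted(SECTOR_CODES, key=SECTOR_CODES.get)

# Hypothetical sector membership for four stocks.
stocks = ["Real Estate", "Energy", "Financials", "Energy"]
encoded = [one_hot(sector, categories) for sector in stocks]

for sector, row in zip(stocks, encoded):
    print(f"{sector:12s} code={SECTOR_CODES[sector]} indicators={row}")
```

Indicator columns only identify the category, which avoids implying that a sector with code 60 is in any sense "six times" a sector with code 10.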
Ordinal data - answer are categorical values that can be logically ordered or ranked. For
example, the Morningstar and Standard & Poor's star ratings for investment funds are
ordinal data in which one star represents a group of funds judged to have had relatively
the worst performance, with two, three, four, and five stars representing groups with
increasingly better performance or quality as evaluated by those firms.
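The ordering property can be sketched with a small example: ranking and the median rely only on order, so they remain meaningful for ordinal data. The fund names and star ratings below are hypothetical.

```python
from statistics import median

# Hypothetical star ratings (1 = relatively worst, 5 = relatively best).
fund_stars = {"Fund A": 3, "Fund B": 5, "Fund C": 1, "Fund D": 4, "Fund E": 2}

# Sorting uses only the order of the ratings, which ordinal data supports.
ranked = sorted(fund_stars, key=fund_stars.get, reverse=True)

# The median is also order-based, so it is a sensible summary here.
median_stars = median(fund_stars.values())

print(ranked)
print(median_stars)
```

The gap between one and two stars need not equal the gap between four and five stars, so summaries that depend on distances between ratings, such as the mean, should be read with caution.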
A variable - answer is a characteristic or quantity that can be measured, counted, or
categorized and is subject to change. A variable can also be called a field, an attribute,
or a feature. For example, stock price, market capitalization, dividend and dividend
yield, earnings per share (EPS), and price-to-earnings ratio (P/E) are basic data
variables for the financial analysis of a public company. An observation is the value of a
specific variable collected at a point in time or over a specified period of time. For
example, last year DEF, Inc. recorded EPS of $7.50. This value represented a 15%
annual increase.
Structured data - answer are highly organized in a pre-defined manner, usually with
repeating patterns. The typical forms of structured data are one-dimensional arrays,
such as a time series of a single variable, or two-dimensional data tables, where each
column represents a variable or an observation unit and each row contains a set of
values for the same columns. Structured data are relatively easy to enter, store, query,
and analyze without much manual processing. Typical examples of structured
company financial data are:
• Market data - answer data issued by stock exchanges, such as intra-day and daily
closing stock prices and trading volumes.
• Fundamental data - answer data contained in financial statements, such as earnings
per share, price-to-earnings ratio, dividend yield, and return on equity.
• Analytical data - answer data derived from analytics, such as cash flow
projections or forecasted earnings growth.
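A two-dimensional data table of the kind described above can be sketched as a list of rows that all share the same columns. The tickers, dates, prices, and volumes below are made up for illustration, and the helper `column` is a hypothetical name.

```python
# Each row holds values for the same set of columns (variables).
rows = [
    {"ticker": "ABC", "date": "2024-01-02", "close": 50.0, "volume": 1000},
    {"ticker": "ABC", "date": "2024-01-03", "close": 51.0, "volume": 1200},
    {"ticker": "DEF", "date": "2024-01-02", "close": 20.0, "volume": 800},
]

def column(table, name):
    """Extract one variable (column) from the table as a one-dimensional list."""
    return [row[name] for row in table]

# Because the layout is pre-defined, queries need no manual processing.
abc_closes = [row["close"] for row in rows if row["ticker"] == "ABC"]
total_volume = sum(column(rows, "volume"))

print(abc_closes)
print(total_volume)
```

Filtering one ticker's closing prices yields a one-dimensional array (a time series of a single variable), while the full table is the two-dimensional form.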