Chapter 5: Business intelligence & data analytics
5.1: Introduction:
Digitalization creates massive data sets:
• New, better and cheaper products & services
• Deeper insights into customer behavior → traditional analytics and engineering insufficient
The idea is to complement human expert-based insights with machine learning models
→ Both should learn to co-exist: not a man vs. machine fight!
The data scientist plays a key role in business intelligence and analytics:
• Quantitative skills: analytics, statistics, AI, …
• Business skills: finance, marketing, HR, …
• Programming skills: SQL, Python, SAS, …
• Creativity is key
Analytics is complex because of the sheer myriad of data and data sources available:
• Master data relates to the core entities a company is working with: customers, suppliers, ….
➢ Very static and uniformly defined across the various business units
• Transactional data: timing, quantity and items involved in a transaction (internal)
RFM-variables:
➢ Recency: how long ago was a transaction made?
➢ Frequency: how often is a transaction made?
➢ Monetary: what is the monetary value of a transaction?
• External data = data gathered from outside the company
Ex. social media data, weather data, search data, competitor data, …..
• Open data = data that anyone can access, use and share without copyright
Ex. government data, scientific data, …
• Big data results from the digitalized world → small data should not be forgotten! It is characterized by 5 V's:
➢ Volume = the amount of data (“data at rest”)
➢ Velocity = the speed at which data comes in and goes out (“data in motion”)
➢ Variety = data in its many forms
➢ Veracity = data in doubt
➢ Value = the data collected should have some kind of value
• Structured data: ex. number, name of a customer, ….
• Unstructured data: ex. product reviews, tweets, … → ± 80% of firm data
• Semi-structured data: ex. resumes, web pages, …
• Metadata = data that describes other data & data definitions
➢ Stored in the catalog of the database management system
➢ For example used in fraud detection: date of a picture, …
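The RFM variables listed under transactional data can be derived from a plain transaction log. The following is a minimal sketch, not a prescribed method: the `transactions` list, the `rfm` function and the reference date are hypothetical, and "monetary" is assumed here to mean the average value per transaction (a total would be an equally reasonable choice).

```python
from datetime import date

# Hypothetical transaction log: (customer_id, transaction_date, amount)
transactions = [
    ("C1", date(2024, 1, 5), 40.0),
    ("C1", date(2024, 3, 1), 60.0),
    ("C2", date(2023, 11, 20), 200.0),
]


def rfm(transactions, today):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    recency   = days since the customer's most recent transaction
    frequency = number of transactions
    monetary  = average amount per transaction (an assumption)
    """
    summary = {}
    for customer, when, amount in transactions:
        last, count, total = summary.get(customer, (None, 0, 0.0))
        if last is None or when > last:
            last = when
        summary[customer] = (last, count + 1, total + amount)
    return {
        customer: ((today - last).days, count, total / count)
        for customer, (last, count, total) in summary.items()
    }


scores = rfm(transactions, today=date(2024, 3, 31))
```

In this sketch a low recency combined with a high frequency and monetary value would point to an active, valuable customer.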
Credit risk modeling (risk management):
Banks have gathered plenty of information about default behavior: gender, income, date of birth, …
Banks have also accumulated lots of business experience about their credit products
→ Credit scoring analyzes both sources and decides which credit applications to accept or reject (Basel & IFRS 9)
Key assumption: the future resembles the past!
A cutoff represents the minimum required credit quality by the bank → otherwise rejection!
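The cutoff rule can be sketched in a few lines. This is an illustration under assumptions: the scores, the cutoff of 600 and the `decide` function are hypothetical, a higher score is assumed to mean better credit quality, and an applicant scoring exactly at the cutoff is assumed to be accepted.

```python
def decide(score, cutoff):
    """Accept when the score reaches the minimum required quality, reject otherwise."""
    return "accept" if score >= cutoff else "reject"


# Hypothetical credit scores per applicant and a hypothetical cutoff
applicants = {"A": 640, "B": 580, "C": 600}
CUTOFF = 600

decisions = {name: decide(score, CUTOFF) for name, score in applicants.items()}
```

Raising the cutoff makes the bank stricter (fewer acceptances), lowering it makes the bank more lenient.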
Fraud analytics: “A typical organization loses 5% of its revenues to fraud each year”
Fraud = an uncommon, well-considered, imperceptibly concealed, time-evolving and often carefully
organized crime which appears in many types and forms → challenges of a fraud detection system
Analytical fraud models are often represented as IF-THEN rules: if a rule fires for a particular claim, a
fraud investigation process is started to verify whether the claim is fraudulent or not!
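A rule-based fraud model of this kind can be sketched as a list of IF-THEN conditions that are checked for every claim. The rules, thresholds and field names below are hypothetical and only serve as an illustration; real rules would come from the analytical model and from fraud experts.

```python
# Each rule is a (description, condition) pair; the condition is the IF part.
# All rules, thresholds and field names are hypothetical.
RULES = [
    ("large claim shortly after policy start",
     lambda claim: claim["amount"] > 10_000 and claim["days_since_policy_start"] < 30),
    ("many earlier claims",
     lambda claim: claim["prior_claims"] >= 3),
]


def fired_rules(claim):
    """Return the descriptions of all rules that fire for this claim."""
    return [name for name, condition in RULES if condition(claim)]


def needs_investigation(claim):
    """THEN part: start an investigation as soon as at least one rule fires."""
    return bool(fired_rules(claim))


claim_a = {"amount": 15_000, "days_since_policy_start": 10, "prior_claims": 0}
claim_b = {"amount": 500, "days_since_policy_start": 400, "prior_claims": 1}
```

Here `claim_a` triggers the first rule and would be sent to investigation, while `claim_b` triggers none. A fired rule only starts the investigation; it does not prove that the claim is fraudulent.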
Marketing analytics:
There are different kinds of marketing analytics:
• X-selling: to change the intended purchase behavior of a customer
➢ Up-selling = to sell more of a given, more expensive product
➢ Cross-selling = to sell an additional product or service
➢ Down-selling = to sell a lesser product or service → long customer relationship
• RFM analysis = a well-known and developed measurement framework used in marketing
across different industries (< Cullinan, 1977)
➢ Focusses on existing customers
➢ 20% of your customers are likely to generate 80% of your profit
➢ Set of metrics to monitor a customer’s behavior: recency, frequency & monetary
• Churn prediction: predicting which customers will leave you or decrease their product/service usage
➢ The goal is to obtain long term loyal customers
➢ Losing customers leads to opportunity costs because of reduced sales
➢ Attracting new customers is 5 to 6 times more expensive than customer retention
➢ Transaction buyers → buy because of a low price → more likely to churn
➢ Relationship buyers → interested in building a loyal relationship with the firm
➢ Subscription setting: churn = customer cancels contract
➢ Non-subscription setting: churn needs to be defined explicitly by the firm
➢ There are different types of churn:
1. Active churn = the customer stops the relationship with the firm
2. Passive churn = intensity of the relationship decreases
3. Forced churn = the company stops the relationship with the customer
4. Expected churn = the customer no longer needs the products/services
• Response modeling: will customers respond to a marketing campaign or not?
➢ Focus on customer acquisition OR on deepening customer relationships
➢ Targeted marketing
➢ Off-line campaigns: catalogues, radio ads, flyers, …
➢ On-line campaigns: banners, e-mail advertising, search engine marketing, ….
➢ Implicit response: read e-mail, click on link, …
➢ Explicit response: purchase (and good review and word-of-mouth)
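In the non-subscription setting mentioned under churn prediction, the firm has to define churn explicitly. One possible definition, sketched below, labels a customer as churned when no purchase was made within a fixed inactivity window. The function name `is_churned` and the 180-day window are assumptions chosen for illustration, not a rule taken from these notes.

```python
from datetime import date


def is_churned(last_purchase, today, inactivity_days=180):
    """Label a customer as churned after more than `inactivity_days` without a purchase.

    The 180-day default is a hypothetical business choice.
    """
    return (today - last_purchase).days > inactivity_days


# A customer whose last purchase was a year ago versus one a month ago
long_inactive = is_churned(date(2024, 1, 1), today=date(2024, 12, 31))
recent_buyer = is_churned(date(2024, 6, 1), today=date(2024, 7, 1))
```

The length of the window is a business decision: a supermarket would likely choose a much shorter window than a car dealer.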