Measuring Patient Preferences Using DCE – Health Economics, Policy & Law (GW4580M)
Merel Hoogstad
Lecture 1 Theoretical foundation, course overview and survey development
DCE applications in health and health care
The measurement of patient preferences is a young field, but it is growing in volume and recognition. Key applications in health:
1. Improve patient-centered care (e.g., shared decision making).
2. Provide direction for development of future care.
3. Extend traditional health technology assessments.
4. Estimate utility weights within the QALY framework.
Example: Multiple Sclerosis (MS)
Three treatment options for Multiple Sclerosis: (1) Pills, (2) Injections, (3) Infusions. Each has its own unique characteristics, side effects and
risk/benefit ratio.
Patient preferences are very diverse: no injections, maximum
effectiveness, minimal impact on daily life.
Physicians prefer the medically optimal treatment, with much less
emphasis on daily impact.
When multiple options are possible, shared decision making is
important.
Some decisions are too complex and require decision support tools
So-called ‘decision support tools’ convey (average or personalized) risk/benefit information. But patient preference measurement itself is
often left to unstructured and inefficient pre-surgical consultations. Explicit patient preference measurement has an important role:
Provides physician with preference information.
Forces patients to carefully examine their preferences.
Example: Elective surgery
There are many surgical and non-surgical options:
Non-surgical options: radiation therapy, chemotherapy, hormonal therapy and targeted therapy.
Surgical options: lumpectomy and mastectomy.
Example: Estimating EQ-5D QALY Tariffs
The traditional preference elicitation method is time trade-off (TTO), but it has many
disadvantages. The ‘next generation’ preference elicitation method is based on
discrete choice experiments (DCEs). DCEs are less complex and make it possible to include many
more elicitation tasks, which accommodates more realistic statistical models (e.g.,
without constant proportional time preferences). DCEs also have a solid theoretical
foundation.
EQ-5D Attributes and levels
Health is divided into 5 domains: mobility, self-care, usual activities, pain & discomfort,
anxiety & depression. Each domain has five levels, ranging from: no health problems
(1) to extreme health problems (5). Best possible health state = 1-1-1-1-1 and worst
possible health state = 5-5-5-5-5. In total there are 3125 health states.
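The count of 3125 follows from 5 domains with 5 levels each (5^5). A minimal sketch that enumerates the states in the usual five-digit notation; the variable names are illustrative only:

```python
from itertools import product

# The five EQ-5D domains, each scored from 1 (no problems) to 5 (extreme problems).
DOMAINS = ["mobility", "self-care", "usual activities",
           "pain & discomfort", "anxiety & depression"]
LEVELS = range(1, 6)

# Every combination of one level per domain, written as e.g. "1-1-1-1-1".
health_states = ["-".join(str(level) for level in state)
                 for state in product(LEVELS, repeat=len(DOMAINS))]

print(len(health_states))                    # 3125
print(health_states[0], health_states[-1])   # 1-1-1-1-1 5-5-5-5-5
```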
TTO explained
Principle underpinning TTO: the worse the health state, the more time in full health (1-1-1-1-1)
someone would be willing to sacrifice in order to avoid it. Respondents are required to converge
towards their point of indifference (x).
Example: suppose the respondent is indifferent between t = 10 years in the impaired health state
and x = 9 years in perfect health. The value of the health state is then x/t = 9/10 = 0.9.
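The calculation in this example can be written as a small helper. This is only a sketch of the basic x/t ratio for states considered better than death; the function name and the input checks are assumptions added for illustration:

```python
def tto_value(years_full_health: float, years_impaired: float) -> float:
    """Value of a health state from a TTO indifference point.

    years_full_health: x, the years in full health judged equivalent
    years_impaired:    t, the years offered in the impaired state
    """
    if years_impaired <= 0:
        raise ValueError("t must be positive")
    if not 0 <= years_full_health <= years_impaired:
        raise ValueError("x must lie between 0 and t")
    return years_full_health / years_impaired

print(tto_value(9, 10))  # 0.9
```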
Obtaining the QALY tariff
Either let respondents evaluate all 3125 health states or evaluate a subset and use a regression analysis (e.g., OLS or Tobit regression) to
interpolate preferences. The regression results are the EQ-5D QALY tariff.
Problems with TTO
1. Complex: TTO cannot be conducted in unattended surveys.
2. Interviewers can influence results: requires training and quality control system.
3. Few choice tasks per respondent are feasible: low statistical power, and statistical models impose linear time preferences (unrealistic).
4. Health states worse than death cannot be valued with the standard TTO: this requires a more difficult TTO variant based on additional
assumptions, for example one with 10 years of lead time (which places some evaluations 10 years into the future).
Origin of DCEs: Thurstone scaling
DCEs in their current form originated in the 1970s, but their roots go back to Thurstone (1927), who was the first to relate observed choice probabilities to the utility of the options in paired comparisons.
Assumptions:
1. Choice probabilities reflect distances between options on a latent scale.
2. Preferences are distributed (normally) around the modal preference.
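Under these two assumptions, the share of respondents choosing A over B can be converted into a distance on the latent scale with the inverse normal distribution function. The sketch below assumes the simplest version of the model (equal variances, with the distance expressed in standard deviations of the difference between the two options); the function name and the example shares are hypothetical:

```python
from statistics import NormalDist

def thurstone_distance(share_a_over_b: float) -> float:
    """Latent-scale distance between A and B implied by a choice share.

    Assumes normally distributed preferences with equal variances; the
    result is in units of the standard deviation of the difference.
    """
    if not 0 < share_a_over_b < 1:
        raise ValueError("share must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(share_a_over_b)

# Hypothetical shares: an even split implies no distance between the options,
# and the distance grows as the split becomes more one-sided.
for share in (0.5, 0.6, 0.8, 0.95):
    print(share, round(thurstone_distance(share), 2))
```

Shares of exactly 0 or 1 carry no distance information in this model, which is why the sketch rejects them.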
Example: Which beer do you prefer?
Difficult to choose your favorite, even more difficult to make a ranking.
Thurstone used pairwise comparisons to derive scale values for any arbitrary set of objects. Valuing 7, 10 or
20 objects requires respectively 21, 45, 190 comparisons.
What makes you choose one beer over another? You have to identify attributes, like: price, alcohol
percentage, kind of beer.
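The comparison counts quoted for 7, 10 and 20 objects follow from the number of distinct pairs, n(n-1)/2. A quick check (the function name is illustrative):

```python
from math import comb

def n_paired_comparisons(n_objects: int) -> int:
    """Number of distinct pairs among n objects: n * (n - 1) / 2."""
    return comb(n_objects, 2)

for n in (7, 10, 20):
    print(n, n_paired_comparisons(n))
# 7 21
# 10 45
# 20 190
```

The count grows quadratically with the number of objects, which is why full paired comparison quickly becomes impractical.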
Some terminology
Different surveys
1. Thurstone scaling = 15 questions to value 6 health states.
2. DCE = 10 questions to value 243 health states.
3. BWS (best-worst scaling).
Questions:
Which was the easiest survey to complete?
Which was the paired comparison for Thurstone scaling, which the BWS and which the DCE?
How helpful was the color coding & level overlap?
Could Thurstone scaling use the same visual presentation as the DCE?
Best-worst scaling (types I, II and III)
Type I (object-case)
Type II (profile-case)
Type III (multi-profile case)
Type III looks like a DCE, but besides asking for the
best option it also asks for the worst option.
Advantages of BWS (type I and II):
1. Far lower task complexity for respondents: no need to iterate towards a point of indifference and it can be conducted in
unattended online surveys.
2. Doesn’t rely on interviewers, who can bias results.
3. Many choice tasks per respondent are feasible.
Disadvantages of BWS (type I and II):
1. BWS doesn’t comprise trade-offs. Hence, it’s essentially a ranking rather than a valuation method.
2. BWS doesn’t have a solid theoretical foundation (unlike TTO and DCE).
Disadvantages of DCEs
1. Even the crudest design and simplest statistical model will produce numbers, but to obtain valid and reliable results,
considerable expertise is needed.
2. Attribute and level selection is crucial but no strict guidelines exist. This requires elaborate qualitative research, using patient
interviews, focus groups, think aloud sessions, etc.
3. It’s rarely possible to include all possible attributes and levels, meaning that results are valid only if the omitted attributes are
non-essential and the most relevant aspects of the decision-making process are included.
4. Discrete choice models can be very complex, and more complex models require sophisticated experimental designs.
In summary, DCEs are difficult to do well.
Lancaster’s (1966) Theory of Value
Lancaster’s theory of value: ‘Subjective evaluations of a good (value/utility) are derived from the good’s characteristics.’
Theoretical foundation for deconstructing objects into a number of attributes and levels.
Strategy: establish weights for each attribute and model the value of each object as the sum of its parts.
McFadden’s (1974) Random Utility Theory (RUT)
McFadden combined Thurstone’s work with Lancaster’s theory of value, which postulates that the utility of a good is derived from its
characteristics. Building on statistical theory, he developed the Thurstone model into the tractable econometric Random Utility Model (RUM),
which relates choices to the utility of the attributes and of the available alternatives. This approach is now known as choice modeling.
McFadden’s Random Utility Theory states that utility is a latent construct that can be partitioned into a systematic (V) and a random,
unexplainable component (ε): Utility = V + ε with people choosing the option that provides the highest utility.
The utility function and preferences
If you fill in the systematic part, you obtain something like Utility = β1*Bread + β2*Meat + β3*Cheese + β4*Veggies + ε. By making
assumptions about the unexplainable component (ε), it becomes possible to statistically estimate preferences from the observed choices
between different objects.
--> This is the topic of week 3.
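To illustrate how an assumption about ε turns utilities into choice probabilities, the sketch below uses one common assumption: independent, identically distributed type I extreme value (Gumbel) errors, which yields the multinomial logit formula P(i) = exp(V_i) / Σ_j exp(V_j). The β values and the two sandwiches are made up for illustration and are not estimates from any data; the notes do not say which error assumption the course adopts.

```python
import math

# Hypothetical attribute weights (the betas in the utility function above).
BETAS = {"bread": 0.5, "meat": 1.0, "cheese": 0.8, "veggies": 0.3}

def systematic_utility(option: dict) -> float:
    """V = sum of beta * attribute level; missing attributes count as 0."""
    return sum(beta * option.get(name, 0) for name, beta in BETAS.items())

def logit_probabilities(utilities: list) -> list:
    """Choice probabilities under i.i.d. Gumbel errors (multinomial logit)."""
    highest = max(utilities)                      # subtract for numerical stability
    weights = [math.exp(v - highest) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Two hypothetical sandwiches that differ only in meat versus cheese.
sandwich_a = {"bread": 1, "meat": 1, "cheese": 0, "veggies": 1}
sandwich_b = {"bread": 1, "meat": 0, "cheese": 1, "veggies": 1}

utilities = [systematic_utility(sandwich_a), systematic_utility(sandwich_b)]
probabilities = logit_probabilities(utilities)
print(utilities, probabilities)
```

Because of ε, the option with the higher systematic utility is chosen more often, but not always. Estimation works in the opposite direction: the β values are chosen so that the predicted probabilities match the observed choices as closely as possible.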
Does RUT impose rational choices?
Respondents are assumed to make fully rational choices in the systematic (V) utility component. This implies:
1. Completeness: all possible choice options can be compared and ranked.
2. Non-satiation or Monotonicity: more of a good (attribute) is always better, ceteris paribus.
3. Transitivity: if a>b and b>c, then a>c.
4. Context-independence: preferences remain stable across different contexts.
However, how important is V compared to ε?
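As an illustration of the transitivity axiom listed above, a set of strict pairwise preferences can be checked mechanically. This is a minimal sketch; the function name and the example preference sets are hypothetical:

```python
from itertools import permutations

def is_transitive(prefers: set) -> bool:
    """Check that a > b and b > c always imply a > c.

    prefers: set of (better, worse) tuples describing strict preferences.
    """
    items = {item for pair in prefers for item in pair}
    return all(
        (a, c) in prefers
        for a, b, c in permutations(items, 3)
        if (a, b) in prefers and (b, c) in prefers
    )

consistent = {("a", "b"), ("b", "c"), ("a", "c")}
cyclical = {("a", "b"), ("b", "c"), ("c", "a")}
print(is_transitive(consistent))  # True
print(is_transitive(cyclical))    # False
```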