USABILITY & UX EVALUATION
Summary
Thank you for buying this summary. It combines my lecture notes, the PowerPoint slides, and the articles that provide extra information on top of the lectures. This means that not every article is summarized, only those that add information beyond what was presented during the lectures; this avoids a lot of repetition. Moreover, I broke up each lecture so that the entire summary follows a logical order: you first read about certain methods discussed in lecture 1, then the article about those methods, and after that the rest of the lecture. So every article that corresponds with something said in a lecture is placed right after that lecture's material. Lectures are indicated in purple and the articles in orange.
Good luck! If you have any questions or remarks, let me know.
CONTENT
Week 1
    Introduction to evaluation
    Intermediate level knowledge
Week 2
    Quantitative measures: performance measures
    Logging data and analytics
Week 3
    Qualitative Methods
    Expert reviews
Week 6
    Self-report Methods
    Continuation self-report methods
Week 7
    Implicit Behavioral & Emotional Measurements
    Continuation Implicit Behavioral & Emotional Measurements
All methods in summary
    Quantitative methods
    Qualitative methods
    Both quantitative and qualitative (possible)
Summary © Marit Kamp
WEEK 1
Introduction to evaluation
This course is about usability and UX evaluation. During the first lecture it is explained what usability
and UX are and how these can be measured.
Usability, UX and usefulness
Usability, UX and usefulness are all performance measures: each of the three can be used to measure the performance of a system. Nevertheless, they each measure a different aspect of that performance.
Usability is a performance measure concerning the ease of use and the absence of frustration when someone uses a product. Usability consists of:
- Effectiveness. The accuracy and completeness with which users achieve specified goals.
- Efficiency. The resources expended in relation to the accuracy and completeness with which
users achieve goals.
- Satisfaction. The comfort and acceptability of use.
With specific users, for specific goals, in a specific context of use.
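The three usability components above can be turned into simple numbers. The sketch below is illustrative and not from the lectures: the session data, field names, and the choice of time as the "resource" for efficiency are all assumptions.

```python
# Illustrative sketch (not from the lectures): computing the three usability
# components from hypothetical task-session data.
from dataclasses import dataclass


@dataclass
class Session:
    completed: bool      # did the user achieve the specified goal?
    time_seconds: float  # resources expended (here: time)
    satisfaction: int    # e.g. a 1-7 comfort/acceptability rating


sessions = [
    Session(True, 42.0, 6),
    Session(True, 55.5, 5),
    Session(False, 90.0, 2),
]

# Effectiveness: share of users who achieved the specified goal.
effectiveness = sum(s.completed for s in sessions) / len(sessions)

# Efficiency: goals achieved per unit of resource expended (here: per minute).
total_minutes = sum(s.time_seconds for s in sessions) / 60
efficiency = sum(s.completed for s in sessions) / total_minutes

# Satisfaction: mean comfort/acceptability rating.
satisfaction = sum(s.satisfaction for s in sessions) / len(sessions)

print(f"effectiveness={effectiveness:.2f}, "
      f"efficiency={efficiency:.2f} goals/min, satisfaction={satisfaction:.2f}")
```

Note that effectiveness and efficiency come from observed behavior, while satisfaction is self-reported, which is why a usability study usually combines performance logging with a questionnaire.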
User experience (UX) is about a person’s perceptions and responses resulting from the use and/or
anticipated use of a product, system or service.
Even though usability and UX are different, they are connected to each other. The way they are
connected is still debated. There are two ideas about this connection.
1. Usability is part of UX.
2. UX is an extension of the satisfaction part of usability.
Usefulness is about the extent to which the right thing has been designed, for the right purposes.
Measuring usability, UX and usefulness
All three are measured at different times.
- Usability: can only be measured later in the process.
- UX: must be measured before, during and after the design process (you should probably
memorize the figure below, because it appears everywhere!).
- Usefulness: should be measured as early as possible but must be kept track of during the
process.
You can measure all three with concepts, prototypes, and products.
Usability, UX and usefulness summarized
To make it easy, I summarized the above points in a table.
Usability
    About: the ease of use and absence of frustration while using a product.
    Evaluate: later in the design process.
UX
    About: perceptions of and responses to a product.
    Evaluate: before, during and after the design process.
Usefulness
    About: the extent to which a product is designed for the right purposes.
    Evaluate: as early as possible in the design process, but keep track during the process as well.
What and why we measure
Evaluation has a prominent role in the User-Centered design process. It gives us information about the performance of the system, an idea of the users' experiences, and the usability issues that might occur. It focuses on the behavior of the users and the system (the what) and the attitudes towards that behavior and the system (the why).
We measure/evaluate to…
- …inform design decisions;
- …identify and fix design problems;
- …reduce design costs;
- …create a sense of involvement;
- …generate scientific/intermediate knowledge in order to generalize to new products.
Setting up a study
Participants
When setting up an evaluation study you have to make sure you have the right number of
participants. You can gather your participants in several ways.
- Convenience sampling. This is recruiting people that you can easily get hold of. As a result, the
sample may not represent the population being sampled.
- Purposive sampling. This is recruiting people with one or more special characteristics in
common.
The appropriate sample size depends on the type of evaluation.
Formative study
    Timing: early. Subject: prototypes. Goal: high-level issues/questions. Output: qualitative.
    Moderation: interactive. Sample size: 10-20.
Summative study
    Timing: early/midway. Subject: implementation. Goal: evaluate usability/UX. Output: qualitative
    and quantitative. Moderation: less interactive. Sample size: 5-7.
Verification study
    Timing: late. Subject: product as a whole. Goal: comparison to benchmark (+reasons). Output:
    quantitative. Moderation: little to no interaction. Sample size: (not mentioned).
Comparison study
    Timing: anytime. Subject: anything. Goal: compare performance/preference. Output: varies.
    Moderation: varies. Sample size: +/- 30 per group.
Ethics
There are also a few ethical guidelines you must keep in mind.
1. Research must be beneficial and cause no harm.
2. Participants should never feel uncomfortable, physically or psychologically.
3. Assess potential risks, real and perceived, and mitigate these.
4. Stress that you are evaluating the product and not the participant.
5. Acknowledge the participant’s expertise.
6. Participants have the right to be informed. Do this by providing an informed consent form,
which should include the purpose of the study, the procedure, the risks and the participant's rights.
7. Participants should have the right to withdraw without penalty and without giving a reason.
8. Always obtain permission for recording audio or image before you start recording.
9. Keep participation confidential.
Reading: MacDonald & Atwood
Evaluation has been a dominant theme in HCI for decades. It is present in nearly every design model.
Design and evaluation are closely related activities that support and inform each other. The aim of
this paper is to start a dialogue about how to address the challenges facing evaluation in modern
contexts. To do this, MacDonald and Atwood describe the past, present and future of evaluation in HCI.
The history of evaluation
Kaye and Sengers identified five phases of evaluation.
1) System Reliability Phase (1940s-1950s)
The major concerns in this phase were minimizing the system fault time and quickly repairing
errors. Therefore, evaluation efforts tended to focus on system reliability, mostly in terms of
how long a system would function without failure.
2) System Performance Phase (1950s-1960s)
Computers became more stable and reliable. Therefore, people could focus on other things.
A common form of evaluation was called “acceptance tests” with which evaluators would
test how long it would take the system to process large amounts of data with minimal
downtime and errors. Thus, evaluation efforts tended to focus on issues related to system
performance, mostly in terms of processing speed.
3) User Performance Phase (1960s-1970s)
A focus on users became more prevalent because time-sharing systems grew in popularity
and computers were increasingly used for non-programming tasks as well. This brought a
new type of evaluator to the field: the experimental psychologist. The focus of evaluation
lay on user performance.
4) Usability Phase (1970s - now)
By the time computer systems had become usable for everyone, the focus shifted from user
performance to usability. The usability phase focused on five metrics: time to complete tasks,
error rate, accuracy, task completion rate and satisfaction.
5) User Experience Phase (2000s - now)
People use computers in very different contexts than in the fourth phase. Therefore,
evaluation methods need to adapt to this. A major challenge is the lack of a shared
conceptual framework for UX.
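The five usability-phase metrics listed under phase 4 can each be computed from basic trial records. The sketch below is an assumption-laden illustration: the trial data and the exact operationalizations (e.g. accuracy as the share of correct steps) are mine, not from the paper.

```python
# Hedged sketch: the five usability-phase metrics (time, error rate, accuracy,
# task completion rate, satisfaction) from made-up trial records. The field
# layout and definitions are illustrative, not from MacDonald & Atwood.

# (task time in s, errors made, correct steps, total steps, completed, satisfaction 1-5)
trials = [
    (35.0, 1, 9, 10, True, 4),
    (48.0, 3, 7, 10, False, 2),
    (30.5, 0, 10, 10, True, 5),
]

n = len(trials)
mean_time = sum(t[0] for t in trials) / n          # time to complete tasks
error_rate = sum(t[1] for t in trials) / n         # errors per task
accuracy = sum(t[2] / t[3] for t in trials) / n    # mean share of correct steps
completion_rate = sum(t[4] for t in trials) / n    # task completion rate
mean_satisfaction = sum(t[5] for t in trials) / n  # mean satisfaction rating

print(f"time={mean_time:.1f}s, errors={error_rate:.2f}, accuracy={accuracy:.2f}, "
      f"completion={completion_rate:.2f}, satisfaction={mean_satisfaction:.2f}")
```

The first four metrics are behavioral and can be logged automatically; satisfaction again requires self-report, which foreshadows the mixed-methods emphasis of the later weeks.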