Serious games are developed with the goal of having a certain impact on players that goes beyond
mere entertainment. This purpose-driven design is inherent to serious games and can be regarded as
the key characteristic that distinguishes serious games from other digital games. Hence, verifying
that a serious game has the intended effect on the players needs to be an essential part of the
development process.
Without evaluation of the game, there is no evidence that the purpose of the game is achieved.
Ideally, a serious game is evaluated with members from the target group in a comprehensive
evaluation process.
Serious games are often used in new contexts as a medium of intervention, and in many cases there
are few or no successful examples for orientation. Thus, as there are few proven strategies
or approaches to be guided by, the design and development of serious games is often based on
literature, theories and, to some extent, intuition and personal experience.
The overall goal of evaluation is to demonstrate the game’s effectiveness and suitability with respect
to its designated purpose and application context. Reliable results are meant to lend credence to
serious games, to convince diverse stakeholders and to inform future design approaches. Game
researchers can learn more about the relationships between game design elements and the resulting
player experience and thus gain insights into the impact of games in general. Additionally,
researchers as well as developers can gain experience from both successful and failed
game concepts and may thus improve in designing effective serious games. Users and intermediaries
also have to be convinced by the evaluation.
Recruitment of participants: Most evaluation processes follow the same general structure. The study
design is planned and set up, then participants are recruited and divided into different experimental
groups. Normally, there are at least two groups: the treatment group, which is provided with the
serious game, and the control group, which does not use the game.
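The division into groups can be sketched in a few lines. This is a minimal illustration that assumes random assignment with near-equal group sizes, which the text above does not prescribe; the function name `assign_groups`, the participant IDs and the seed are hypothetical.

```python
import random


def assign_groups(participant_ids, group_names, seed=None):
    """Randomly split participants into groups of near-equal size.

    Assumption: simple random assignment; a real study may need
    stratification (e.g. by age or prior experience) instead.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    # Deal the shuffled participants out to the groups in turn.
    for index, participant in enumerate(shuffled):
        groups[group_names[index % len(group_names)]].append(participant)
    return groups


# Hypothetical participant IDs P01 to P20.
participants = [f"P{n:02d}" for n in range(1, 21)]
groups = assign_groups(participants, ["treatment", "control"], seed=42)
```

Recording the seed alongside the study design allows the assignment to be reproduced and audited later.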
Operationalization: Starting from the overall purpose of a serious game and the underlying theories,
the concrete, measurable aspects that best reflect the game’s purpose have to be defined. This process
is called operationalization.
Choice of measurement methods: There are many different types of measurement instruments, such as
questionnaires, physiological measures and observations, which all have different advantages and
disadvantages regarding objectivity, validity and clarity. Hence, it is recommended to combine
several kinds of methods to gain comprehensive insights into the effects of the game.
Design of control group conditions: Most serious games are intended for contexts in
which other (non-game) interventions with similar purposes already exist. In those cases, the
evaluation of a serious game must not only verify the general intended effect of the
game, but also include a comparison with existing solutions in order to answer the
question of whether the game is better than established non-game alternatives.
It is advisable to use at least two control groups instead of one: one group that receives no training
at all, and one that receives training with content comparable to that of the game, but based on a
different method.
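The comparison against two control groups can be illustrated with a small calculation. The sketch below assumes a pre-test/post-test design and compares mean score gains; the group names and all scores are hypothetical placeholders, not results from any study, and a real evaluation would add an appropriate significance test.

```python
from statistics import mean


def mean_gain(pre_scores, post_scores):
    """Average per-participant improvement from pre-test to post-test."""
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


# Hypothetical placeholder scores, for illustration only.
scores = {
    "game": {"pre": [10, 12, 11, 9], "post": [16, 17, 15, 14]},
    "alternative_training": {"pre": [11, 10, 12, 9], "post": [14, 13, 16, 11]},
    "no_training": {"pre": [10, 11, 9, 12], "post": [11, 11, 10, 12]},
}

gains = {group: mean_gain(s["pre"], s["post"]) for group, s in scores.items()}

# Does the game have any effect at all?
effect_vs_no_training = gains["game"] - gains["no_training"]
# Is the game better than the established non-game alternative?
effect_vs_alternative = gains["game"] - gains["alternative_training"]
```

The two differences answer separate questions: the first checks for a general effect of the game, the second checks whether the game outperforms the non-game method with comparable content.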
Consideration of time-dependent effects: Another challenge related to the experimental design is
timing. While it is already demanding to conduct a sound study with one point of measurement,
most serious games would benefit from long-term assessment. Experiments in which participants
play the game only once or for a short period of time allow conclusions about short-term effects
of the game, but do not take into account any wear-out effects. Due to its novelty, the game may
attract more attention than established alternatives, but the resulting motivation to play it and to
engage with the content might decrease after a while, reducing the impact of the game. This aspect
should be included in particular in evaluation processes for games that are supposed to support the
long-term motivation of players.
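A wear-out effect only becomes visible with several points of measurement. The following sketch assumes that the same motivation rating is collected at three time points; the labels, the rating scale and all values are hypothetical placeholders chosen for illustration.

```python
from statistics import mean


def mean_by_time_point(measurements):
    """Map each time point label to the mean rating collected at that point."""
    return {label: mean(ratings) for label, ratings in measurements.items()}


# Hypothetical motivation ratings on a 1-7 scale, for illustration only.
motivation = {
    "week_1": [6, 7, 6, 5],
    "week_4": [5, 6, 5, 4],
    "week_8": [4, 4, 5, 3],
}

trend = mean_by_time_point(motivation)
# A positive value indicates that motivation declined over the study period.
wear_out = trend["week_1"] - trend["week_8"]
```

A study with a single measurement in week 1 would report only the highest value here and miss the decline entirely.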
Reach of effects: Besides considerations of time, it is also challenging to assess the reach of a
serious game’s effects in terms of the transfer to real-life contexts. Ideally, serious games evaluation
includes both direct effects that playing the game has on players and subsequent effects that
influence their future behavior in everyday life [9]. While this aspect is to some extent interwoven
with long-term effects, it does not describe effects over time, but rather defines the level on which
the effectiveness of a serious game is evaluated.
Processing of results: Finally, the evaluation of serious games poses the challenge of drawing
conclusions from the results and using them meaningfully to improve the game. Evaluation does not
end once the data has been collected, but should lead to a process of revision to make the
game more effective and appealing.
The framework of evaluation-driven design embeds the evaluation and design process into the
general structure of scientific practice and can be considered a “step towards a science of
game-based learning” and serious games. It offers guidance on when to evaluate, how to plan the
evaluation process, and which questions need to be answered beforehand. Other existing models
and frameworks fall into one of two categories: when to evaluate and what can be evaluated.
Models and frameworks of both categories are presented in the second part of this section in order
to give additional guidance for an evaluation process. Models and frameworks that are not specific
to the evaluation of serious games, but are valid in other areas and might be considered
for the evaluation of serious games, are not explicitly mentioned. For those general methods, a list of
references is given in the section for further reading at the end of this chapter.
Framework of Evaluation-Driven Design