Theory of a design goal:

Usability of Interactive Products

By Turkka Keinonen
  1. Usability as a design approach
  2. Usability as a product attribute
  3. Usability as a measurement
  4. Subjective usability measurements
  5. Usability and emotions
  6. Summary

Usability of products in general is discussed on a separate page.


The purpose of this chapter is to identify the multiple dimensions of usability. A comprehensive discussion concerning the meanings associated with usability is needed to construct a model of usability-related product evaluation and operational measures for it in the following chapters. The concept of usability is explicitly defined by a number of references in order to prepare the ground for the usability measurements. However, if the question of whether usability influences product preference is considered only with regard to these definitions the view might be unnecessarily restricted. Consequently, the following chapters aim at presenting an approach to usability which includes different related aspects of usability: experienced usability, apparent usability, user friendliness, quality of user interface, etc. Usability is implied, in addition to the definitions, by the conventions of the discipline, the tools used to assess it, the qualifications of the practitioners, etc.

To ensure usability, the product development processes have to include phases and objectives which support it. These perspectives related to the design process are discussed first. The ideas and objectives influencing the design process are transferred to the user as concrete product attributes. The second angle on usability discusses these properties by reference to usability guidelines. The third point of view on usability focuses on the interaction between users and products. Operational criteria and explicit definitions describing how well the interaction proceeds are discussed. This view is perhaps the most dominant within the human-computer interaction (HCI) community. The fourth angle takes the point of view of the user. It attempts to study the experience of a subject by describing subjective measures of usability, and by introducing approaches that link usability and human emotion.

Most usability considerations are related to computer software applications. It will be assumed that the views apply to smart products as well. If there is any doubt, the matter will be discussed. The views that will be presented describe the scope of usability. Before a summarising model of usability as a decision criterion can be presented, the properties of smart products and the interaction process have to be transferred from one field of application to another, i.e. from ergonomics and product development to consumer decision making.

Usability as a design approach

Usability has become an established field of activity in software development and is increasing in importance in the field of consumer product design as well (e.g. Jordan et al. 1996a,b; Wiklund 1994). It can be seen as a set of methods or design approaches, like usability engineering (UE) and user-centred design (UCD).

Usability engineering can be defined as a process whereby the usability of a product is specified quantitatively. After the product is built it can be demonstrated that it does or does not reach the required levels of usability (Tyldesley 1988). The UE process includes for example objective measures of interaction, definitions of system models, user models, models of the interfaces and the matches between these, multidisciplinary co-operation, graphical user interface (GUI) design techniques, development of standards, and prototyping activities (Butler 1996).

User-centred design emphasises an early and continuous focus on users, empirical measurement, iterative design and multidisciplinary design teams (e.g. Schuler and Namioka 1993, Greenbaum and Kyng 1991, Gould and Lewis 1985, den Buurman 1997). Sometimes UE and UCD are used synonymously. Relevant problems include the appropriate utilisation of methods in the product development process to achieve good quality at an acceptable cost (see e.g. Mayhew and Bias 1994) and, for example, combining usability with quality systems.

Usability has been increasingly seen as part of the product development process instead of being a separate activity carried out by "the usability police", i.e. a department responsible for inspecting designed products for usability. Thus, methods that are easy to apply right from the early concept definition phase, and which give enlightening results with affordable costs and effort have been in the focus of interest (e.g. Nielsen 1995 'Discount usability engineering'; Botman 1996 'Do-it-yourself usability evaluation'; Thomas 1996 'Quick and dirty'). These views are very well suited to the integrated design approach familiar to industrial designers.

The process view of usability is essential in participatory design. One of its benefits is to tie users into the process and lower their resistance towards change in organisations. This applies to the traditional Scandinavian approach to participatory design, which deals with work environments and specific applications. In these projects real future users of the designs are involved. This study concentrates on smart products which are usually generic pieces of equipment. The user population cannot familiarise themselves with the products during the development stage, but a small subset of possible future users have to represent the population as a whole. Consequently, from the point of view of ordinary users the user interface development process is unknown and cannot influence their attitude formation concerning the products.

Usability as a design approach is recognised as an essential dimension of the idea of usability. However, because the process is not accessible to consumers and users of generic merchandise software and electronic products, it is not considered in greater depth in this study.

Usability as a product attribute

The second approach to usability, usability as a product attribute, defines the concept by naming examples of product or system properties or qualities that influence usability. There are many guidelines (e.g. parts of ISO 9241, Smith and Mosier 1984, Mayhew 1992) and usability style guides of software manufacturers that give detailed instructions for user interface development. In addition to these, there are more general lists of usability principles. These introduce desirable properties of interfaces on a general level applicable to different kinds of products. These principles define the concept of usability by focusing the goals of design. They can be seen as design objectives, general ideals within the discipline, the common ground of usability designers' thinking, and to some extent as product properties. However, as for the design process view above, from the users' point of view the only crucial consequences of these principles are the implemented product characteristics.

To make this approach concrete, a set of usability principles is derived from the following well-known references.

Source no:                    1  2  3  4  5  6  7  8
consistency                   x  x  x  -  x  x  x  x
user control                  x  x  -  -  -  x  x  -
appropriate presentation      x  x  x  x  x  x  x  x
error handling and recovery   x  x  x  x  x  x  x  x
memory-load reduction         x  x  x  -  x  -  -  x
task match                    -  -  x  x  x  x  x  x
flexibility                   x  -  x  -  x  x  x  -
guidance, help                -  -  -  -  x  x  -  x

  1. Shneiderman (1986), 'Eight golden rules of dialogue design';
  2. Apple Computer (1987), 'Human interface guidelines';
  3. Donald A. Norman (1988), 'Seven principles that make difficult tasks easy';
  4. Polson and Lewis (1990), 'Design for successful guessing';
  5. Nielsen (1993), 'Usability heuristics';
  6. Ravden and Johnson (1989), 'Evaluation check list for software inspection';
  7. ISO 9241-10, 'Dialogue principles' and
  8. Holcomb and Tharp (1991), 'Design for successful guessing'.

Table 2.1 (above) presents the most frequently mentioned principles and the guidelines that address them (marked as x). Different references arrange the same aspects in different ways. Some dimensions that seem to be lacking in a guideline are included as subdimensions or even as items in a checklist. For some principles it is difficult to decide whether a guideline mentions them or not. Some themes are made explicit, while some are only implied. Sometimes the same principles are illuminated from the point of view of system design and sometimes from that of user behaviour. Thus, the table cannot be interpreted as absolutely exact. Neither does the comparison aim at suggesting some definitive set of principles as better or more comprehensive than another. The differences are due to the varying scopes of the different guidelines. The table is presented to illustrate the agreement between the guidelines on the principles. The principles below are recognised by several guidelines and many of them by almost all.

The usability principles include the allocation of tasks between the user and the computer or a smart product. Requirements on the user's short-term memory and retrieval from long-term memory, and the perceived locus of control are all addressed. Necessary items of information are discussed: these include the need for confirmation procedures, shortcuts, feedback, error messages, etc. The number of features and generality versus specificity of commands are also considered.

A great deal of attention is paid to the relation between the elements of the interface. Consistency, modes, sequences, alternative ways of interacting, reversal of actions, error handling, appropriate grouping of the information, the timing and duration of information presentation, and response times are all related to the way in which the elements of interaction are arranged. In a corresponding way simultaneous syntax refers to the layout design of display elements.

The familiarity, precision, clarity and politeness of words, expressions, abbreviations and icons and qualities of visual presentation like the design of icons and screens, the use of colour, etc. are examples of the attention paid to low-level vocabulary elements in usability guidelines.

The objects of user interface design are often divided into four levels of abstraction – conceptual, semantic, syntactic, and lexical (see section 4.1.1 in the book). The syntactic and semantic levels, and the compatibility between them, are well represented in the guidelines and are the focus of interest within the usability discipline. What is not addressed in the cited usability guidelines are the properties of physical user interface components such as displays, buttons, etc. They are likely to have been considered too specific or just seen as means to realise the desirable principles of interaction. There may also be a division between disciplines and concepts. HCI practitioners are interested in software and, consequently, in dialogue principles rather than hard components and physical ergonomics. Reference is made to the conceptual level only by requiring appropriate links between the concept of interaction and solutions on the other levels – excluding some references to simple task structures.

Another central concept within HCI in addition to usability is user interface. Like usability, computer-user interface, user interface, or just interface is defined in a number of ways. Kuutti and Bannon (1991) suggest that the concept should be approached on three levels depending on the purpose of the discussion. These are

At the technical level the interface is "that part of the program that determines how the user and the computer communicate." (Newman and Sproull 1979) Applying the definition of interface at the technical level, usability is a property of a part of the system, the interface, seen as an object connecting the user to the functionality of the product. This kind of approach is implicitly approved by interface standards and guidelines by their very existence. The separation of interface from functionality is allowed, as the guidelines suggest general principles that are related to interfaces independent of the functionality of the product.

If the user–product–task–environment interaction is observed from a product design point of view, the product is the only component that can be manipulated. Users act in a situational context which designers try to understand. However, designers generally cannot influence users, tasks or environments. Sometimes users can of course be trained and the tools change the structure of the task to some extent. But what designers are essentially interested in are the robust qualities of the product that allow the product to adapt to the environment and to the requirements set by users expecting a high quality of use in all relevant tasks and situations when the product is used by the vast majority of the target group of customers. The usability which is loaded in the product, in design, manufacturing etc. is realised during human-product interaction. The qualities of the product do not change when somebody begins to use it. Rather the existing potential of the product is released.

ISO 9126 (ISO 1991) – a standard for the quality characteristics of software products – defines usability as "a set of attributes that bear on the effort needed for use, and on the individual assessment of such use, by a stated or implied set of users". The guidelines do their best to discover the attributes that influence 'the effort needed for use'. Consequently, usability as a property of a product can be expressed as the consistency, error-handling capabilities, task match, flexibility and guidance provided by the interface, the quality of representation, the user's control over the interaction, and the individual assessment of these factors by a stated or implied set of users. Consumer electronics as a product category implies almost the entire population as potential users.

On the basis of the discussion above it seems evident that consumers may evaluate products in terms of the qualities of their interfaces. The guidelines are rather analytical by nature and meant primarily to support expert usability inspection or product design by professional designers. They participate in defining the concept of usability, but have to be located in a more general context before they can be used to explain consumers' product evaluation.

Usability as a measurement

The usability engineering approach discussed above mentioned the operational quantitative measurements of interaction as one of the tools used in design. These methods call for specific definitions of usability. Usability defined in such a way as to allow these measurements is the meaning most often referred to when the idea is discussed. The three approaches discussed here have previously been presented by Brian Shackel (1991), Jakob Nielsen (1993) and ISO 9241 DIS part 11. These approaches raise questions about usability measurements at an operational level, about usability objectives and about the relationship between usability, utility, product acceptance and affect in relation to the interaction.

Shackel's approach

The outstanding approach to usability taken by Brian Shackel (1991) has been much used and modified (e.g. Chapanis 1991, Booth 1989), and was among the first to recognise the relativity of the concept in a number of respects. Shackel starts his presentation from a model of product perception, where acceptance is the highest level concept. The user or consumer is supposed to compare the properties of the product to the sacrifices needed to acquire it. In a purchase situation, utility, usability and likeability are balanced in a trade-off with the costs of the product. The best possible alternative is selected, i.e. it is acceptable. Thus, acceptance is a function of perceived utility, usability, likeability and costs.

Utility refers to the match between user needs and product functionality, while usability refers to users' ability to utilise the functionality in practice. Likeability refers to affective evaluations, and costs include financial costs as well as social and organisational consequences. Having located usability in the context of acceptance, Shackel presents a descriptive definition. "Usability of a system or equipment is the capability in human functional terms to be used easily and effectively by the specified range of users, given specified training and user support, to fulfil the specified range of tasks, within the specified range of environmental scenarios," or in short "the capability to be used by humans easily and effectively" (Shackel 1991, 24). From the angle of consumers' product evaluation the short definition is adequate, because their own situation determines the context. 'Easily' refers to "a specified level of subjective assessment", and 'effectively' is equal to "a specified level of human performance".

According to Shackel, usability is a property of a system or a piece of equipment. The property is not constant, being relative in relation to users, their training and support, tasks and environments. Thus, the evaluation is context-dependent. The system or piece of equipment may be usable if it matches the combination of users, tasks, and environment. Usability has two sides, one related to subjective perception of the product and the other to objective measures of the interaction. The instruments, scales or aspects needed to isolate these are not explicated by the definition. Shackel recognises the ambiguousness of the definition and suggests a set of operational criteria. For a system to be usable it has to achieve defined levels on the following scales:

Figure 2.2 – Product acceptance in Shackel. The dimensions and concrete measurements of usability in Shackel (1991).

Shackel's idea of usability joins usability to other product attributes and higher level concepts. It provides a descriptive definition of the concept that refers to the complex framework of evaluation and finally suggests concrete measurable usability criteria. All these aspects are necessary for understanding usability and for appropriate use of the concept. Figure 2.2 summarises the usability-related concepts suggested by Shackel.

Nielsen's approach

Nielsen (1993) considers usability to be an aspect among others influencing product acceptance. Nielsen suggests that usability and utility together form the usefulness of a system. He makes this explicit: "…utility is the question of whether the functionality of the system in principle can do what is needed, and usability is the question of how well users can use that functionality." This view is also supported by, for example, Eason (1984), "Usability… can limit the degree to which a user can realize the potential utility of a computer system", and Grudin (1992). Grudin associates usability and utility with totally different disciplines, i.e. computer science and information system research. He takes the view that the differences also reflect on the design processes. Utility is defined first by the product managers, usability being subsequently optimised by the designers. Grudin heavily stresses a more integrated design process, but does not suggest that the concepts themselves should be merged.

Figure 2.3 – Product acceptance by Nielsen. Usability together with utility are considered to influence the usefulness of the product. Usefulness is one of the attributes affecting acceptability. (Nielsen 1993)

The ability of the functions to help the user carry out a set of tasks is called utility. Usability is a concept that focuses on the problems of how users utilise these functions. Usefulness together with other perceived product attributes like cost, reliability, etc. are called practical acceptability in Nielsen's model of acceptance. To reach system acceptability, Nielsen adds the influence of social acceptability (see figure 2.3). Practical considerations of a product cover only one perspective on consumer product evaluation. The recognition of social influences on product acceptance is essential in assessing the importance of product attributes.

Nielsen does not present any descriptive definitions of usability, but considers the operational criteria to define the concept clearly enough (1993, 26-37).

ISO 9241 part 11 DIS

ISO 9241 DIS is a draft international standard for the ergonomic requirements for office work with visual display terminals (ISO 1994). Part 11 discusses usability for the purposes of product requirement specifications and product evaluation. In spite of the name, the definitions of part 11, "Guidance on usability", are also said to be applicable to other situations where a user is interacting with a product to achieve certain intended objectives. This extension makes usability a very general concept capable of wide application outside its conventional applications within the discipline of information technology. ISO 9241-11 was originally influenced by the European ESPRIT project MUSiC (Metrics for Usability Standards in Computing) (Bevan 1992). The standard has been widely adopted by HCI practitioners (e.g. Jordan et al. 1996b).

ISO 9241 defines usability as "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use."

Bevan and Macleod (1994), who discuss the ISO 9241 approach, regard usability as "a property of the overall system: it is the quality of use in a context." They consider the overall system to include working practices, the location and appearance of the product, individual differences between users, etc. The attributes of a product are only one contribution to the quality of use of an overall system. Consequently, the usability of a product is always studied in relation to users, goals and context. ISO 9241 separates usability from quality of work by selecting a specific point of view. Usability studies the quality of work by focusing on the product. While Shackel and Nielsen regard usability as an aspect of consumer product acceptance, ISO 9241 regards it as a special focus on the evaluation of the quality of work. Thus, the basic setting in ISO 9241 is not the most user-centred.

According to ISO 9241, the dimensions of usability are effectiveness, efficiency and satisfaction.

Effectiveness measures usability from the point of view of the output of the interaction. The first component of effectiveness, accuracy, refers to the quality of the output and the second, completeness, refers to the quantity of the output in relation to a specified target level. Efficiency relates effectiveness of interaction to resources expended. It may be measured in terms of mental or physical effort, time, materials or financial costs.

Bevan and Macleod (1994) present these relations as mathematical equations:

Aspects of efficiency in ISO 9241
temporal efficiency = effectiveness / task time
human efficiency = effectiveness / effort
economic efficiency = effectiveness / total cost
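
The three ratios above can be illustrated with a minimal sketch. The class and function names, the sample values and the units (percent effectiveness, minutes, an arbitrary workload score, currency units) are assumptions made for illustration only; they are not taken from ISO 9241 or from Bevan and Macleod.

```python
from dataclasses import dataclass


@dataclass
class TaskSession:
    """One observed task session; all fields are hypothetical sample values."""
    effectiveness: float  # e.g. percent of the task goal achieved
    task_time: float      # e.g. minutes spent on the task
    effort: float         # e.g. a cognitive workload score
    total_cost: float     # e.g. labour and material cost


def efficiency_measures(s: TaskSession) -> dict:
    """Return effectiveness divided by each kind of resource expended."""
    for name in ("task_time", "effort", "total_cost"):
        if getattr(s, name) <= 0:
            raise ValueError(f"{name} must be positive")
    return {
        "temporal": s.effectiveness / s.task_time,
        "human": s.effectiveness / s.effort,
        "economic": s.effectiveness / s.total_cost,
    }


session = TaskSession(effectiveness=80.0, task_time=4.0, effort=20.0, total_cost=10.0)
measures = efficiency_measures(session)
print(measures)
```

Because each measure is a ratio, two sessions with the same effectiveness can differ widely in efficiency, which is the point the equations make.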

The resources applied to calculate temporal and economic efficiency are understandable measurable variables. The effort that is needed in defining human efficiency can be estimated by measures of cognitive workload. ISO 9241 links cognitive workload to usability through human efficiency. By definition it has no direct effect on usability. On the other hand, the negative influences of excessive as well as too low cognitive workload are discussed. Considering temporal and economic efficiency, it can be accepted that the resources needed for a task are generally not relevant on their own without comparison to the achievements. When human qualities are in question the situation is not that simple. Human wellbeing is a value in itself, and systems should also allow work on appropriate levels of cognitive workload without comparisons to effectiveness. The MUSiC project makes this clear: "if good performance can only be achieved at the cost of high invested effort, a system is not usable." (Bevan et al. 1991) In fact, the other dimensions of efficiency may in practice have absolute limits in addition to the relative ones. Typically consumers have limited funds for purchases and the price of the product is used as a cut off.

Satisfaction has two components – comfort and acceptability. ISO 9241 does not define in detail what these mean. The suggestion is that satisfaction can be measured subjectively by questionnaires like SUMI or QUIS (see chapter 2.4) or objectively by observing the behaviour of the users during an extended period of use. If all the other relevant variables could be standardised like in laboratory settings, these links between usability and behaviour would be appropriate independent of the scope of usability. The measures are, however, typically used in real life surroundings. If absenteeism is considered to be a relevant indicator of usability in conditions where other variables are not standard, usability is no peripheral variable, but a very comprehensive concept. Figure 2.4 summarises the dimensions of usability according to ISO 9241-11.

It seems that satisfaction is quite similar to the concept of attitude (see chapter 3). Attitude can be studied by observing verbal or non-verbal responses in the areas of cognition, affect or behaviour, but all these approaches reflect the same concept. If this approach is adopted by ISO 9241, comfort might be related to affective responses of users and acceptability to conative responses, which might be expressions of behavioural intentions or behaviours with respect to the product. SUMI and especially QUIS seem, however, to measure mainly the cognitive component of usability by focusing on beliefs, not on affective terms or behavioural intentions.

Figure 2.4 – Dimensions of usability according to ISO 9241-11.

ISO 9241 pays attention neither to the user's prior experience with the products nor specific measures. But in the informative annex B it gives examples of "additional measures … for particular desired properties of the product which contribute to usability." These desired properties are called usability objectives, and the following examples are given by the annex:

In addition to usability objectives, the annex lists many examples of usability measures. User's experience and measures of usability are separate concepts in ISO 9241. Measures are classified according to which aspect of the interaction they describe. One part of the measurements describes the output of the interaction, its quality and quantity. Another part measures the resources devoted to the interaction in relation to the output, while the last part measures the subjective perception of the interaction from the point of view of the user. The measures are of secondary importance from the point of view of the definition and great freedom of choice is given to individual researchers in selecting suitable ones.

Summary and discussion

Shackel (1991) and Nielsen (1993) have described an approach to measuring usability using five different scales which do not include utility. They are in short:

Goal achievement as a usability criterion is referred to by a number of terms like utility, functionality or effectiveness. The basic idea in these criteria is that usability can be measured by finding out how well the task that should be done with the product is completed in practice. For a user to be able to complete tasks with a product, two conditions have to be met. First, the communication between the user and the product has to function. Second, the functionality of the product has to be sufficient in relation to the tasks. It seems that the first condition clearly belongs to usability and is included in all major approaches in the form of catastrophic errors that prevent users from completing the tasks. The second condition is a matter of disagreement.

Excluding goal achievement from usability is reasonable in the sense that it is not meaningful to measure the usability of a product for tasks that the product is not meant for. A product can be usable only in the tasks for which it has been designed or should have been designed. If the tasks cannot be completed due to shortcomings in functionality, the problem does not necessarily lie with the product, but with the evaluator who has selected the wrong product for those tasks. If goal achievement is included in the battery of usability criteria, it might be said after a usability study that a mobile phone is not usable for text editing because essential features are missing. If goal achievement is excluded, it can be said that text editing with a mobile phone is extremely laborious and frustrating, or the test can simply be declared invalid.

It is not clear which tasks should be completed with a product for it to be usable. Mobile phones are certainly not text editors, but in modern phones there are some features which do require textual input. Introducing flexibility to the arsenal of usability criteria introduces the problem of selecting appropriate tasks and conditions as a criterion. The wider the scope of tasks or the better performance with some peripheral tasks, the more flexible and more usable the system is.

In addition to goal achievement, interaction can be measured in terms of productivity. The time to finish a task is a relevant measure whenever the efficiency of interaction is regarded as important. For business applications even a small saving in time becomes important when repeated thousands of times.

Errors are sometimes considered to be the essence of usability (Chapanis 1991). However, there are conflicting approaches to the practical definition of an error. First, catastrophic errors are those user actions that lead users to problems from which they cannot recover themselves or which lead to incompletely finished tasks. The second way to define error is to regard all deviations from optimal performance as errors (e.g. Hollnagel 1991). It is possible to decrease the number of errors users commit by means other than making the interface easier to use. For example, the users may be punished for making errors. Shneiderman (1986) discusses system response times. It was noticed that longer response times for some functions made people make fewer errors. Systems that work more slowly should therefore be more usable according to this measure. What was likely to happen was that the users considered the long response times as a kind of punishment. If they made an error they had to wait for a long time before they were able to correct it and proceed with the task. So, they were more concerned to avoid mistakes than to proceed fluently with the task, and perhaps to experiment with new functions.

Objectives related to user experience are often illustrated by a learning curve (see figure 2.5). This illustrates the development of user performance as a function of experience. Experienced user performance (EUP) is the level of performance where the improvement of performance has stopped or its enhancement has become notably slower. Learnability is a measurement that describes the rising of the curve from a situation of no experience to EUP or a specified part of the curve. In addition to these, the learning curve illustrates well the concept of guessability, i.e. the level of performance achieved without any experience.

Figure 2.5 – Learning curve. The learning curve is a presentation of the level of users' performance over time. Typical measurements that can be illustrated by the learning curve include guessability, experienced user's performance (EUP) and learning time.
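
As a rough illustration, the quantities read off a learning curve can be sketched in code. The performance scores, the function names and especially the flattening threshold `min_gain` are hypothetical; the text only says that EUP is the level where improvement has stopped or become notably slower, so any concrete cut-off is an assumption.

```python
def guessability(performance):
    """Performance on the first trial, i.e. with no prior experience."""
    return performance[0]


def experienced_user_performance(performance, min_gain=1.0):
    """Return (trial_index, level) at which the curve first flattens.

    Flattening is assumed to mean that the gain to the next trial
    falls below min_gain (an illustrative threshold, not a standard).
    If the curve never flattens, the last trial is returned.
    """
    for i in range(len(performance) - 1):
        if performance[i + 1] - performance[i] < min_gain:
            return i, performance[i]
    last = len(performance) - 1
    return last, performance[last]


# Hypothetical performance scores over seven successive trials.
scores = [20, 45, 60, 68, 71, 71.5, 71.8]

start_level = guessability(scores)
eup_trial, eup_level = experienced_user_performance(scores)
learning_time = eup_trial  # number of trials needed to reach EUP
print(start_level, eup_trial, eup_level)
```

With real data the threshold would have to be chosen in relation to measurement noise, since a single flat step does not prove that learning has stopped.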

If Shackel's and Nielsen's approaches are compared to ISO 9241, major differences emerge. At first glance, satisfaction seems to be the only common scale. ISO 9241 does not recognise learnability, relearnability, task time or errors. Instead, it introduces concepts of effectiveness and efficiency that are not included in Shackel's and Nielsen's models. Why do the approaches seem so different? Is there a common core? The apparent differences are due to different points of view and different strategies for combining basic elements of user-product-task-context interaction. The concepts have to be further discussed to find the basic elements and essence of usability.

Shackel's and Nielsen's approaches unify three different aspects of usability:

The objective criteria are task time and the number or rate of errors, which are – if used in an appropriate manner – effective quantitative variables that enable the use of relative scales. Usability objectives related to user's experience are experienced user performance (EUP), the novice user's ability to learn and the casual user's ability to relearn the use of a product. Learnability might be studied for example by first measuring the initial performance of a novice user in terms of time and errors, repeating the measurements after a period of training, and calculating the differences in performance levels. Thus, it is not an elementary criterion but a combination of criteria.
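The learnability measurement described above can be sketched in a few lines. This is only an illustration: the `Session` record, the field names and the sample figures are hypothetical, and the choice of reporting absolute and relative differences is an assumption, not a procedure taken from Shackel or Nielsen.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One measured test session for a single user (hypothetical data)."""
    task_time_s: float  # time needed to complete the task, in seconds
    errors: int         # number of errors observed during the task


def learnability(novice: Session, trained: Session) -> dict:
    """Compare the initial session with the session repeated after training."""
    time_saved = novice.task_time_s - trained.task_time_s
    return {
        "time_saved_s": time_saved,
        "errors_avoided": novice.errors - trained.errors,
        # Gain in speed as a fraction of the initial task time.
        "relative_time_gain": time_saved / novice.task_time_s,
    }


# Hypothetical measurements before and after a period of training.
first = Session(task_time_s=240.0, errors=6)
later = Session(task_time_s=150.0, errors=2)
result = learnability(first, later)
print(result)
```

The sketch makes the point of the paragraph concrete: learnability is derived from two elementary measurements (time and errors) taken at two points of experience, so it is a combination of criteria rather than a criterion of its own.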

The ISO 9241 definition is constructed using different views of usability. Effectiveness approaches from the perspective of the output of the interaction, its quality and quantity. Effectiveness observed without paying attention to resources is not sensible, if no reference is made to the match between product functionality and the requirements of the task. Thus, including effectiveness means including the anticipated utility of the system in usability. Efficiency describes the interaction from the process point of view, paying attention to the results and resources involved. Satisfaction refers to the user's point of view.
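The difference between the output view and the process view can be illustrated with a small sketch. The formulas below are assumptions chosen for illustration (effectiveness as the share of task goals achieved, efficiency as effectiveness per unit of time spent); ISO 9241 itself leaves the choice of concrete measures open, and the function names and numbers are hypothetical.

```python
def effectiveness(goals_achieved: int, goals_attempted: int) -> float:
    """Output view: how much of the intended result was reached (0.0 to 1.0)."""
    if goals_attempted <= 0:
        raise ValueError("at least one goal must be attempted")
    return goals_achieved / goals_attempted


def efficiency(effectiveness_score: float, time_spent_min: float) -> float:
    """Process view: result obtained per unit of resource, here minutes of user time."""
    if time_spent_min <= 0:
        raise ValueError("time spent must be positive")
    return effectiveness_score / time_spent_min


# Hypothetical test session: 8 of 10 sub-goals reached in 20 minutes.
eff = effectiveness(goals_achieved=8, goals_attempted=10)
print(eff, efficiency(eff, time_spent_min=20.0))
```

Satisfaction, the third ISO 9241 component, is not computed here because it comes from the user's own rating rather than from observed results and resources.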

Figure 2.6 sums up the approaches presented by Shackel, Nielsen and ISO 9241. They clearly differ from the approaches presented in the previous chapter, which refer to the properties of user interfaces; the approaches here stick strictly to the interaction. The approaches are weak in giving substance to usability. Instead, they tell us how to measure something we think we know but cannot precisely articulate. Nielsen avoids giving any descriptive definition whatsoever. Bevan and Macleod link usability to quality. ISO 8402 defines quality as "the totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs". Thus, usability is tied to the users' needs, which are conceptually difficult to pin down.

Figure 2.6 – Measurements, objectives and views on usability. The figure summarises the approaches to usability in Shackel (1991), Nielsen (1993) and ISO 9241. It recognises three measures – the number of errors, performance time, and answers on a rating scale or scales; three design objectives – the experienced user's performance (EUP), learnability by novice users, and relearnability or retention over time by casual users; and three views – the process output view, i.e. utility, the resource usage view, i.e. efficiency, and the user's subjective view.

There are a number of usability criteria available. Are there any instructions that indicate which to use? Shackel, Nielsen and ISO 9241 provide no general rules to decide which combination of criteria should be applied in a specific interaction situation. The use of a number of different approaches simultaneously is recommended. The emphasis is on objective criteria, but satisfaction is said to be important when usage is voluntary (Bevan and Macleod 1994). The basic assumption in ISO 9241 is that at least one criterion is applied to reflect effectiveness, efficiency, and satisfaction. However, if objective measurements cannot be obtained, subjective satisfaction may provide indication of them (ISO 1994).

The usability of interaction may be evaluated by considering only the subjective satisfaction of a novice user, while the usability of another interaction might be defined as the speed of performance when used by a trained person. The qualities measured in both cases are called usability, even if they have very little in common. Different user-product-task-context-measurement combinations are all specific cases. Each time we evaluate usability we evaluate something that reflects qualities of user–product–task interaction in a context, and each time the test is changed, the evaluated qualities are incomparable. Usability in interaction A may be comprehensively different from usability in interaction B. It seems appropriate to accept that the usability of interactions A and B cannot be compared if A and B involve products from different product categories and/or users with different levels of experience. It cannot be said that a VCR at home used by a six-year-old child is more or less usable than an air traffic control system used by a trained professional. This reasoning leads us to ask at what point differences in interaction or contexts force us to change our usability criteria. If there is no generality between usability measures, the concept refers only to the discipline, its aims and conventions, not to any generic quality of a product or its use. For example Green (1997) stresses that low generalisability of results undermines the whole discipline of usability. All trials are one-off without possibilities to learn anything for future applications.

The motivation of e.g. ISO 9241 has been to define usability for use in product evaluation in the same definitive way as the physical dimensions of a product, for example. We need to be able to compare the usability of different products belonging to the same product category. Such a comparison is possible, strictly speaking, only when an identical test can be applied to all the products. Theoretically, the ambiguous selection of very different sorts of criteria may blur the concept of usability. In practice there seems to be mixed evidence concerning the correlation of usability criteria. Chapanis (1991) suggests that the rate of errors alone provides a good and reliable approximation of usability. Nielsen and Levy (1994) have found that users' subjective assessment of product usability provides a working approximation of objective usability for the purposes of discount usability evaluation. Conflicting views that call for the use of several simultaneous and complementary usability approaches are presented by Rowe et al. (1994), among others.

The criteria do not define the concept of usability. They only set examples of possible interpretations, which may be valuable in practice. Logically they are not satisfactory. Specific problems with the approaches discussed above include, first, the superficial argumentation concerning the relationship between general behavioural response with respect to products and usability. Second, the relationship between subjective product evaluation and concrete product properties is not dealt with carefully. Third, the nature of attitude, satisfaction, or whatever the subjective component is called, is not clear. The main evaluation criteria of smart product-human interaction, which are either included in usability or are related to it, are, however, made clear on a general level. These are usability, utility, and subjective satisfaction. We shall return to the content that is given to these dimensions in chapter 4 (of the original book).

Subjective usability measurements

This chapter describes approaches to usability that focus on a subject's personal experience with a product or a system. Instead of theoretical considerations, available scales and measures are presented, discussed and compared. This approach presents the state of the art of measuring users' subjective experience as suggested by a set of well-known tools. Two examples of computer attitude questionnaires (EUCSI and TAM), three usability inquiries (SUMI, QUIS and PSSUQ), and one measure of mental workload (NASA-TLX) are included.

The aim of the discussion is to understand the concept of usability as proposed by the scales, subscales and items. The practical aspects of scale usage are not of interest here. TAM and SUMI are of special interest, because they will be applied in the usability attribute reference model (see chapter 4) and the preference test (see chapter 5). Some of the other scales are used in the process of developing the usability attribute inquiry (see appendix B). Others illuminate the background of subjective usability-related measurements.

End-User Computing Satisfaction Instrument (EUCSI)

Doll and Torkzadeh (1988) (see also Harrison and Rainer 1996) present a model of computing satisfaction – the End-User Computing Satisfaction Instrument (EUCSI). It is designed to measure satisfaction with a specific application. Satisfaction in EUCSI is defined as "an affective attitude towards a specific computer application by someone who interacts with the application directly". EUCSI includes the following dimensions: content, accuracy, format, timeliness and ease of use. Reported Cronbach alpha coefficients range from 0.65 to 0.89.

See figure 2.7 for a summary.

The scale suggests a passive role for the user, who evaluates the system and interaction predominantly on the basis of usefulness. Few items address the interaction between the user and the system. The scale is not accurate in analysing distinct system properties. According to EUCSI, user satisfaction is influenced by the usefulness of the system, characterised by accuracy, task match, and the topicality of presented information, and the users' beliefs concerning the qualities of presentation. There are few items addressing usability in the sense used by Shackel and Nielsen.

Figure 2.7 – End-User Computing Satisfaction Instrument, EUCSI. (Doll and Torkzadeh 1988). Italicised interpretations are by the author.

 

Technology Acceptance Model (TAM)

The Technology Acceptance Model (TAM) (Davis 1993) describes the relations between perceived qualities of a system usage, affective attitude, and behavioural responses to the system. Product design features act as an external stimulus. They are, however, not analysed at the level of product attributes, applications as such being the independent variables. The perception of the stimuli creates cognitive beliefs, which initiate an affective response. The affective response has an influence on consumer behaviour. The beliefs included in the model are perceived usefulness, "the degree to which an individual believes that using a particular system would enhance his or her job performance" and perceived ease of use, "the degree to which an individual believes that using a particular system would be free of physical and mental effort". The relationships between the factors are presented in figure 2.8. Attitude is determined by cognitive beliefs, i.e. TAM follows the model of attitude formation suggested by Fishbein and Ajzen (1975).

Figure 2.8 – Technology Acceptance Model, TAM. (Davis 1993). Italicised interpretations are by the author.

Beliefs are measured with a 7-point Likert scale with high reliability (perceived usefulness r=0.97 and perceived ease of use r=0.91). The items of 'perceived usefulness' address the quality and quantity of work accomplished; special emphasis is given to measuring temporal efficiency (3 items) and the feeling of being in control of the work (3 items). 'Perceived ease of use' measures the user's ideas on learning to use the system, the control experienced over the device, and the mental effort involved in use. Table 2.3 presents the scale items of TAM: perceived usefulness and perceived ease of use.

TAM does not analyse the qualities of interaction or interface with the same precision as many usability questionnaires. It sets a good example in conceptually separating beliefs and affect from attitudes and also provides empirical evidence showing that ease of use affects attitude only moderately (b=0.13). In usability measurements perceived ease of use and satisfaction are often regarded as equal.

Table 2.3 – Scales of the Technology Acceptance Model, TAM. (Davis 1993)

TAM perceived usefulness

  1. Using product X improves the quality of the work I do.
  2. Using product X gives me greater control over my work.
  3. Product X enables me to accomplish tasks more quickly.
  4. Product X supports critical aspects of my work.
  5. Using product X increases my productivity.
  6. Using product X improves my job performance.
  7. Using X allows me to accomplish more work than would otherwise be possible.
  8. Using product X enhances my effectiveness on the job.
  9. Using product X makes it easier to do my job.
  10. Overall, I find product X useful in my job.

TAM perceived ease of use

  1. I find product X cumbersome to use.
  2. Learning to operate product X is easy for me.
  3. Interacting with product X is often frustrating.
  4. I find it easy to get product X to do what I want it to do.
  5. Product X is rigid and inflexible to interact with.
  6. It is easy for me to remember how to perform tasks using product X.
  7. Interacting with product X requires a lot of mental effort.
  8. My interaction with product X is clear and understandable.
  9. I find it takes a lot of effort to become skilful at using product X.
  10. Overall, I find product X easy to use.
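Several of the ease of use items above (1, 3, 5, 7 and 9) are negatively worded, so a scale score has to take the direction of each item into account. The following is a minimal sketch of one common way to do this, assuming a 7-point scale where 1 means "strongly disagree" and 7 "strongly agree", reverse-coding of the negative items, and a plain mean as the scale score. The function name and the sample answers are hypothetical, and the scoring rule is an assumption rather than a procedure quoted from Davis (1993).

```python
# Negatively worded 'perceived ease of use' items in Table 2.3.
NEGATIVE_EASE_OF_USE_ITEMS = frozenset({1, 3, 5, 7, 9})


def score_scale(answers, negative_items=frozenset(), points=7):
    """Mean rating of a Likert scale after reverse-coding the negative items.

    answers maps item number -> rating between 1 and `points`.
    """
    adjusted = []
    for item, rating in answers.items():
        if not 1 <= rating <= points:
            raise ValueError(f"item {item}: rating {rating} outside 1..{points}")
        # Reverse-coding turns e.g. 2 ("disagree it is cumbersome") into 6.
        adjusted.append(points + 1 - rating if item in negative_items else rating)
    return sum(adjusted) / len(adjusted)


# Hypothetical answers of one respondent to the ten ease of use items.
ease_answers = {1: 2, 2: 6, 3: 3, 4: 5, 5: 2, 6: 6, 7: 3, 8: 6, 9: 2, 10: 6}
ease_score = score_scale(ease_answers, NEGATIVE_EASE_OF_USE_ITEMS)
print(round(ease_score, 2))
```

The perceived usefulness items are all positively worded, so the same function can be called for them with the default empty set of negative items.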

Software Usability Measurement Inventory (SUMI)

Software Usability Measurement Inventory (SUMI) (Porteous et al. 1993, Kirakowski 1996) was developed by the Human Factors Research Group (HFRG) of University College Cork, partly as a contribution to the ESPRIT project P5429, Metrics for Usability Standards in Computing (MUSiC). SUMI aims at measuring the perceptions and feelings of a typical user. In addition to the scales, SUMI provides software for scoring and a standardised reference database to support evaluation. It makes it possible to relate the scores of an individual measurement to the SUMI database to get an overview of the usability of a product without having to compare several alternatives.

Figure 2.9 – Software Usability Measurement Inventory, SUMI. (Porteous et al. 1993). Italicised interpretations are by the author.

The five subscales of SUMI are

Each subscale consists of ten items answered on a three-point scale: agree, undecided, disagree. Reliability levels of the subscales range from Cronbach's alpha α = 0.71 to 0.85, and α = 0.92 for the global usability measurement.
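Reliability figures of this kind are Cronbach's alpha coefficients. As a reminder of what the coefficient expresses, the sketch below computes it from raw answers using the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The numeric coding of the answers (disagree = 1, undecided = 2, agree = 3) and the tiny data set are assumptions for illustration only; they are not SUMI data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])            # number of items in the scale
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*responses))    # one tuple of scores per item
    item_variance_sum = sum(variance(scores) for scores in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of five respondents to a three-item subscale,
# coded disagree = 1, undecided = 2, agree = 3.
answers = [
    [3, 3, 3],
    [2, 2, 3],
    [1, 1, 1],
    [3, 2, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(answers)
print(round(alpha, 3))
```

Alpha approaches 1 when respondents answer the items of a scale consistently, which is why it is reported as evidence that the items measure the same underlying dimension.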

The Questionnaire for User Interaction Satisfaction (QUIS)

The Questionnaire for User Interaction Satisfaction (QUIS) was developed at the Human-Computer Interaction Lab at the University of Maryland, College Park (Chin et al. 1988, Harper and Norman 1993) based on the scale for 'User evaluation of interactive computer systems' presented by Ben Shneiderman (1986). Many versions have been introduced with different numbers of subscales and items, and different levels of reliability. The dimensions of QUIS version 7 are

'Overall user reactions' includes semantic differential items like 'terrible vs. wonderful', 'frustrating vs. satisfying', 'dull vs. stimulating', etc. It does not address any specific properties of user interface or interaction. 'Screen factors' refers to beliefs concerning interface properties on the lexical level, for instance fonts and highlighting. It also covers the logic of the interface. The sequence of screens, user control, error recovery, and compatibility of operational sequences are addressed in a very detailed manner, e.g. "Use of reverse video: unhelpful vs. helpful". 'Terminology and system information' measures the understandability of the messages with many related items. 'Learning' covers the experience of learning, but also addresses beliefs concerning specific system characteristics such as feedback, logic of sequences and intuitiveness. 'System capabilities' refers to the users' experiences regarding the speed of performance, reliability, noise, error handling capabilities, and the flexibility of the system in relation to the user's experience; it is thus a very diverse category. In spite of this, quite high reliability coefficients have been reported (overall reliability for version 3 α = 0.94 and for version 4 α = 0.89, Chin et al. 1988).

Many of the items in QUIS resemble a selection from an expert evaluation checklist rather than questions measuring user satisfaction. QUIS supposes concrete beliefs to determine the user's satisfaction. In addition, Chin et al. (1988) state that subjective satisfaction is equal to system acceptance. However, one may suspect that users are not likely to consider these kinds of attributes spontaneously unless explicitly asked. Thus, QUIS operates between the designer domain of concrete product attributes and the user domain of subjective experience. Due to its many references to concrete product attributes, QUIS, unlike some other scales, cannot be adapted for interactive devices other than software on visual display terminals.

Post-Study System Usability Questionnaire (PSSUQ)

Lewis (1995) introduces a set of questionnaires for different phases of usability evaluation: one for collecting immediate user response after a task in a usability test (After-Scenario Questionnaire, ASQ), another for post-study evaluation in usability tests (Post-Study System Usability Questionnaire, PSSUQ), and a third for field studies (Computer System Usability Questionnaire, CSUQ). The questionnaires were developed at IBM.

In ASQ, subjects rate the interaction according to the ease of task completion, their perception of the temporal efficiency of task completion, and the adequacy of support information. These are measured on a 7-point Likert scale (r=0.90 to 0.96). High correlation with task completion is reported. PSSUQ and CSUQ are both 7-point Likert scales applying the same items. The only difference is that PSSUQ addresses the specific tasks done during a usability test while CSUQ refers to the use of the system generally. Their construction involved the application of three subscales:

The overall reliability of the scales is very high (PSSUQ r=0.97 and CSUQ r=0.95).

Figure 2.10 – Post-Study System Usability Questionnaire, PSSUQ. (Lewis 1995). Italicised interpretations are by the author.

The scales have diagnostic power in system development, but conceptually they are not clear. Qualities of interaction and the logic of information presentation are measured by soliciting user beliefs, while the quality of presentation and physical interaction components are addressed with affective questions. Abstract relations of interface logic are measured with concrete items, and concrete properties of the interface are surveyed with general questions. However, this problem is related more to the labels of the scales than to the dimensions they measure, because Lewis (ibid.) reports a result of factor analysis that proves the dimensions to be statistically independent.

NASA Task Load Index

The NASA Task Load Index (NASA TLX) was developed by the Human Performance Group at the NASA Ames Research Center. It is a multi-dimensional rating instrument that provides an overall workload score based on a weighted average of ratings on six subscales. The subscales provide diagnostic information about the sources of the workload. TLX includes the following dimensions:

The respondents are first asked to pairwise compare the importance of the dimensions of the task in question. This produces task-type-specific weights for the subdimensions. Subjects next rate the tasks they have carried out along scales presented as a line divided into 20 equal intervals anchored by bipolar descriptors (e.g., high/low). (Hart and Staveland 1988)
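The two-step procedure can be sketched as follows. The six dimension names are those of the published instrument (Hart and Staveland 1988). The rest is an illustrative assumption: each dimension's weight is taken as the number of pairwise comparisons it wins, ratings are expressed on a 0 to 100 range in steps of 5 (one step per interval), and the respondent data are invented.

```python
from itertools import combinations

DIMENSIONS = [
    "mental demand", "physical demand", "temporal demand",
    "performance", "effort", "frustration",
]


def weights_from_comparisons(winners):
    """Count how often each dimension was judged the more important of a pair.

    winners maps every pair (a, b) from combinations(DIMENSIONS, 2)
    to the dimension the respondent chose.
    """
    counts = {dimension: 0 for dimension in DIMENSIONS}
    for pair in combinations(DIMENSIONS, 2):
        counts[winners[pair]] += 1
    return counts


def overall_workload(ratings, weights):
    """Weighted average of the six subscale ratings."""
    total_weight = sum(weights.values())  # 15 comparisons for six dimensions
    return sum(ratings[d] * weights[d] for d in DIMENSIONS) / total_weight


# Hypothetical respondent who always prefers the dimension listed first.
winners = {pair: pair[0] for pair in combinations(DIMENSIONS, 2)}
weights = weights_from_comparisons(winners)

# Hypothetical ratings on a 0..100 range in steps of 5.
ratings = {
    "mental demand": 70, "physical demand": 10, "temporal demand": 60,
    "performance": 40, "effort": 50, "frustration": 30,
}
score = overall_workload(ratings, weights)
print(weights, round(score, 1))
```

Because the weights come from the respondent, a dimension that is judged irrelevant for the task type (weight 0) has no influence on the overall score, however it is rated.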

Summary of the subjective criteria

Table 2.4 summarises the dependent and independent variables of usability, computer attitude and workload scales discussed above. Usability is apparently conceptually somewhere between computer satisfaction and cognitive workload, combining aspects of both. Independent variables range from general overall ideas such as affect and frustrations to concrete beliefs concerning e.g. installation and documentation. The emphasis in the subjective usability criteria is clearly on assessing how well the systems succeed in avoiding usability drawbacks.

Table 2.4 – Summary of the subjective usability criteria. The table marks with ( x ) the dimensions that are explicitly mentioned in the references and with ( o ) the ones implied by the scale items as interpreted by the author. A dash ( - ) marks an empty cell.

Dependent variables       EUCSI  TAM  SUMI  QUIS  PSSUQ  TLX
Satisfaction              x      -    -     -     -      -
Attitude towards use      -      x    x     -     -      -
Actual use                -      x    -     -     -      -
Usability                 -      -    x     x     x      -
Cognitive workload        -      -    -     -     -      x

Independent variables     EUCSI  TAM  SUMI  QUIS  PSSUQ  TLX
Satisfaction              -      -    -     o     -      -
Affect                    -      -    x     -     o      -
Mental effort             -      o    -     -     -      o
Frustration               -      -    -     -     o      o
Perceived usefulness      o      x    -     -     x      -
Flexibility               -      -    o     -     -      -
Ease of use               x      x    o     -     o      -
Learnability              -      o    x     x     o      -
Controllability           -      o    x     -     -      -
Task accomplishment       o      o    o     -     o      o
Temporal efficiency       -      o    o     -     o      o
Helpfulness               -      -    x     -     -      -
Compatibility             -      -    o     -     -      -
Accuracy                  x      -    -     -     -      -
Clarity of presentation   o      -    -     o     -      -
Understandability         -      -    o     o     o      -
Installation              -      -    -     x     -      -
Documentation             -      -    o     -     -      -
Feedback                  -      -    -     x     -      -

The dimensions of the subjective usability questionnaires are produced as results of factor analysis. The items that the subjects assess to be related are joined to form distinct usability dimensions. The subscales are named afterwards. The scale labels are secondary in the development of the scales, but when the scales are used the labels influence the conclusions that will be drawn. The subscale labels also identify the dimensions of usability, thus giving an operational definition of the concept. The scale names are essential to a consideration of the validity of the measurements. If there is a reliable measurement of a quality of use, but the quality cannot be defined and named, and it is not known why the items are related, it is not of much use.

The subjective usability scales are subject to conflicting requirements. The subject's overall assessment of the product and its use are the relevant dependent variables that should be optimised in the design, and subsequently measured. The items used in measuring have to address issues that are salient and meaningful to the user. On the other hand, the development team's means of influencing the subject's experience are limited to the adjustment of some concrete attributes, the levels of which the users are unable to link with their goals and values. The aims and means of design for usability are conceptually very far apart from each other. Some mediating concepts are needed between satisfaction and product properties. These are provided by computer attitude scales and usability inquiries, but none of them is able to link the whole chain of issues from user interface properties via usability to overall preference.

The usability questionnaires do not pay much attention to the influence of differences in users' motivation. A scale that performs well in estimating the perceived usefulness of a system is perhaps not valid if the user is mainly intrinsically motivated. None of the usability scales enables the subjects to rate which of the dimensions are important from the point of view of the current product and interaction (compare with NASA TLX). Topics such as first person experience, engagement (Laurel 1991), and emotional usability (Logan 1994) might be more appropriate. The next chapter introduces approaches which illuminate ideas of this kind.

Usability and emotions

Usability considerations have been dominated by the extrinsically motivated idea of information technology use. Products are regarded mainly as tools, and good usability equals the absence of usability defects (e.g. Kanis 1997, Hollnagel 1997). However, alternative ideas that emphasise the hedonic side of computer and smart product use and possession have lately emerged. Quite a number of ideas have been presented from varying perspectives, including emotional usability, fun, intrinsic motivation, engagement, sensuality, pleasure with products, apparent usability, etc. Often these concepts are not too well related to each other, or to other dimensions of usability and interfaces. A short review of these approaches is presented next.

The relationship between dimensions of extrinsic and intrinsic motivation, i.e. perceived fun and computer anxiety, is studied within information system science. Perceived fun can be regarded as "the extent to which the activity of using the computer is perceived enjoyable in its own right" (Davis et al. 1992). It can be characterised by how rewarding, pleasant, fun, enjoyable, and interesting the interaction was experienced to be (Igbaria et al. 1994). Computer anxiety is "the tendency of individuals to be uneasy, apprehensive, or fearful about current or future use of computers" (Igbaria and Parasuraman 1989). Igbaria et al. (1994) consider perceived fun as very influential in users' acceptance of new technology.

One of the most often cited views on the hedonic side of usability is presented by Logan (1994) and Logan et al. (1994), who divide usability into behavioural and emotional dimensions. The emotional dimensions of product usability attract the consumer's attention, enable learning by exploring and relieve computer anxiety. The context of Logan's discussion is the design of consumers' total experience – mindesign – in Thomson Electronics' consumer products. Mindesign is an effort to join graphic, industrial and software design to produce a total product experience for the user.

The user's engagement with interaction is discussed by Brenda Laurel (1991). Engagement refers to the user's feeling of being in control of the interaction. Laurel writes about computer fiction, games, etc. The idea addresses the subject "I" who interacts in a virtual world. There should be nothing to mediate the communication between the user and the system. "I do, what I myself want and feel involved in what I am doing." Laurel suggests that engagement is influenced by the frequency of interaction, the range of possible alternatives available for selection at one time, and the effectiveness of the inputs. Engagement can be seen as the subjective counterpart of directness, which is a property of the systems.

Hofmeester et al. (1996) have studied the sensuality of interactive objects with a pager as their example. The pager is considered a good example because it is a product potentially susceptible to evaluation on a sensual basis. It is worn close to the body and is related to personal communication. Hofmeester found several dimensions that characterise sensuality. However, the interpretation of these dimensions for product design purposes requires a considerable degree of designer's intuition. In Hofmeester's approach sensuality was regarded mostly as a property of appearance and presentation. The sensual qualities of interactive operation were not dealt with.

Kim and Moon (1997) have studied the possibilities of designing emotionally appealing interfaces for cyber banking. They have found trustworthiness, symmetry, sophistication, attractiveness, awkwardness, elegance, and simplicity to be relevant dimensions in defining the emotional appeal of an interface. They also related the dimensions with concrete user interface attributes. Their short conference paper does not, however, give a very clear picture of their results.

Jordan (1997a) defines 'pleasure with products' as "the emotional and hedonic benefits associated with products". He considers pleasure with products to be characterised by four dimensions.

Jordan suggests there is a need to quantify product pleasure in a corresponding way to how traditional behavioural usability has been quantified. This makes it possible to raise the importance of pleasure in the definition of product specifications.

Jordan (1997b) does not consider pleasure to be a dimension of usability, but an aspect of product experience and product evaluation that goes beyond usability. Products have to be usable. Pleasure is the determinant criterion after usability problems have been solved. The dimensions of pleasure are security, assurance, confidence, pride, excitement, and satisfaction. They are associated with the following product attributes: functionality, usability, aesthetics, performance and reliability (Jordan and Servas 1996). It seems that pleasure has quite functional origins on the product level. The ideas Jordan presents are not novel from the marketing point of view. In the context of HCI, however, it is useful to remind ourselves that usability is not the only relevant factor. Shackel (1991) already recognised the importance of likeability in relation to acceptance.

Usability considerations are often seen as relevant only after the product has been used. This is apparently based on the idea of usability as an objective measure of use of which the user's subjective assessment is only a small part. Nielsen (1993) mentions that approachability might be one aspect of satisfaction. Nielsen expects low correlation between approachability and actual usability. Caplan (1994) recognises apparent usability as an important design aim. He defines it as "the ease of use that is perceived by a customer upon first looking at a product, but not using it" and actual usability as "the ease of use experienced during operation of the product." Neither of these references analyses the concept of apparent usability in detail.

Kurosu and Kashimura (1995) and Tractinsky (1997) have searched for the determinants of apparent usability using 26 different cash dispenser (ATM) interface sketches as examples. They have found that concrete dimensions of inherent usability (location of the display, type of keypad, grouping of keys, sequence of keys, location of keypad, location of confirmation key, location of cancellation key) are not effective in explaining the subjects' assessments of apparent usability. Instead, the aesthetics of the interfaces as rated by the subjects correlates very well with apparent usability (r=0.589 in Japan and r=0.921 in Israel). Knowing the loose relationship between concrete product attributes and subjective overall assessments, it should be no wonder that two subjective assessments of an abstract quality correlate better than one of the assessments and concrete attributes.

Utilitarian and hedonic motivation has long been an issue in marketing. The hierarchy of needs is presented in marketing textbooks, beginning with Maslow. Emotional personal relevance (e.g. Holman 1986) is essential to understanding consumer behaviour in relation to different products. Within the broad discipline of HCI, affect, emotion, pleasure etc. are discussed variously as alternatives to usability, alongside it, beyond it, or within it. However, their weight in the discipline has been marginal. The hedonic dimensions of use seem to be related to short-term interaction with the products, or to interaction with some special applications, mostly games.

Utilitarian and hedonic sides of evaluation have become more essential in considerations of usability as usability has become a highly comprehensive concept and the links between consumer decision-making and usability have been noticed and intentionally created.

Summary

This chapter discussed usability as a research and design discipline, as a set of product properties, as a quantitative objective criterion of interaction, as a subjective criterion of interaction and as an emotional evaluation of a product and its use. From the user's perspective the order of presentation has moved from peripheral matters to essential personal considerations. These angles of approach make it clear that usability is a multidimensional and diverse concept. Different definitions relate usability to different concepts, but the core of the concept remains elusive. Usability shares a quality of many, if not all, abstract concepts: they can be related to other concepts, but hardly to any specific concrete qualities. Usability is regarded here as being defined by methods, measures, design objectives, etc. that are related to the following topics. The various perspectives on usability are due to different ideas concerning the referent of the concept, its scope, specificity, and objectivity.

The referent of usability refers to the different approaches to usability as suggested in this chapter. Depending on the context, usability may be a property of a development process, product, interaction, the user's experience, or expectations. The qualities may cover topics such as iterative development processes, user participation, consistency of the interface, quality of representation, results and resources required in interaction, the user's beliefs and feelings.

The scope of interaction addresses the components that have been considered in the user-tool-task-environment system. Some approaches strongly urge the importance of paying attention to all possible contextual variables. Others focus on the core: human-computer interaction. These take the usefulness of the product for granted.

A part of the discipline aims at generic usability principles and measures. Others consider the human-computer interaction to be so context dependent that useful information can be obtained only with specific measurements and observations made case by case. Technology can be mastered but the human component may change in an unpredictable way as even minor variations appear in products and tasks.

Finally, the importance of subjective versus objective usability methods and the nature of the subjective aspect are understood in a number of ways. Usability engineering attempts to operationalise usability for measurement with a few objective scales. Where the human aspect of the interaction is involved, this is done with inquiries that generalise the answers given by a representative sample of users. The subjective experiences are thus turned into an objective usability index. The other end of the continuum, not too well represented, deals with user engagement – the subjective feeling of being involved.


August 3, 2007. Originally published as chapter 2 in: One-dimensional usability - influence of usability on consumers' product preference by Turkka Keinonen. UIAH publication A21. Helsinki 1998.
Original location: http://www2.uiah.fi/projects/metodi