Design is a complex process that may involve problems with multiple solutions. Though many evaluation methods have been proposed in the literature on design methodology, each method is suited to particular situations or objects (Drury 1995, Hsiao 1998, Roozenburg & Eekels 1995). The selection of methods depends on the distribution of the data and on the emphasis the evaluator wishes to place on particular aspects of the distribution (Taylor et al. 1995). For example, Wilson (1995) advises the evaluator to use two or more methods to improve the effectiveness of the study or the reliability of the findings; the weaknesses of one method can be offset by the strengths of another. A multiple-methods study may utilise a mixture of qualitative and quantitative techniques in field and laboratory settings.
At the beginning of this study, different methods were assessed. Based on that assessment, some of the apparently most feasible ones were chosen for experiments. The methods were compared, and it turned out that each can be used on its own. If more reliability is needed, however, two methods used together can be better than a single method, because the methods complement each other. The group of evaluators may not always be representative enough, in which case more methods will be needed to make the test more reliable. It is advisable to combine both the users’ opinions and expert knowledge of user performance in the evaluation. The type of evaluation that is used depends on the stage of development of the product and on the use that is to be made of the information collected. The different methods used in the experiment were developed into the EEE procedures. Table 8 summarises the strengths and weaknesses of the EEE procedures and indicates which procedure is useful in each instance. The relative suitability of various methods for incorporating usability issues at the different stages of the design process has been discussed by van Vianen et al. (1996) and by Stanton and Baber (1996).
Meister (1985) has listed six practical requirements for criterion measures, indicating that, whenever feasible, a criterion measure should (1) be objective, (2) be quantitative, (3) be unobtrusive, (4) be easy to collect, (5) require no special data collection techniques or instrumentation, and (6) cost as little as possible in terms of money and experimenter effort. The EEE procedures proposed in this study are cost-effective, quick to apply, and easy to schedule. The measurements of criteria in the EEE procedures are quantitative and can be applied by companies and by consumer representatives alike, equally well to traditional technology products, such as chairs, and to new ICT products. Statistical methods can be used to help evaluate the significance of the results. The present EEE procedures can be used with special groups, for example the elderly, who are often considered a difficult group. Ergonomic knowledge introduces the factor of suitability for humans already at the beginning of product development (Fig. 4).
Fig. 16 shows the EEE procedures in relation to the papers (cf. Fig. 6). The following things are common to all EEE procedures: user trials with subjects, usability engineering in a measurable form, simulation of the object to be evaluated and use of statistical analysis. The separate issues specific to the different EEE procedures used in the papers are represented in the small boxes. In Table 8, the EEE procedures are presented as a more general approach applicable to industrial usability testing. With the EEE procedures, customer perceptions are obtained on a quantitative scale and in numerical terms as perceived by the subjects (customers, users). An essential part of the EEE is the use of experiments, which are statistically designed and analysed. The statistically oriented scientific approach has been the key factor in the great success of the Total Quality Management (TQM) culture in companies. The continuous improvement of processes, products and services is carried out through procedures crucially based on “designing the experiment” and “conducting the experiment” (Logothetis 1992).
The procedures were applied with elderly subjects, but they can also be used for other groups of people. Elderly people are an important group, not only because their number is increasing but also because they are users with characteristics and needs of their own. Better usability of products probably promotes safety, too, as increasing numbers of accidents have been reported (Saarela 2000). Typical accidents of older people include burns and falls, which often have serious consequences.
In the task-surface experiment (paper II), the EEE1 procedure was applied to reveal the fit between the product and the subject measures, and it proved useful and practicable for studying task-surface heights. In addition, other quantitative measurements, such as heart rate, EMG, EEG, eye movements and hand movements, obviously could and should be used more in user-centred design. The user interfaces of a product should fit humans anthropometrically, support the biomechanical functions of human beings in manual use, support the cognitive aspect of human beings visually and support the hearing ability audibly. The EEE2 procedure comprises various feature ratings, both subjective and objective.
Table 8. Applicability of the EEE procedures generated in the study.
| Procedure | When to apply | Strengths | Weaknesses |
|---|---|---|---|
| EEE1 | all (measures from users, products, tasks and environments) | standardised, often SI units; easily obtained | instruments (though often cheap), may interfere with contextual possibilities; objective view, does not account for user preferences and user capabilities; ranking does not indicate whether the best alternative is actually easy to handle |
| EEE2 | measure of satisfaction; weighting of problems and functions; comparison of products | quantitative data, based on subjective perception scales, mainly for decision-making | less sensitive to differences between judged examples |
| EEE3 | first selection of alternatives | simple and quick; provides qualitative data; easy to eliminate inadequate alternatives | does not give absolute differences in effectiveness between the alternatives, but only the order of preference |
| EEE4 | conceptual phases; at embodiment design in selection between prototypes; conceptual design to find out the importance of features; the best product available | straightforward; each subject has equal opportunities; quantitative basis for selecting from multiple significant criteria | constraints on the number of criteria and design alternatives that can be employed |
| EEE5 | conceptual design to find out the importance of features and user needs and preferences; to create an optimal product | the whole is to be judged; good at segmentation; the number of product variables can be reduced to a minimum at a certain level of confidence; excellent possibility to find optimal trade-offs | holistic approach makes evaluation by end-users more difficult; constraints on the number of criteria and criterion levels that can be employed |
| EEE6 | conceptual design to find out the importance of features; the best product available | objective and subjective experiences are combined into the same scale; if secrecy is needed, the criteria can be weighted without prototypes with real users, and the expert can then evaluate the prototypes; the number of design alternatives is not so constrained | constraints on the number of criteria that can be employed |
The EEE3 procedure showed that experts can also be substituted for experimentation with subjects, provided the experts are a representative group and have adequate expertise about the user. If the choice of the product depends only on which is the best, ranking would be appropriate, but in the product development process, the designers usually need to know how the product performs on the different criteria to be able to remedy the weaknesses and, ultimately, to develop an optimal product with the available resources. EEE4 made it possible for the subject to easily evaluate the products in question, since only paired comparison was needed.
The EEE5 procedure radically reduces the number of variables, and the current software also enables statistical analysis. It is realistic and cheap, yet the results are as good as those obtained with the other methods. With the EEE6 procedure, the scales of objective and subjective evaluation are combined.
The EEE6 procedure is particularly useful in industry, where secrecy is of primary importance. The end-users can easily weight the different criteria ascribed to the product by paired comparison, and no actual product needs to be available to the end-users. Experts can then evaluate the different products against the criteria on the scale. Apart from a high overall value, it is important to obtain a balanced multi-criteria value profile with no serious weak points (Pahl & Beitz 1988). See chapter 6.3.7 for more details of the methodology to eliminate the weak points.
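The weighting and evaluation steps of EEE6 described above can be illustrated with a small Python sketch. It is not the computation used in the study: the win-count weighting scheme, the function names (`criterion_weights`, `overall_value`, `weak_points`), the criteria, the 0 to 10 score scale and the weak-point threshold are all assumptions made for illustration.

```python
from itertools import combinations


def criterion_weights(criteria, judgements):
    """Turn end-users' paired comparisons into normalised criterion weights.

    judgements holds one dict per subject, mapping each pair of criteria
    (in the order produced by combinations) to the criterion that the
    subject considered more important. Weighting by win counts is an
    assumed scheme, not necessarily the one used in the study.
    """
    wins = {c: 0 for c in criteria}
    for judgement in judgements:
        for pair in combinations(criteria, 2):
            winner = judgement[pair]
            if winner not in pair:
                raise ValueError(f"{winner!r} is not a member of {pair!r}")
            wins[winner] += 1
    total = sum(wins.values())
    return {c: wins[c] / total for c in criteria}


def overall_value(weights, scores):
    """Weighted sum of the expert's scores for one product."""
    return sum(weights[c] * scores[c] for c in weights)


def weak_points(scores, threshold):
    """Criteria whose score falls below the threshold (unbalanced profile)."""
    return [c for c, s in scores.items() if s < threshold]


# Hypothetical criteria and data, for illustration only.
criteria = ["reach", "grip", "legibility"]
judgements = [
    {("reach", "grip"): "grip",
     ("reach", "legibility"): "reach",
     ("grip", "legibility"): "grip"},
    {("reach", "grip"): "reach",
     ("reach", "legibility"): "reach",
     ("grip", "legibility"): "legibility"},
]
weights = criterion_weights(criteria, judgements)

# Expert scores for one product on an assumed 0-10 scale.
expert_scores = {"reach": 8, "grip": 6, "legibility": 3}
value = overall_value(weights, expert_scores)
weak = weak_points(expert_scores, threshold=4)
```

Note that under this simple scheme a criterion that is never preferred receives a weight of zero, and that the end-users only compare criteria, so no product has to be shown to them. The weak-point check reflects the requirement that a high overall value should not hide a poor score on an individual criterion.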
The following essential usability engineering steps are recommended for carrying out ergonomic experimental evaluation (EEE) studies (the steps originate partly from those suggested by Taguchi for scientific quality management in Logothetis 1992):
Define the user-centred design problem: As an expert, collect data on the product in question. Provide a clear statement of the problem that the experiment is aiming to solve.
Determine the objective of the EEE study: Identify the output characteristics (responses) to be studied and eventually optimised (preferably measurable and with good additivity). The controllable and uncontrollable factors should be defined. Experts as a team formulate the features (criteria) of the product to be studied and the feature levels and the products to be created. Brainstorm if necessary.
Choose an EEE procedure or a combination of them and design the experiment: Select the appropriate experimental designs by assigning the controllable factors and their interactions. Decide who is to evaluate the object (users or experts). Select the type of communication about the products with the users.
Conduct the experiment: Perform the experimental trials and record the experimental data. With the elderly, it is good to have an experimenter to record the data.
Analyse the data: Evaluate the performance measures for each trial and analyse them using appropriate statistical techniques. Chapter 4.3.7 shows some statistical methods appropriate to EEE evaluation.
Interpret the results: Draw conclusions from the data and define future needs.
Run a confirmatory experiment: It is recommended that confirmatory tests should be done later. Conduct interpretative studies to enhance reliability. Split in half (papers III and IV) or, alternatively, repeat the experiment with some of the users (papers II and IV). Expert evaluation rating vs. subject rating can be used to investigate reliability, as in Table 7.
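The reliability checks mentioned in the last step can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis used in the papers: it assumes that "split in half" means dividing the subjects into two halves and correlating their mean ratings per product, it uses a plain Pearson correlation, and the function names (`pearson`, `split_half_correlation`) and all ratings are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return sxy / (sxx * syy) ** 0.5


def split_half_correlation(ratings):
    """Correlate the mean product ratings of two halves of the subjects.

    ratings holds one list per subject with one rating per product.
    Subjects are assigned alternately to the two halves (an assumption).
    """
    half_a = ratings[0::2]
    half_b = ratings[1::2]
    mean_a = [mean(column) for column in zip(*half_a)]
    mean_b = [mean(column) for column in zip(*half_b)]
    return pearson(mean_a, mean_b)


# Hypothetical ratings: four subjects, three products.
ratings = [
    [5, 3, 1],
    [4, 3, 2],
    [5, 4, 1],
    [4, 2, 2],
]
r_split = split_half_correlation(ratings)

# Expert rating vs. mean subject rating per product (hypothetical values).
expert = [5, 3, 2]
subject_means = [mean(column) for column in zip(*ratings)]
r_expert = pearson(expert, subject_means)
```

A correlation close to 1 between the two halves, or between the expert and the subjects, suggests that the ratings are consistent. With so few products the coefficient is only indicative, and a real study would use the statistical methods referred to in the analysis step.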