2.7. Methods to help evaluation and decision-making

Meister and Rabideau (1965) proposed a classification of data gathering methods into three types (presented by Sinclair 1995): (1) observational, (2) database and (3) subjective methods (Table 1). What observational methods have in common is that some degree of formal, objective measurement is involved; the idea is to reach out to reality as directly as possible. Database methods constitute the historical domain, in which use is made of records of one sort or another, or of the collected, conflated and disseminated wisdom of others available in books, journals, reports and so forth. The main problem is that the available information has seldom been collected for the kinds of purpose that ergonomists have in mind, and important types of data either have not been collected or have been lost during the process of data collection. The third class, subjective methods, contains methods that draw their data from the psychological contents of people’s minds.

Table 1. Classification of data gathering methods, modified from Meister and Rabideau (1965, presented by Sinclair 1995).

Data gathering methods

Observational methods
· experimental methods
· laboratory experiments
· simulations and games
· field experiments
· observational studies

Database methods
· consulting books, journals, etc.
· studying system records

Subjective methods
· obtaining advice from experts
· questionnaires and interviews
· rankings and ratings
· critical incident techniques, etc.

The above classification shares a number of features with Fig. 6 and chapter 2.5. Observational data gathering includes various objective assessments, i.e. measurements and observational categorisations. The crucial step in all design cycles is decision-making, e.g. “test alternatives against requirements” in Fig. 3 and “evaluate designs against requirements” in Fig. 4. As already shown in chapter 1.2, where general evaluation was described, the evaluation processes applicable to design and R&D processes are one of the core issues of this thesis. Evaluation in practical design procedures is meant to determine the “value”, “usefulness” or “strength” of a solution with respect to a given objective (Pahl & Beitz 1988). Evaluation involves a comparison of concept variants or, in the case of a comparison with an imaginary ideal solution, a “rating” or degree of approximation to an ideal (Pahl & Beitz 1988). Roozenburg and Eekels (1995) give some typical characteristics of design decisions:

Decision methods aim at helping people to make decisions. The determined values and the effectiveness of an alternative with regard to each criterion have to be aggregated into a score for the overall value of the alternative. There are ordinal and cardinal methods for this, which are also called qualitative and quantitative methods. In ordinal methods, the decision-maker ranks the alternatives on an ordinal scale according to how well they satisfy each criterion, and ranks the criteria in the order of their importance. In cardinal decision methods, the decision-maker must quantify his/her judgements on the effectiveness of the alternatives and the importance of the criteria on an interval scale. The scale factors are usually considered a measure of the importance of the criteria; they are therefore usually called weighting factors. (Roozenburg & Eekels 1995)
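The aggregation step of a cardinal method can be illustrated with a minimal sketch in Python. It assumes that the overall value is a weighted sum of the per-criterion scores, which is one common way of aggregating and not necessarily the only one intended by Roozenburg and Eekels (1995). The criteria, weighting factors and scores are hypothetical values chosen for illustration.

```python
def overall_value(scores, weights):
    """Weighted sum of per-criterion scores; weights are normalised to sum to 1."""
    total_weight = sum(weights.values())
    return sum(weights[c] / total_weight * scores[c] for c in weights)


# Hypothetical weighting factors (importance of each criterion).
weights = {"comfort": 3, "safety": 5, "price": 2}

# Hypothetical interval-scale scores of two alternatives on each criterion.
alternatives = {
    "A": {"comfort": 7, "safety": 6, "price": 9},
    "B": {"comfort": 8, "safety": 8, "price": 4},
}

overall = {name: overall_value(s, weights) for name, s in alternatives.items()}
best = max(overall, key=overall.get)
print(overall, best)
```

With these figures, alternative B obtains the higher overall value even though alternative A scores better on price, because safety carries the largest weighting factor.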

The following chapters will briefly describe the methods applied in the context of this study to facilitate decision-making by designers. More extensive descriptions of the different methods are available in the original papers. In the discussion, some other methods will also be mentioned, including methods that were not used in the original papers.

2.7.1. Unit measurement by an instrument (I, II, III, IV)

It is important to design to fit the physical dimensions of people when designing such objects as seats and seated workstations. Anthropometry deals with the measurement of the dimensions and certain other physical characteristics of the body, such as volumes, centers of gravity, inertial properties and masses of body segments. When anthropometric data are used for designing something, the data should be reasonably representative of the population that will use the item. When items are designed for specific groups, such as the elderly, the data used should be specific to such groups in the country or segment in question. Apart from anthropometry, biomechanics, physiology and the senses also set measurable requirements which have to be met.

An essential precondition for managing well at work or at home is that the environment and the technological devices fit one’s anthropometry. The importance of work surface heights has often been emphasised (Sanders & McCormick 1993), and work surface height was selected as an ergonomic feature for particular analysis in some papers of this study.

The height of a chair can also be regarded as a task-surface height. Since industry generally aims at volume production, chairs are often designed based on average anthropometric data derived from the general population (Weiner et al. 1993). However, the elderly differ from the younger population in their anthropometric dimensions. Unit measurements were therefore needed to characterise individual users and populations in many ways.

As far as products are concerned, many features other than height can be measured with standardised unit scales, often SI units, including the product’s mass and dimensions, the characters on the screen of a product, and the force needed to switch on a product by pushing a button. Tasks can be measured with objective units, such as the time needed or the number of errors made. Environments can also be measured, for example by using lux or centigrade scales.

The results of unit measurements are universally comparable. They are therefore a good basis for developing usability engineering towards more general applicability, and not only for benchmarking against one’s own old product models or competitors’ comparable models.

2.7.2. Rating (II, VII)

Rating scales are used to assess the attitudes of the users to specified product attributes (McClelland 1995). The method is commonly used to assess comfort, degrees of convenience, ease of use, perceived degrees of difficulty, and so on. In a rating method, the subjects rate each object on a scale of 1-n in view of some characteristic (Cushman & Rosenberg 1991). Rating is a simple method and especially good when the number of objects or subjects is great. When fonts are assessed, for example, rating provides information on the perceived ease of reading of each font, but is less sensitive to the differences between the fonts (Sinclair 1995). Rating methods yield quantitative scores on interval scales.
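As a minimal sketch of how ratings on a scale from 1 to n might be summarised, the following Python fragment averages the ratings given by several subjects to each object. The scale length, the objects and the ratings are hypothetical, and taking the mean assumes that the ratings are treated as interval-scale data, as stated above.

```python
from statistics import mean

SCALE_MAX = 5  # hypothetical n: subjects rate on a scale from 1 to 5

# Hypothetical ratings of perceived ease of reading, one value per subject.
ratings = {
    "font A": [4, 5, 3, 4],
    "font B": [2, 3, 3, 2],
}


def mean_ratings(ratings, scale_max):
    """Return the mean rating per object after checking the scale limits."""
    for obj, values in ratings.items():
        if any(not 1 <= v <= scale_max for v in values):
            raise ValueError(f"rating outside 1-{scale_max} for {obj}")
    return {obj: mean(values) for obj, values in ratings.items()}


summary = mean_ratings(ratings, SCALE_MAX)
print(summary)
```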

2.7.3. Ranking (VI)

In a ranking method, subjects rank objects in an order from the best to the worst according to some attribute (Cushman & Rosenberg 1991). Ranking scales are used to indicate the relative order of a set of conditions or products according to a specified attribute (McClelland 1995). There are different techniques for ranking alternatives and criteria. The best known is paired comparison (cf. 2.7.4). This technique can help an individual decision-maker to reach a consistent ranking which agrees with his/her preferences, even when the criteria are not strictly defined. Another benefit of paired comparisons is that different persons can be involved in the evaluation process and that the extent of agreement with regard to the ranking can be determined by calculating rank correlations (Roozenburg & Eekels 1995).

In ranking, the entities should be presented in a random order for each subject involved (Sinclair 1995). It is commonly accepted that ranking methods only record real preferences for the first two or three ranks, and the last two or three ranks. Nine is usually taken to be the rough upper limit of the number of objects to be ranked.
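The randomised presentation order can be produced in many ways; the following is a minimal sketch in Python. The function name `presentation_orders`, the object names and the subject labels are hypothetical, and the optional seed is only there to make a session reproducible.

```python
import random


def presentation_orders(objects, subjects, seed=None):
    """Return an independently shuffled presentation order for each subject."""
    rng = random.Random(seed)
    return {subject: rng.sample(objects, k=len(objects)) for subject in subjects}


# Hypothetical objects to be ranked and hypothetical subject labels.
objects = ["chair A", "chair B", "chair C", "chair D"]
orders = presentation_orders(objects, ["S1", "S2", "S3"], seed=1)
for subject, order in orders.items():
    print(subject, order)
```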

Ranking is an ordinal decision method. It is a simple and reliable way to assess the effectiveness of alternatives and the importance of criteria. It does not set any high demands on the judgmental capacity of the decision-maker and is also properly implementable in situations in which only qualitative information on the properties of the alternatives is available. (Roozenburg & Eekels 1995)
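The extent of agreement between two evaluators can be expressed as a rank correlation. As a minimal sketch, the following Python fragment computes Spearman's rank correlation coefficient for two complete rankings without ties. The choice of Spearman's coefficient is an assumption, since the source does not name a particular coefficient, and the rankings are hypothetical.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two complete rankings without ties."""
    n = len(rank_a)
    d_squared = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks (1 = best) given by two evaluators to five concepts.
evaluator_1 = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
evaluator_2 = {"A": 2, "B": 1, "C": 3, "D": 5, "E": 4}

rho = spearman_rho(evaluator_1, evaluator_2)
print(rho)
```

A value of 1 indicates identical rankings and a value of -1 indicates exactly reversed rankings.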

2.7.4. Paired comparison (IV, V)

In paired comparisons, stimuli, i.e. concepts or products, are presented in pairs, and the subject is asked to choose the stimulus with the greater amount of some attribute (Cushman & Rosenberg 1991, Siegel & Castellan 1988). The results can be combined into preference matrixes, which are called dominance matrixes by Pahl and Beitz (1988). The paired comparison method used in this study was Mitchell’s (1992) method. In this method, the product can be described as a combination of different criteria (i.e. product requirements). In the first phase, the subject compares each criterion with every other criterion by pairs. The criterion that is more important gets one vote. The votes given by the subjects can be used to calculate the total criterion vote and the weighted criterion values expressed as percentages. In the second phase, there are different alternatives or models of the target product, and each subject compares each product against every other product by every criterion. After this, the final score for each product can be calculated and the most preferred product obtained.
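The two phases described above can be sketched in Python as follows. This is an illustration under stated assumptions and not a reproduction of Mitchell’s (1992) procedure: the final score is assumed to be the sum of a product’s votes on each criterion multiplied by the percentage weight of that criterion, and the criteria, products and choices of a single subject are hypothetical.

```python
from itertools import combinations


def vote_counts(items, winners):
    """Count one vote per pair; winners maps each pair (a, b) to the preferred item."""
    votes = {item: 0 for item in items}
    for pair in combinations(items, 2):
        votes[winners[pair]] += 1
    return votes


# Phase 1: each criterion is compared with every other criterion.
criteria = ["height", "stability", "appearance"]
criterion_winners = {
    ("height", "stability"): "height",
    ("height", "appearance"): "height",
    ("stability", "appearance"): "stability",
}
criterion_votes = vote_counts(criteria, criterion_winners)
total_votes = sum(criterion_votes.values())
weights = {c: 100 * v / total_votes for c, v in criterion_votes.items()}

# Phase 2: each product is compared with every other product on every criterion.
products = ["P1", "P2", "P3"]
product_winners = {
    "height": {("P1", "P2"): "P1", ("P1", "P3"): "P1", ("P2", "P3"): "P3"},
    "stability": {("P1", "P2"): "P2", ("P1", "P3"): "P3", ("P2", "P3"): "P2"},
    "appearance": {("P1", "P2"): "P2", ("P1", "P3"): "P3", ("P2", "P3"): "P3"},
}
product_votes = {c: vote_counts(products, product_winners[c]) for c in criteria}

# Assumed aggregation: votes on each criterion weighted by the criterion weight.
scores = {p: sum(weights[c] * product_votes[c][p] for c in criteria) for p in products}
best = max(scores, key=scores.get)
print(weights, scores, best)
```

Note that a criterion that never wins a comparison receives a weight of zero in this scheme, so it does not influence the final score at all.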

The use of paired comparisons is appropriate to an applied engineering environment only when all the concepts to be compared meet the specifications and requirements, when significant constraints exist with regard to the characteristics of prospective participants, and when a decision must be made about competing interests. It would be less appropriate, or even inappropriate, in a research and development environment where it is possible to use more sensitive measures and to gather interval data. (Mitchell 1992)

2.7.5. Conjoint analysis (III, IV)

Conjoint analysis is primarily a market research method. Consumers are shown a set of hypothetical products, which differ from each other in certain attributes, and are then asked to rank or rate them. From the rankings or ratings, the importance of each attribute and the most effective combination can be determined (Kotler 1997). In conjoint analysis, whole product profiles are presented to the customers instead of a list of independent questions or choices. In this way, the purchasing situation can be realistically simulated, unlike with traditional market research methods (Gustafsson 1993). The conjoint analysis methodology is useful in the development of new products or concepts, pricing, market segmentation, advertising, competitive analysis and distribution (Gustafsson 1993). Conjoint analysis in ergonomics, as in papers III and IV, has also been used by Miller et al. (1996).
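How attribute importance can be derived from ratings is sketched below in Python. The sketch assumes a balanced design, in which the part-worth of a level can be estimated as the mean rating of the profiles containing that level minus the grand mean, and it takes the importance of an attribute as its share of the total part-worth range. Commercial conjoint software typically uses regression-based estimation instead. The attributes, levels and ratings are hypothetical.

```python
from collections import defaultdict

# Hypothetical full-profile ratings by one respondent (higher = more preferred).
profiles = [
    ({"seat_height": "low", "armrests": "no"}, 3),
    ({"seat_height": "low", "armrests": "yes"}, 5),
    ({"seat_height": "high", "armrests": "no"}, 6),
    ({"seat_height": "high", "armrests": "yes"}, 8),
]


def part_worths(profiles):
    """Level mean minus grand mean, valid for balanced designs only."""
    grand_mean = sum(rating for _, rating in profiles) / len(profiles)
    by_level = defaultdict(lambda: defaultdict(list))
    for attributes, rating in profiles:
        for attribute, level in attributes.items():
            by_level[attribute][level].append(rating)
    return {
        attribute: {lvl: sum(r) / len(r) - grand_mean for lvl, r in levels.items()}
        for attribute, levels in by_level.items()
    }


worths = part_worths(profiles)
ranges = {a: max(w.values()) - min(w.values()) for a, w in worths.items()}
importance = {a: 100 * r / sum(ranges.values()) for a, r in ranges.items()}
best_profile = {a: max(w, key=w.get) for a, w in worths.items()}
print(worths, importance, best_profile)
```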

Conjoint analysis is also used in product development as a concept screening method. By concept screening, one tries to identify the best choice out of a number of given concepts (Roozenburg & Eekels 1995). The goal of the concept test is not only to find out the extent to which consumers appreciate the concepts, but also which potential combination of attributes is preferred, so that the further development of the concept can be aimed in that direction. The concepts screened can be different as far as ergonomics is concerned. Paper V screened concepts, though not by conjoint analysis.

There are two basic methods of conjoint analysis: the two-factors-at-a-time procedure, also referred to as the “trade-off” procedure, and the multiple-factor procedure, i.e. the “full-profile” approach (Green & Srinivasan 1978). The method used in the papers III and IV was a full-profile approach, which involved presenting the respondent with a set of product descriptions such that each description contained information at the level of each attribute. An orthogonal array (an experimental design in which the combinations to be tested are selected in such a way that the independent contributions of all the factors are balanced) can be used to simplify the situation. The respondents either rank the resulting product descriptions or rate them on a relevant dimension, such as preference or purchasing likelihood. The presentation format of the complete descriptions can range from cards to drawings and actual products, pictures, etc. (Tull & Hawkins 1993)
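The balance property of an orthogonal array can be checked mechanically. The following Python sketch uses the standard L4 array for three two-level factors, which reduces the eight profiles of the full factorial design to four, and verifies that every pair of columns contains each combination of levels equally often. The attribute names and levels are hypothetical and are not those used in papers III and IV.

```python
from collections import Counter
from itertools import combinations, product

# Standard L4 orthogonal array: 4 runs, 3 factors, 2 levels each.
L4 = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]


def is_orthogonal(array, levels=(0, 1)):
    """True if every pair of columns holds each level combination equally often."""
    expected = len(array) / len(levels) ** 2
    for i, j in combinations(range(len(array[0])), 2):
        counts = Counter((row[i], row[j]) for row in array)
        if any(counts[combo] != expected for combo in product(levels, repeat=2)):
            return False
    return True


# Hypothetical attributes and levels for the product descriptions.
attributes = {
    "seat_height": ["low", "high"],
    "armrests": ["no", "yes"],
    "upholstery": ["fabric", "leather"],
}
names = list(attributes)
descriptions = [
    {name: attributes[name][row[k]] for k, name in enumerate(names)} for row in L4
]

print(is_orthogonal(L4))
for description in descriptions:
    print(description)
```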

2.7.6. Use-value analysis (V)

In decision analysis, the problem lies in the weighting of the good and less good properties (Roozenburg & Eekels 1995). Designers and marketing experts are said to have a tendency to implement all possible features if - as often nowadays - no extra manufacturing cost is involved, only programming being required. Consumers also often want to get all the new functions they can, because the number of features is often regarded as an indicator of product quality (Norman 1988). However, it has also been pointed out that customers use only about 10% of the functionality of today’s high-end televisions on a regular basis (de Vries et al. 1994). That is one reason why the weighting of product features is important. When the features are ranked based on the weightings by users, the most preferred features can be chosen for development and included in the manufactured products.

In the use-value analysis method (also called the German method in paper V), an expert creates and weights the product characteristics, after which different products can be rated to give an overall value to the product (Pahl & Beitz 1988). This method is based on the systems approach and the combined technical and economic evaluation technique specified in Guideline VDI 2225 (Pahl & Beitz 1988).

The first step, as in any evaluation, consists of drawing up a set of objectives from which evaluation criteria can be derived. The objectives should be as independent of one another as possible. Evaluation criteria can be derived directly from the objectives, i.e. the requirement specifications. All criteria must be given a positive formulation. Use-value analysis systematises the creation of the set of criteria by means of an objective tree, in which the individual objectives are arranged in a hierarchic order (paper V, Fig. 2). This hierarchic order simplifies the assessment of the relative importance of the sub-objectives.

The second step is to assess the relative contribution (weighting) of the evaluation criteria to the overall value of the solution. The next step is the assessment of values and hence the actual evaluation. The values are expressed as points, which are determined by utilising scales, whether based on unit measurements or on subjective ratings (cf. 4.3). Use-value analysis employs a range from 0 to 10, whereas Guideline VDI 2225 involves a range from 0 to 4. When the sub-values for every variant have been determined, the overall value of each variant can be calculated. The concept or product variants are compared either by determining which has the maximum overall value, or by relating the overall value of each variant to an imaginary ideal value, which results from the maximum possible value.
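The calculation of the overall value can be sketched in Python as follows. The sketch assumes that the overall value is the weighted sum of the sub-values and that the rating relative to the ideal is the overall value divided by the maximum possible value. This follows the general form of weighted evaluation and should be checked against Pahl and Beitz (1988) before use. The criteria, weightings and point values are hypothetical.

```python
def use_value(weights, values, v_max=10):
    """Return (overall weighted value, rating relative to the ideal solution)."""
    total_weight = sum(weights.values())
    overall = sum(weights[c] * values[c] for c in weights) / total_weight
    return overall, overall / v_max


# Hypothetical weightings of the evaluation criteria (summing to 1).
weights = {"ease of use": 0.5, "safety": 0.3, "appearance": 0.2}

# Hypothetical point values on the 0-10 scale of use-value analysis.
variants = {
    "V1": {"ease of use": 8, "safety": 6, "appearance": 4},
    "V2": {"ease of use": 6, "safety": 9, "appearance": 7},
}

results = {name: use_value(weights, values) for name, values in variants.items()}
best = max(results, key=lambda name: results[name][0])
print(results, best)
```

Passing `v_max=4` instead would correspond to the range used in Guideline VDI 2225.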