Constantine Stephanidis and Michael Sfyrakis
Introduction
A man-machine interface can be defined as the mediator between users and machines. It is a system that takes care of the entire communication process, is responsible for the provision of the machine "knowledge", functionality and available information, in a way that is compatible with the end-user's communication channels, and translates the user's actions (user input) into a form (instructions/commands) understandable by a machine.
As increasingly complex systems, products and services appear in the market, the necessity for more user friendly man-machine interfaces is becoming progressively more crucial for their utilization, and consequently for their market success. Graphical user interfaces, audio based interaction, speech synthesis and understanding, natural languages, direct manipulation, and multimodal interaction dialogues, as well as ergonomics and human factors evaluation have all contributed to the evolution of more powerful, complex and demanding user interfaces. Recently introduced concepts are expected to have a profound impact on future developments in man-machine interface technology. These include user-tailored environments (interfaces compatible with the user's available communication channels); user-support environments (capable of assisting the user during the interaction process); 3D direct manipulation; multimodal-multimedia interfaces (providing concurrent interaction by means of different media); alternative interaction techniques and input/output devices; virtual environments (by means of the realization of interaction environments far beyond the physical capabilities of the user and the terminal platform); and co-operative and collaborative environments (that provide concurrent access to the same environment by several, locally-based or remotely located users, allowing co-operation and collaboration between them).
Potential users of computer systems and telecommunications products will be required to use the emerging technology, functioning and communicating in a multimodal, multiprocess, and co-operative environment. Disabled and elderly users may have problems in accessing interface entities that use modalities inaccessible to the particular user, or in accessing interfaces that demand high cognitive and interaction abilities. On the other hand, the availability of interface entities in redundant forms (messages, selection menus, workspaces, etc.), the availability of a variety of interaction techniques that utilize alternative modalities (e.g. selection, position, quantity), the selection of adequate interaction metaphors (e.g. the desktop metaphor that is utilized in most of the currently available graphical user interfaces), and the introduction of intelligence in the man-machine systems, can contribute to the provision of interfaces tailored to the user's abilities and preferences, if the requirements of disabled and elderly users are considered at an early stage of the design.
This section addresses some of the recent technological advances that influence the design and development of the man-machine interface, while at the same time identifying the possible impact on disabled and elderly people.
Graphical User Interfaces: The Next Generation
The relatively brief history of graphical user interfaces has already been marked by an increasing tendency toward "realism" in the representation of interface entities (Staples, 1993). Multiple media will be used to represent realistic metaphors in the machine environment, virtual worlds will get closer to reality, and new alternative input devices and interaction techniques will be provided to the user to accomplish the various tasks in an environment that supports multiple processes. The next generation of user interfaces will involve gesture and character recognition, speech and natural language; they will be "dynamic", "spatial", 3-dimensional, "virtual", colourful, and "intelligent" (Marcus, 1993; Nielsen, 1990; Frishberg, 1993).
The interaction metaphors will be closer to reality, providing means of interaction similar to the user's everyday tasks. Virtual reality will significantly influence the realization of the selected metaphors, while multimodal interaction will be used to fulfil the diversity of requirements regarding the type and form of interaction due to the different culture of the users and the simultaneous realization of different interaction tasks. Regarding the input and output technology, visual presentation will remain the main output channel, providing 3D output enhanced by auditory presentation in forms of non-speech audio or speech, while on the input side text input and pointing devices will be significantly augmented by speech, gesture and kinaesthetic input (data glove). The advances in video storing and manipulation techniques will enable the utilization of the video channel in the graphical user interface, while the evolved hypertext/hypermedia architectures will allow non-sequential provision of interface functions and information, thus further enhancing the usability of the graphical user interfaces.
3D Direct Manipulation
Direct manipulation has contributed to the usability of modern user interfaces by allowing the user to perform actions directly on visible objects. Metaphors from the real world have been used to model the application or the system behaviour in the two dimensional space of the screen, using "virtual" objects (icons, forms, messages, etc.) taken from the objects in the manual world. The manipulation of the virtual objects is analogous to the manipulation of the real-world objects.
Three dimensional interfaces are increasingly available in the market, providing a realistic view of the functionality of the applications and systems, utilizing 3D objects. Unfortunately, their usefulness is restricted by the lack of input devices that operate in three dimensions. Newly introduced input devices and interaction techniques (Venolia, 1993), such as the 3D mouse, the data glove, and a 3D cursor controlled by an augmented mouse, allow more casual, direct and natural manipulation of 3D objects than the devices and techniques presently available.
Alternative Input - Output Devices
Following the evolution of more realistic user interfaces, new input and output devices have been introduced to allow users to interact with the resulting new environments. Larger high-definition screens for graphical presentation, animation and the provision of high-quality video, personal displays that are placed in front of the eyes and provide on-line support information (Very Small Aperture Terminals), and head-mounted displays for the presentation of 3D images used in virtual reality systems are some of the display screens that may be used routinely in the future. Moreover, output devices for the provision of information in alternative modalities, such as speech, sounds, Braille, signs and gestures are becoming increasingly available in the market in order to cover the needs for multimedia information provision. Devices and systems that provide audio information, either as speech (e.g. speech synthesis), or sounds and music tones (one- or two-dimensional non-speech audio), tactile output in terms of Braille code, or sign language output are examples of the emerging output technology.
On the other hand, adapted and alternative input devices and techniques allow the utilization of the user's speech, kinaesthetic, and motor communication channels, providing new ways of communication, and facilitating access to the evolving multimedia systems. Adaptations for disabled people of currently available input devices include, for example, half-QWERTY (Matias, 1993) and chord keyboards, which support one-hand typing and faster communication, and various versions of pointing devices such as mouse and track-ball facilitating function in a 3D environment. New input techniques include gesture-based input (Zhao, 1993), handwriting (pen-based input), sign language (Myers, 1993) and voice input that allow hands-free operation, as well as special input techniques, such as eye-gaze, and a variety of switches, for supporting users with limited motor control in accessing graphical or virtual interfaces.
Multi-modal Interaction
Multimodal interaction enables the user to employ different modalities such as voice, gesture and typing for communication with computer or telecommunications terminals. Concurrency of processing enables the parallel utilization of the various communication channels, while data fusion allows the combination of different types of data to facilitate an input or output task (Nigay, 1993; Gaver, 1993).
In conventional face-to-face communication, many channels are used and different modalities are activated. Conversation may be supported by multiple co-ordinated activities of various cognitive levels. As a result, communication may become highly flexible and robust, so that failure of one channel may be recovered by another channel, and a message in one channel can be carried by the other channel (Takeuchi, 1993). Similarly, in multimedia communication, concurrent processing and recovery in failures could be provided to the user, introducing additional flexibility in the interaction process. Moreover, possible perception problems of the user regarding a specific medium could be overcome through the utilization of redundant presentation forms in alternative modalities that are compatible with the user's available communication channels (RACE II, 1992). Speech (recorded or synthesized), auditory icons, signs, gestures, video objects, and pen-based input are examples of interface entities that utilize alternative input/output channels.
An example of a multimedia interface is provided by VoicePaint (Nigay, 1993), a graphics editor implemented on the Macintosh using Voice Navigator, a word-based speech recognizer board. While drawing a picture with a mouse, the user can talk and ask the system to change the attributes of the graphics context (e.g. the pattern). The assistance provided to the user by the availability of additional media (in this case speech) contributes to the parallel execution of different tasks and to the recovery from possible failures during the drawing process.
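The combination of concurrent input channels through data fusion can be illustrated with a minimal sketch. It assumes a hypothetical event model in which every input event carries a modality label, a payload and a timestamp, and it pairs each spoken command with the pointer event closest in time. The names `InputEvent` and `fuse`, the modality labels and the one-second window are illustrative assumptions; they are not taken from VoicePaint or from the cited design space.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class InputEvent:
    modality: str     # "speech" or "mouse" (hypothetical labels)
    payload: str      # recognized words, or an identifier of the pointed object
    timestamp: float  # seconds since the start of the session


def fuse(events: List[InputEvent], window: float = 1.0) -> List[Tuple[str, Optional[str]]]:
    """Pair each speech command with the closest mouse event within `window` seconds.

    Returns (command, target) tuples. The target is None when no pointer
    event is close enough in time, so the command stays unresolved.
    """
    pointer = [e for e in events if e.modality == "mouse"]
    fused = []
    for speech in (e for e in events if e.modality == "speech"):
        candidates = [p for p in pointer
                      if abs(p.timestamp - speech.timestamp) <= window]
        target = min(candidates,
                     key=lambda p: abs(p.timestamp - speech.timestamp),
                     default=None)
        fused.append((speech.payload, target.payload if target else None))
    return fused


events = [
    InputEvent("mouse", "rectangle-1", 2.0),
    InputEvent("speech", "pattern striped", 2.4),
    InputEvent("speech", "undo", 9.0),
]
result = fuse(events)
```

In this sketch the spoken "pattern striped" is attached to the object under the mouse, while "undo" arrives with no nearby pointer event and is left without a target. A real system would also have to decide what to do with such unresolved commands.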
Audio Based Interaction
The auditory system has a number of characteristics that make it amenable to human-computer interaction, allowing a more physical way of communication, but is under-utilized in most current interfaces (Brewster, 1993). Sound can be heard from any direction without the need to concentrate on an output device; auditory messages can be provided in parallel with visual output, and speech input can be provided directly by the user's voice, thus providing greater flexibility and hands-free operation.
Similar to the visual metaphors used in graphical user interfaces, that are based on interface entities such as icons, windows, menus, etc., in audio-based interaction the concept of earcons and auditory icons has been introduced to represent interface entities and events occurring during the interaction process.
Earcons are defined in (Brewster, 1993) as abstract synthetic tones that can be used in structured combinations to create sound messages, either to represent parts of an interface, or as non-verbal audio messages that are used in the computer-user interface to provide information to the user about some computer object, operation or interaction. Earcons are composed of motives, which are short, rhythmic sequences of pitches with variable intensity, timbre and register. They can be combined to produce complex audio messages; for example, simple earcons such as "open", "close", and "file" can be combined to produce earcons for "open file" or "close file".
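The compositional structure of earcons can be sketched as a small data structure. The encoding below (MIDI-style pitch numbers, durations in beats, a timbre name per motive) and the specific notes are assumptions made for illustration only; the cited work does not prescribe them.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Note = Tuple[int, float]  # (pitch number, duration in beats): an assumed encoding


@dataclass(frozen=True)
class Motive:
    name: str
    notes: Tuple[Note, ...]
    timbre: str = "piano"


def combine(*motives: Motive) -> List[Tuple[int, float, str]]:
    """Concatenate motives into one compound earcon of (pitch, duration, timbre) triples."""
    return [(pitch, duration, m.timbre) for m in motives for pitch, duration in m.notes]


# Hypothetical motives: a rising figure for "open", a falling one for "close".
MOTIVES: Dict[str, Motive] = {
    "open": Motive("open", ((60, 0.5), (64, 0.5), (67, 1.0))),
    "close": Motive("close", ((67, 0.5), (64, 0.5), (60, 1.0))),
    "file": Motive("file", ((72, 0.25), (72, 0.25)), timbre="marimba"),
}

open_file = combine(MOTIVES["open"], MOTIVES["file"])
close_file = combine(MOTIVES["close"], MOTIVES["file"])
```

Because "open file" and "close file" share the "file" motive, their endings are identical, which is what lets a listener recognize the object while the opening figure identifies the action.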
On the other hand, auditory icons (Gaver, 1993) are well-suited for providing information about previous and possible interactions, indicating ongoing processes and modes, and are useful both for navigation and supporting collaboration. In common with visual icons, they not only reflect categories of events and objects, but are parameterized to reflect their relevant dimensions as well. That is, if a file is large, it sounds large; if it is dragged over a new surface, that new surface is heard; and if an ongoing process starts running more quickly, it sounds quicker. Auditory icons add valuable functionality to computer interfaces, particularly when they are parameterized to convey dimensional information. They convey information about events in computer systems, allowing users to listen to computers as they do to the everyday world.
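The idea of a parameterized auditory icon ("if a file is large, it sounds large") can be sketched as a mapping from an object attribute to a sound parameter. The function below is a hypothetical illustration, not the synthesis method of the cited work: it assumes that larger files should sound lower, and the size range and frequency range are arbitrary defaults.

```python
import math


def size_to_frequency(size_bytes: int,
                      min_size: int = 1_000,
                      max_size: int = 1_000_000_000,
                      high_hz: float = 880.0,
                      low_hz: float = 110.0) -> float:
    """Map a file size to a pitch so that larger files get lower frequencies.

    The size is clamped to [min_size, max_size] and interpolated on a
    logarithmic scale between high_hz (smallest) and low_hz (largest).
    """
    clamped = min(max(size_bytes, min_size), max_size)
    position = (math.log10(clamped) - math.log10(min_size)) / (
        math.log10(max_size) - math.log10(min_size))
    # Interpolate in log-frequency so equal size ratios give equal pitch intervals.
    return high_hz * (low_hz / high_hz) ** position


small_file_pitch = size_to_frequency(2_000)
large_file_pitch = size_to_frequency(500_000_000)
```

The same pattern could drive other parameters mentioned in the text, for example mapping the speed of an ongoing process to the tempo of its sound.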
In addition to the definition of audio-based metaphors, the input and output part of the interaction should be based on audio-based systems and devices. Natural language production and understanding can be considered the most adequate way of communication, but the available speech technology (speech synthesis, speech recognition and speech interpretation systems) is limited to small-vocabulary, low-quality speech systems that operate in specific domains and require user training (see chapter 4-7).
An example of an application that provides an audio-only interface is VoiceNotes (Stifelman, 1993), a voice-controlled hand-held computer without a visual display, that allows the creation, management, and retrieval of user-authored voice notes (small segments of digitized speech containing thoughts, ideas, reminders or things to do).
User Interfaces for International Use
As computer and telecommunications terminal markets become more international, a new requirement is emerging for user interfaces that are able to cover the linguistic and cultural characteristics of all potential users. Today, an increasing number of computer and telecommunications products are being developed for international use (Russo, 1993). Several problems need to be overcome in order to convey the meaning of the various interface entities in another language or to a particular target population. The conventions of the written language, such as horizontal or vertical writing and numeric, date, and currency formats, the use of colours, and the perception of icons and symbols are problems that are commonly faced in designing interfaces for international use. For example, red usually represents danger in western cultures, while the Chinese use red to represent happiness (Russo, 1993). Regarding icons and symbols, the icon of a mailbox or letterbox used to represent electronic mail can vary greatly from country to country or among the target user groups (Nielsen, 1990).
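The numeric and date conventions mentioned above can be kept out of the interface code by storing them in a per-locale table. The following is a minimal sketch with a hand-written, deliberately tiny table; the names `Conventions`, `CONVENTIONS`, `format_number` and `format_date` are hypothetical, and a production system would rely on a full locale database rather than this illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Conventions:
    date_format: str    # strftime pattern
    decimal_sep: str
    thousands_sep: str


# Illustrative entries only; a real table would cover many more locales.
CONVENTIONS = {
    "en_US": Conventions("%m/%d/%Y", ".", ","),
    "de_DE": Conventions("%d.%m.%Y", ",", "."),
}


def format_number(value: float, locale_id: str) -> str:
    conv = CONVENTIONS[locale_id]
    text = f"{value:,.2f}"  # comma-grouped, dot as decimal point
    # Swap separators through a placeholder so the replacements do not collide.
    return (text.replace(",", "\0")
                .replace(".", conv.decimal_sep)
                .replace("\0", conv.thousands_sep))


def format_date(value: date, locale_id: str) -> str:
    return value.strftime(CONVENTIONS[locale_id].date_format)


german_amount = format_number(1234.5, "de_DE")
american_date = format_date(date(1993, 4, 24), "en_US")
```

Colours and icons could be handled in the same way, by looking up a locale-specific resource instead of hard-coding one choice in the dialogue.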
Knowledge-based, Intelligent User Interfaces
In order to provide a more human-like user interface, much current research deals with equipping the user interface with some intelligence. An intelligent user interface should communicate with the user in a more natural way, be tolerant of input mistakes, and understand the user's goals rather than just executing the user's commands (RACE I, 1989). Moreover, an intelligent interface should complement human capabilities by identifying and augmenting shortcomings in the ways that people already use to structure and carry out their work. It should also be able to adapt and learn (RACE I, 1989; Desmarais, 1993; Sullivan, 1991).
Natural language processing, voice-based interaction, multi-modal communication, help systems, real-world interaction metaphors, domain and user models, knowledge sources, and rule-based systems are issues examined in the development of intelligent user interfaces. The essence of such intelligent systems is embedded knowledge regarding the user, the particular domain, and the available resources, together with rule-based systems that evaluate the specific user and domain characteristics and select adequate interaction and presentation techniques for the provision of interfaces accessible to the particular user.
Cooperative and Collaborative Environments
The development of communication networks is changing the way in which people work and communicate, allowing cooperation among people who are physically in remote locations. Co-operative and collaborative environments will allow group work, joint effort in writing documents, group conferencing, and will provide means of interaction in multiple modalities with remotely-located users. These changes are reflected in the emerging user-computer interface technology which is used to support the evolving requirements for simultaneous interaction with more than one user (tele-conferencing), execution of multiple applications (multi-tasking), remote education, tele-working, etc. In order to support collaboration and remote computing, new metaphors and interaction techniques that reflect the needs of both communicating parties are being developed, and new products (user interface systems), which will fulfil the above requirements, are appearing in the market (Greenberg, 1991; Gaver, 1993).
User-tailored Environments
The introduction of intelligence in user interface systems enables an evolution towards user-tailored environments that can be adapted according to the particular user's abilities, requirements and preferences. Implicit knowledge regarding the user, the domain, and the available resources, and embedded intelligence in the user interface system will provide each machine or system with multiple interfaces accessible by the various user categories. The level of adaptivity that is provided by the system as well as the control of the adaptations (user or system initiated) are issues that differentiate the existing prototypes and systems.
An adaptive user interface relies, to a large extent, upon an adequate user model (Desmarais, 1993) (e.g. a representation of user characteristics, abilities, and expertise). How user characteristics relate to the user interface aspects is decided by a rule-based system (an "intelligent agent") that can use inference upon the user model and the available knowledge bases, and specify the resulting interaction environment (adapted to the user preferences and abilities). The knowledge sources that are usually needed concern knowledge of the user and the user's tasks (user model), knowledge of the tools, the domain, and the interaction modalities, and knowledge of how to interact (interaction techniques). In order to provide the resulting interface, input and output devices that utilize alternative modalities compatible with the available user communication channels are utilized.
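The relation between a user model and the resulting interaction environment can be sketched as an ordered list of rules evaluated against the model. Everything in the example is an assumption made for illustration: the channel names, the rule set, and the setting values are hypothetical and do not describe any particular prototype.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class UserModel:
    channels: Set[str]         # available channels, e.g. {"vision", "hearing", "speech", "hands"}
    expertise: str = "novice"  # "novice" or "expert"


Rule = Tuple[Callable[[UserModel], bool], str, str]  # (condition, setting, value)

# Rules are checked in order; the first matching rule for each setting wins.
RULES: List[Rule] = [
    (lambda u: "vision" not in u.channels and "hearing" in u.channels, "output", "speech+earcons"),
    (lambda u: "vision" not in u.channels, "output", "braille"),
    (lambda u: True, "output", "graphics"),
    (lambda u: "hands" not in u.channels and "speech" in u.channels, "input", "voice"),
    (lambda u: "hands" not in u.channels, "input", "switch-scanning"),
    (lambda u: True, "input", "keyboard+mouse"),
    (lambda u: u.expertise == "novice", "help", "step-by-step"),
    (lambda u: True, "help", "on-request"),
]


def adapt(user: UserModel) -> Dict[str, str]:
    """Apply the rules to a user model and return the selected interface settings."""
    interface: Dict[str, str] = {}
    for condition, setting, value in RULES:
        if setting not in interface and condition(user):
            interface[setting] = value
    return interface


blind_user = adapt(UserModel({"hearing", "speech", "hands"}))
motor_impaired_expert = adapt(UserModel({"vision", "hearing", "speech"}, "expert"))
```

Keeping the rules as data separate from the `adapt` function mirrors the separation between knowledge sources and inference described above: new user categories can be supported by adding rules without changing the inference step.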
User Support Environments
The complexity of the systems and applications with which the user has to communicate requires the introduction of systems capable of providing full on-line help and real-time assistance to the user on request, in order to accomplish specific tasks or to operate a specific machine. The diversity of the characteristics and the requirements of the various user categories adds further requirements for user assistance during the interaction process. The tasks and services that a user-support system provides are to carry out menial tasks, to automate routine tasks, to assist with more complex tasks, to enable easy access to tools, to supply status information, and to allow for on-line assistance and documentation (RACE I, 1989; Neches, 1993).
Discussion - Conclusions
Today, man-machine interface systems tend to mimic the real-world environment, driven by a combination of technology advances and application demands (Robertson, 1993). On the technology side, advances in interactive computer graphics hardware, processing power and low-cost mass storage devices, 3D displays and virtual reality, sound and video technology, hypertext architectures, and intelligent agents have created new possibilities for the development of intelligent interfaces that support multiple communication channels, and allow concurrent execution of different applications. On the application side, the increasing masses of information, the multitasking nature of the communication, and the diversity of users, have created demand for systems that are capable of controlling and adapting the man-machine dialogue, according to the particular user's abilities and preferences.
In the process of providing interfaces that emulate real-world environments considerable emphasis has been given to the visual representation of the interface entities. 3D graphics, photorealistic images, spatial representation by the utilization of perspective, light, shadow, transparency, and opacity, as well as additional interaction techniques based on kinaesthetic input are some of the tools and techniques used to reproduce 3D real-world metaphors in the machine environment. Audio technology has significantly contributed to the enhancement of the representation ability of the interfaces by introducing sounds, auditory icons, and natural language processing as new interface entities, in order to represent spatial and dimensional characteristics, or to provide a more natural way of communication.
The resulting user interfaces will provide a virtual space, which will enable the user to interact in a more natural way, providing input (acting) and perceiving feedback by utilizing all the available senses and communication channels. Vision, touch, motor movements, hearing, and speech will be used in parallel during the interaction process. Although the basis of the resulting interface will be the reproduction of real-world concepts and metaphors, functioning in a virtual space allows the introduction of new concepts (objects and actions that do not have an analogy to the real-world) in the interface model, enabling the construction of interfaces tailored to the possible user's perceptual abilities and characteristics.
The forthcoming evolution in man-machine interfaces will temporarily introduce additional problems for disabled and elderly people, who will be required to communicate in a more demanding environment that supports concurrent and multimodal interaction. Moreover, blind users will face substantial problems in accessing the evolving, highly visual, 3-dimensional graphical user interfaces; users with limited motor control will have problems in using alternative input techniques such as kinaesthetic input, in interacting with a virtual reality interface, and in functioning in a parallel interaction environment where simultaneous input and control is required; users with mental limitations will have problems in perceiving the interaction metaphors or in following the parallel execution of interaction tasks, due to the cognitive overload these may entail. Some of the factors that are anticipated to play a major role in the evolution of the man-machine interfaces and, at the same time, can provide accessibility solutions for disabled and elderly users are: the selection and realization of the interaction metaphors, the utilization of knowledge bases and artificial intelligence techniques, and the development and integration of alternative input and output devices and systems (Stephanidis, 1993).
Through the selection of adequate interaction metaphors, alternative ways of man-machine communication, compatible with the user's communication channels, can be provided, utilizing the inherent multimodality of the emerging man-machine interface systems and the availability of alternative input/output devices. For example, the realization of an interaction metaphor which is based on simple concepts, utilizing multimodal interface entities (e.g. music messages, video signs, graphics symbols, etc.) can assist users with limited mental abilities to interact. Moreover, the realization of selected interaction metaphors utilizing special input and output devices and interaction techniques can assist users with motor difficulties and mental limitations in accessing complex devices and systems. The advances in audio-based interaction are expected to result in broader utilization of audio-based interfaces. An interaction metaphor based on audio communication can provide access to interactive systems for blind or severely motor impaired users. Regarding the multimodal nature of emerging man-machine interface systems, the augmentation of the dialogue through the utilization of alternative media, providing redundant interaction methods, can provide accessibility to users with limitations in specific communication channels.
The introduction of knowledge and reasoning in user interface systems can provide solutions to several disabled user categories by automating the construction of the interface, providing on-line assistance to the user, or providing intelligent interfaces that can be adapted according to the particular user abilities and requirements. Current work on user-tailored and user-support environments, as well as research on the design of user interface systems for international use, which take into account the cultural and physical characteristics and abilities of the various categories of users, can substantially influence the evolution of the next generation of "intelligent" user interfaces.
Finally, recent work on co-operative and collaborative user interfaces can potentially contribute to the fulfilment of the requirements of disabled and elderly people, by allowing on-line assistance by third parties, tele-support and tele-consultancy, as well as group and tele-work. Thus, people with communication and interaction difficulties can communicate with applications, systems or other users by using adequately adapted user interfaces, located at their site.
BREWSTER, S. A., WRIGHT, P.C. and EDWARDS, A.D.N. (1993). An Evaluation of Earcons for Use in Auditory Human-Computer Interfaces, INTERCHI '93 Human Factors in Computing Systems, pp. 222-227.
DESMARAIS, M.C. and LIU, J. (1993). Exploring the Applications of User-Expertise Assessment for Intelligent Interfaces, INTERCHI '93 Human Factors in Computing Systems, pp. 308-313.
FRISHBERG, N., GORAZZA, S., DAY, L., WILCOX, S. and SCHULMEISTER, R. (1993). Sign Language Interfaces, INTERCHI '93 Human Factors in Computing Systems, pp. 194-197.
GAVER, W.W. (1993). Synthesizing Auditory Icons, INTERCHI '93 Human Factors in Computing Systems, pp. 228-235.
GAVER, W., SELLEN, C., HEATH, C. and LUFF, P. (1993). One is not Enough: Multiple Views in a Media Space, INTERCHI '93 Human Factors in Computing Systems, pp. 335-341.
GREENBERG, S. (Ed.), (1991). Computer-supported Cooperative Work and Groupware, Academic Press.
MARCUS, A. (1993). Human Communications Issues in Advanced UIs, Communications of the ACM, Vol. 36, No 4, April 1993, pp. 101-109.
MATIAS, E., MACKENZIE, I.S. and BUXTON, W. (1993). Half-QWERTY: A One handed keyboard Facilitating Skill Transfer From QWERTY, INTERCHI '93 Human Factors in Computing Systems, pp. 88-94.
MYERS, B.A., WOLF, R., POTOSNAK, K. and GRAHAM, C. (1993). Heuristics in Real User Interfaces, INTERCHI '93 Human Factors in Computing Systems, pp. 304-307.
NECHES, R. et al. (1993). The Integrated User-Support Environment (IN-USE) Group at USC/ISI, INTERCHI '93 Human Factors in Computing Systems, pp. 53-54.
NIELSEN, J. (Ed.), (1990). Designing User Interfaces for International Use, Elsevier.
NIGAY, L. and COUTAZ, J. (1993). A Design Space For Multimodal Systems: Concurrent Processing and Data Fusion, INTERCHI '93 Human Factors in Computing Systems, pp. 172-178.
RACE I, (1989). Project 1065 - ISSUE, Workpackage 2, Adaptive Multi-Media Dialogue in Retrieval Services: Survey on the state of the art.
RACE II, (1992). Project 2009 - IPSNI-II, Workpackage 1.3, Definition of the functionality of Multimedia Interfaces.
ROBERTSON, G.G., CARD, S.K. and MACKINLAY, J.D. (1993). Information Visualization Using 3D Interactive Animation, Communications of the ACM, Vol. 36, No 4, April 1993, pp. 57-71.
RUSSO, P. and BOOR, S. (1993). How Fluent is Your Interface? Designing for International Users, INTERCHI '93 Human Factors in Computing Systems, pp. 342-347.
STAPLES, L. (1993). Representation in Virtual Space: Visual Convention in the Graphical User Interface, INTERCHI '93 Human Factors in Computing Systems, pp. 348-354.
STEPHANIDIS, C., HOMATAS, G., KOUMPIS, A., SFYRAKIS, M., GALATIS, P. and VARELTZIS, G. (1993). Access to B-ISDN services by People with Special Needs: A methodological approach, RACE IS&N Conference, Paris.
STIFELMAN, L.J., ARONS, B., SCHMANDT, C. and HULTEEN, E.A. (1993). VoiceNotes: A Speech Interface for a Hand-Held Voice Notetaker, INTERCHI '93 Human Factors in Computing Systems, pp. 179-186.
SULLIVAN, J.W. and TYLER S.W. (Eds.), (1991). Intelligent User Interfaces, ACM Press, Addison Wesley Publishing Company.
TAKEUCHI, A. and NAGAO, K. (1993). Communicative Facial Displays as a New Conversational Modality, INTERCHI '93 Human Factors in Computing Systems, pp. 187-193.
VENOLIA, D. (1993). Facile 3D Direct Manipulation, INTERCHI '93 Human Factors in Computing Systems, pp. 31-36.
ZHAO, R. (1993). Incremental Recognition in Gesture-Based and Syntax-Directed Diagram Editors, INTERCHI '93 Human Factors in Computing Systems, pp. 95-100.