Jan Engelen and Rene Besson
New and important developments are taking place in data communication offering new possibilities to convert data into the most suitable output format for a particular application. This can also lead to enhanced interpersonal communication for disabled and elderly people.
The World Wide Web (WWW), Information Superhighway and Internet have become (as have telematic applications as a whole) a day to day reality for many people. The focus of this chapter will be on the benefits disabled persons can obtain from this evolution, highlighting potential improvements and modifications to existing services and the possibility of introducing new ones.
Role of Formats and Accessibility of Data Communications
A significant group of disabled and elderly persons who have difficulty accessing the printed word may encounter serious problems gaining access to the information society. This is clearly an issue of major concern.
For example, more and more information systems have become computer-based (telephone guides, railway timetables, home banking etc.). This presents reading impaired people with both an enormous challenge and an enormous opportunity. The challenge is that if print disabled customers are to continue the development of their socio-economic integration, new ways of accessing electronic information must be developed. The opportunity lies in the fact that if these ways are developed, they have the potential to provide greatly enhanced access compared to those available at present.
These "new ways of access" are closely related to the format in which the documents are available. The different types of text documents are briefly outlined in relation to the ever-growing multimedia world.
Digital Texts: Formats and Accessible Formats
Text information undoubtedly still comprises a very large share of information content produced on computers today. The production and editing of textual documents by word-processing programs is one of the most significant applications of computer technology.
The importance of text communication is underlined by the recent explosion of electronic mail and electronic bulletin board activities. Very recently, however, even these traditionally text-only networks have started carrying multimedia information. "World Wide Web" (WWW) applications on the Internet have exploded in the space of just a few months. This multimedia information, which may include moving pictures instead of text, poses a new problem for blind and visually impaired people. Whereas text can be converted into Braille or speech, this is more difficult with moving pictures. A European project is specifically looking into ways of providing "hooks" for an alternative graphical user interface, shown in Figure 4-7, to alleviate this problem (Gill, 1993).
A brief description of the relative merits of different text formats is given in Tables 4-5, 4-6 and 4-7, going from the simplest ones, such as the American Standard Code for Information Interchange (ASCII), to the fully structured ones.
ASCII is a frequently abused term. It is used to code text documents in a simple way, but many different variants exist.
Types - Comments
ASCII-Text Only: Files contain the text of a document, with no formatting or graphics. Paragraphs are ended with a code, but as lines can be very long, problems with e-mail transmission are likely. Even the codes that indicate the end of a line or a paragraph are NOT standardized: DOS and Windows use the carriage-return/line-feed pair (CR/LF), Macintosh computers use CR only and Unix uses LF only.
ASCII-Text Only with line breaks ("e-mail format"): As above, but the text is now broken into lines of approximately 70 characters in length, each ended with an end-of-line code. Preferably only 7-bit characters should be used if transmission through e-mail is intended.
(Table 4-5) Different "simple" ASCII types.
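The end-of-line differences listed in Table 4-5 can be handled mechanically. The following Python sketch is an illustration only and not a tool referred to in this chapter. It assumes the input is a byte string of 7-bit ASCII text, maps the three end-of-line conventions onto one, and re-wraps each paragraph to at most 70 characters. The function name to_email_format is hypothetical.

```python
import textwrap


def to_email_format(raw: bytes, width: int = 70) -> str:
    """Normalize end-of-line codes and wrap paragraphs for e-mail."""
    # Decoding as ASCII rejects any byte above 127 (non 7-bit characters)
    # by raising UnicodeDecodeError.
    text = raw.decode("ascii")
    # Treat CR/LF (DOS/Windows), CR (Macintosh) and LF (Unix) alike.
    # CR/LF must be replaced first so it is not counted as two breaks.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    wrapped = []
    for paragraph in text.split("\n"):
        # textwrap.fill returns "" for an empty paragraph, so blank
        # lines between paragraphs are preserved.
        wrapped.append(textwrap.fill(paragraph, width=width))
    return "\n".join(wrapped)
```

For instance, to_email_format(b"a\r\nb\rc\nd") yields four lines separated by LF only, whichever system produced the input.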
Enriched ASCII - Comments
MAIL: MAIL messages as found on any networked computer are somewhat structured. Essential parts of the message have special starting words such as: To:, From:, Subject: etc. All mail-reading programs can handle this format and allow searching e.g. on subject items.
SETEXT: Special sequences of dashes provide visual clues to the meaning of some parts of the text. SETEXT reading programs can handle this type of structuring. SETEXT files can also be read (although the structure is then lost) by all standard ASCII reading programs such as EMACS, List, Readit, GREP etc. (SETEXT, undated).
(Table 4-6) Overview of different available ways to structure documents.
Reading impaired persons would often have to read every individual character if a document is unstructured. Therefore, structure is essential for a text file as soon as its size is above about 50 kbytes.
Several ways for structuring documents are available. An overview is given in Table 4-6.
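As an illustration of how the MAIL structure in Table 4-6 supports searching, the sketch below uses Python's standard email module to read the header fields of a message and test the subject line for a keyword. The sample message, its addresses and the function name subject_matches are invented for this example.

```python
from email import message_from_string

# Invented sample message: header fields, a blank line, then the body.
SAMPLE = """\
From: reader@example.org
To: editor@example.org
Subject: Railway timetable update

The new schedule is included below as plain text.
"""


def subject_matches(raw_message: str, keyword: str) -> bool:
    """Return True if the keyword occurs in the Subject: field."""
    message = message_from_string(raw_message)
    # A message without a Subject: field is treated as an empty subject.
    subject = message.get("Subject", "")
    return keyword.lower() in subject.lower()
```

Because only the header field is inspected, a reader can select messages by subject without having to listen to or read the whole body.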
It is possible to go much further in structuring documents if they only have to be read by special programs. Electronic newspapers for blind persons are published in several countries, an example of which is shown in Figure 4-8.
It should also be pointed out that other users can potentially benefit from this service; for instance, expatriates can access their "home" newspapers on the same day they are issued. It can also be handy for anyone to scan a newspaper by typing in keywords of particular interest.
A brief overview of the formats is given in Table 4-7. The conclusion to be drawn from this table is that a clear priority should be given to the Standard Generalized Markup-Language (SGML).
In order to make a clear distinction between word-processing and structured formats some details of the former are given in Table 4-8.
ETNA: In the UK, the Guardian newspaper, as well as the reading program, is distributed by the ETNA company. It is clearly a one-off situation, although plans do exist to switch to SGML.
RATS: The only purely commercial company in the "electronic-newspaper-for-the-blind" field is the Swedish Textalk Cy. Its founder, Henryk Rubinstein, was the first to produce a digital newspaper system. Its RATS format is now used in Sweden, Finland, Belgium and the Netherlands. It can be read by the company's browsing program only. Prototype conversion software to go from RATS to SGML was demonstrated in 1993.
CBB: The CBB organization in the Netherlands distributes dozens of magazines and several newspapers via modem. They developed their own format and a high speed navigation/browsing program that is accessible by blind persons.
IBM Bookmanager: The DOS reading program is accessible by blind persons working with speech output or Braille reading lines connected to their computer. It was therefore accepted a few years ago by the USA organization Recording for the Blind (RFB) as their standard for electronic books. According to G. Kerscher, technical director for RFB's E-Text section, the next step will be towards SGML.
(Table 4-7) Overview of application or company related "standards".
Word-processing Formats - Comments
Wordperfect: 8-bit files. Layout codes are intermixed with text; in theory all versions (since 5.0) have an identical file format; a detailed description of the file format is available (to developers).
Word: 8-bit files. At least five entirely different file formats are in use (Word for Dos 5, Word for Dos 5.5, Word for Windows 2, Word for Windows 6 and Word for Macintosh). All file formats are extremely complicated.
Rich Text Format (RTF): This is an exchange format (7-bit characters) developed by Microsoft. It contains all formatting instructions in a readable form mixed with the text. It is highly efficient for exchanging documents as it is readable and writable by all Microsoft Word versions and some Wordperfect versions.
(La)TeX: This is a word-processing/typesetting language with a structure similar to RTF but more flexible. It is mainly used by scholars and researchers. It is extremely powerful for handling mathematics. Mathematical formulae written in LaTeX can be pronounced in synthetic speech (T.V. Raman).
(Table 4-8) Some examples of word-processing formats.
ISO Standards for Structured Texts
- Standard Generalized Markup-Language (SGML):
Documents are 7-bit files and contain the text and the logical structure (using tags) of a document (Bauwens, 1995; Maunder, 1994). Layout of the printed document is left to the end user. SGML is the preferred format for long term storage of documents, as it is computer and software independent. Logical structures are encoded in Document Type Definitions (DTDs), the most popular ones being ISO 12083, CALS, AAP's Smallbook DTD and HTML/HTML+ (used by electronic databases in the World Wide Web).
- Open Document Architecture (ODA):
ODA has been studied as a possible alternative to SGML. Software providers have not shown much interest in it up to now. This may change as the Novell-Wordperfect Corporation intends to build ODA tools in the near future. ODA is a machine-independent format, but with layout instructions, according to ISO standard 8613.
A recent development in information stored on CD-ROMs or in electronic format is the inclusion of audio in documents which previously consisted exclusively of text and graphics (e.g. encyclopaedias).
The lack of standards in the audio format also poses a problem. Only de facto standards exist for:
- how to refer to the audio information in a document;
- how to store audio information in a digital format for a document.
This situation might seriously hinder exchange of documents between emerging applications. This is already the case between the most advanced existing audio application areas i.e. telephony voice messaging, and multimedia personal computing.
The flow chart for human reading of accessible text formats is shown in Figure 4-9. The actual reading (with the appropriate output devices) is separated from the navigation through the document.
This also implies that the reader does not need to know the type of structured files he is reading. Using a particular format has indirect consequences as it determines the level of structure that is available for navigation.
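The separation of navigation from reading can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software shown in Figure 4-9: it assumes an HTML-like document whose sections are marked with h1 to h6 heading elements, and it builds only the outline that a navigation step would offer to the reader. The names OutlineBuilder and build_outline are hypothetical.

```python
from html.parser import HTMLParser


class OutlineBuilder(HTMLParser):
    """Collect (level, title) pairs from h1..h6 heading elements."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading being read, if any
        self._parts = []     # text fragments of that heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        # Text outside a heading is ignored: it belongs to the reading
        # step, not to navigation.
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            title = "".join(self._parts).strip()
            self.outline.append((self._level, title))
            self._level = None


def build_outline(document: str):
    """Return the list of (level, title) pairs found in the document."""
    parser = OutlineBuilder()
    parser.feed(document)
    parser.close()
    return parser.outline
```

The outline is all a navigation program needs in order to offer "next section" or "go to heading" commands; the richer the tagging of the format, the more such entry points become available.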
Accessibility of information depends on the intended use: is it for interactive reading (with Braille reading lines or speech output) or for producing printed documents (e.g. in Braille, Moonwriting or large print)? Both classes of applications are possible with SGML files.
As word-processing formats are much more widely used than SGML or ODA, an important success factor is the possibility to convert word-processing formats into SGML (or ODA).
Table 4-9 gives some practical information.
It should be noted that several special programs currently exist that directly convert word-processing formats into HTML+, the much simplified SGML-based language used on the World Wide Web.
Input Formats - Conversion Tools - Comments
Wordperfect - Intellitag add-on software to WP - Use of stylesheets is required
Microsoft Word - SGML Author (co-operates with WfW6) - Use of stylesheets is required
Pagemaker, Quark - Only indirectly - Export to WP or Word format is required
Special SGML editors - OK - e.g. Author Editor, Writeit, Grif etc.
(Table 4-9) Up-conversion, word-processing to general SGML format.
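Table 4-9 notes that stylesheets are required for up-conversion. The reason can be illustrated with a small Python sketch: once every paragraph carries a named style, conversion reduces to a lookup from style name to tag. This is not the method used by Intellitag or SGML Author, whose internals are not described here; the style names, the tag names and the function up_convert are all hypothetical.

```python
from xml.sax.saxutils import escape

# Hypothetical mapping from word-processor style names to tag names.
STYLE_TO_TAG = {
    "Title": "title",
    "Heading 1": "h1",
    "Body": "p",
}


def up_convert(paragraphs):
    """Convert (style, text) pairs into tagged lines.

    A paragraph whose style has no mapped tag is rejected, which is
    why consistent use of stylesheets matters.
    """
    lines = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style)
        if tag is None:
            raise ValueError(f"no tag mapped for style {style!r}")
        # escape() protects &, < and > so they are not read as markup.
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)
```

A document formatted only with ad hoc bold and font-size changes gives such a converter nothing to look up, so its logical structure cannot be recovered automatically.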
The Promotion of Accessible Texts
Several groups are currently promoting the use of structured texts. The most important ones are:
ICADD, the International Committee for Accessible Document Design
The main purpose of this group, established at the first World Congress on Technology for the Disabled (Washington, Dec. 1991) is the definition of a universal text format for the large amounts of textual information that will soon be available.
The impetus was given by the urgent need for harmonization in this field, partially as a consequence of the Americans with Disabilities Act, but the idea of a world-wide brainstorming group was heavily supported by the non-US delegations as well.
Current ICADD actions are focused on organizational and technical aspects. Research activities in the field of SGML include:
- the finalizing of the ICADD Document Type Definition;
- the ICADD mechanism to make general DTDs accessible (this was done for the new ISO 12083 standard and will be part of HTML 3);
- the study of HTML (the WWW's files format) and its accessibility, including its table and mathematical constructs.
Lobbying of the US Government on related issues is also being actively pursued in order to prevent Adobe's PDF/Acrobat format (ADOBE MAGAZINE, 1994) from becoming a US standard. At the beginning of 1995 Adobe agreed to discuss Acrobat's accessibility issues with the ad hoc ICADD subcommittee (Kerscher, 1994).
Off-line Braille Production
Almost no general word-processing program can be made flexible enough to drive existing Braille embossers directly. Special programs are therefore used, and many countries have their favourites. The more general ones are Megadots (David Holliday), Duxbury (Joe Sullivan), PCBraille (Guido Francois) and ITS (Keith Gladstone).
Braille can also be produced from SGML files (Engelen, 1992a, 1992b). Also, SGML documents in the ICADD 22 tag DTD are easily convertible into Braille. Basic research on this aspect was done by ICADD (Tom Wesley) and at UCLA (Jeff Suttor).
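The last step of such a conversion, turning extracted text into Braille cells, can be illustrated with a deliberately simplified Python sketch. It is not how Megadots, Duxbury, PCBraille or ITS work: it assumes uncontracted Braille, handles only the letters a to z and the space, ignores capitals, digits, punctuation and national Braille codes, and writes Unicode Braille pattern characters instead of driving an embosser. The function name to_braille is hypothetical.

```python
# Dot numbers (1-6) for the letters a-j of the Braille alphabet.
_BASE = {
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
    "f": (1, 2, 4), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4),
    "j": (2, 4, 5),
}


def _build_table():
    table = dict(_BASE)
    # k-t repeat a-j with dot 3 added.
    for base, letter in zip("abcdefghij", "klmnopqrst"):
        table[letter] = _BASE[base] + (3,)
    # u, v, x, y, z repeat a-e with dots 3 and 6 added.
    for base, letter in zip("abcde", "uvxyz"):
        table[letter] = _BASE[base] + (3, 6)
    # w does not follow the pattern.
    table["w"] = (2, 4, 5, 6)
    return table


DOTS = _build_table()


def to_braille(text: str) -> str:
    """Map letters and spaces to Unicode Braille pattern characters."""
    cells = []
    for ch in text.lower():  # simplification: capital signs are omitted
        if ch == " ":
            cells.append("\u2800")  # blank Braille cell
        elif ch in DOTS:
            # In the Unicode Braille block, dot n sets bit n-1 above U+2800.
            mask = sum(1 << (dot - 1) for dot in DOTS[ch])
            cells.append(chr(0x2800 + mask))
        else:
            raise ValueError(f"character {ch!r} is not handled by this sketch")
    return "".join(cells)
```

With structured input, the tags can additionally tell a real Braille program where headings, paragraphs and lists begin, which is the information that plain text lacks.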
Access for Print Disabled People
In Europe, a project mainly concerned with document access for print disabled people has developed a newspaper format as an SGML DTD (CAPSNEWS DTD) which describes a totally general newspaper structure. It also has some special provisions for visually impaired persons, which enable them to navigate through digital newspapers by means of large print on screen, voice synthesis, and Braille display readers (TIDE-CAPS, 1994).
The benefits of structured document formats, both for print disabled people and for publishers, will be stressed throughout a new European Horizontal Action TIDE Programme, HARMONY.
Sources of Accessible Texts
As explained above, the large expansion of computer word-processing, which is gradually replacing typewriting, guarantees a steady and ever-increasing flow of electronic documents. Many of these can be made accessible, but the conversion can be rather cumbersome for large documents that need embedded structure information.
A common misconception is that CD-ROM files are either ASCII or SGML files. Whilst this can be the case, most CD-ROM databases have to be read with special programs and the actual data files are in proprietary formats. An example of this is the Nuffield Interactive Book System (NIBS) (contact: Bob Allen).
Hybrid CD-ROMs contain the electronic version of the text (preferably in SGML format) and the spoken version of the same text. It is expected that this combination will very soon replace talking books, which were traditionally recorded on cassette tapes.
Network files, Internet and the World Wide Web
Huge amounts of public domain texts are becoming available on diskette or through all kinds of electronic mail, especially for those who are linked to the Internet.
Internet texts used to be only available in simple ASCII format. Although the limitations of ASCII are well known (see above) it is a simple format that can be read by all computers.
The World Wide Web
This computer network protocol has already been mentioned several times. It is unique in the sense that the WWW documents are distributed on the global network and the user can browse through these documents without bothering about the physical place where they are kept or about their format.
The WWW documents are kept in SGML, more specifically in a structure defined by the HTML+ DTD. The user (client) has to have a special program for navigating through the Web. These are the browser programs with names such as Mosaic, Lynx, Netscape, Winweb etc.
The most popular ones (available on various platforms) are definitely Mosaic and Netscape. Due to their graphical nature, the current versions are not accessible by blind persons. Recently a working group was set up to guarantee that further versions can be used by reading impaired persons.
Lynx, on the other hand, is basically line oriented and therefore accessible with speech output or a Braille reading line.
The HTML format can also handle multimedia documents, but it does so by relying exclusively on de facto standards.
Two newspaper examples can be given here: the New York Times is on the WWW, but in the inaccessible PDF format; the Telegraph is also available, but uses accessible HTML files.
General SGML based systems
General SGML systems can provide much more information than the HTML based files. A structure description for newspapers has been developed (TIDE-CAPS, 1994) which is currently in use by several German newspapers that are produced electronically with the support and the technical expertise of the Stiftung Blindenanstalt in Frankfurt.
Starting in the autumn of 1995, the Belgian newspaper De Standaard and InfoVisie Magazine, a technical publication on assistive devices for blind and deaf-blind persons (Contact: Geert Bormans) will be distributed in this format.
Accessibility of information is a complex issue. In any case the use of standardized text formats, based on international agreements such as ISO norms, should be promoted. Many more texts (e.g. newspapers) will become available in these formats.
Future technological possibilities will improve or complement the present communication means of disabled people.
Automatic speech processing in particular would bring new solutions. But if these systems are to have sufficiently high performance and to react in real time, they become too expensive to integrate into the user's terminal.
A solution is to forward the communications to host computers, which can easily perform many complex operations in real time.
Host-based communication systems
Such systems are interesting because the user's terminal can be a standard one, with only slight modifications to suit the person's disability. The additional cost corresponding to the service provided is shared by all the users of the telecommunications services.
(Figure 4-11) How to communicate between different persons; the concept of media transformation by service providers and telecom operators.
Some examples illustrate the above suggestions:
Implementation of a telephone communication between a deaf, but normally speaking, person and his correspondent who can hear and speak.
The deaf person speaks normally to his correspondent. In return, the correspondent's speech is recognized by a central computer and converted into a written text which is simply coded and displayed on the deaf person's terminal (Minitel, textphone or personal computer, see chapter 4.1).
Implementation of a videotelephone communication between a post-lingually deaf, but normally speaking, person and his correspondent (see chapter 4.2).
The deaf person speaks normally in front of the videotelephone, as does his correspondent. The deaf person lip-reads the video image but gets additional information from marks displayed near the picture of the face. This system aims at eliminating lip-reading ambiguities. The marks correspond to phonetic features characteristic of speech, which are detected by automatic speech recognition. The system works according to a lip-reading aid method called Cued Speech.
Speech Impaired Users
Speech synthesis over the network
Implementation of a telephone communication between a speech impaired person who can hear and his correspondent (hearing-speaking) who has a traditional telephone.
This system has been tested at CNET for many months. The disabled person hears his correspondent normally by using the telephone. In the other direction, written text is typed by the speech impaired user (e.g. on a Minitel) and is then converted into synthetic speech by the host computer and sent to the correspondent's destination.
Applications for blind or visually impaired persons
A host computer automatically collects original information (newspapers for example) and presents it to the disabled person in synthetic voice (Audiotext). The same computer could also calculate and send codes that can activate a remote Braille printer.
Many other services based on the host computer principle can be suggested, such as: information search through image processing and specific services to elderly people. In that case the host computer detects the calling person's code and reacts according to pre-recorded instructions.
Such host-based systems are interesting mainly because the disabled person controls the communication course himself and he (she) can gain a great deal of experience (for example, in the case of the host computer for speech impaired people, it is the user himself who types the text and his correspondent normally uses the telephone).
A serious problem is still to be solved: how to fix the price. In fact, the telephone companies impose a tax as soon as the dialled number has been reached (in this case the host computer), and payment for the host's outgoing line represents a further technical billing problem.
There are solutions, for example using the "videotex access points", or simply establishing a moderate time charge as is the case in videotex systems.
Another problem is the time needed for conversion from oral communication to written communication. For example, the conversion of speech into text requires the recognition of a connection signal and a specific code.
This start-up time can last 2 or 3 seconds, and this causes some trouble for the non-disabled person, who assumes that a wrong connection has been made to his telephone.
Generally speaking these host-based services also require great ergonomic efforts because they use a new communication technique, and a period of adaptation is needed.
New telematic services can alleviate the funding problem for sophisticated computer processing, if they can be handled centrally, e.g. by telecommunications operators who are willing to invest in service provision.
Electronic Mail Addresses
Bob Allen (CRC), e-mail: Bob.Allen@eurokom.ie
Geert Bormans (Infovisi), e-mail: firstname.lastname@example.org
Jan Engelen, e-mail: Jan.Engelen@kuleuven.ac.be
Ian Feldman, e-mail: email@example.com
Guido Francois (Interpoint), e-mail: Guido.Francois@esat.kuleuven.ac.be
Keith Gladstone (Integral Transcription System), e-mail: firstname.lastname@example.org
David Holliday (Raised Dot Computing), e-mail: email@example.com
George Kerscher (ICADD technical WG), e-mail: firstname.lastname@example.org
BTV Raman, e-mail: email@example.com
Joe Sullivan (Duxbury), e-mail: firstname.lastname@example.org
Jeff Suttor, e-mail: jSuttor@library.ucla.edu
Gregg Vanderheiden (TRACE), e-mail: email@example.com
Tom Wesley (President ICADD), e-mail: firstname.lastname@example.org
ADOBE MAGAZINE, (1994). Acrobat - Bridge between worlds, Adobe Magazine, No 1, pp. 24-25.
BAUWENS, B., EVENPOEL, F. and ENGELEN, J.J. (1995). Standardization As A Prerequisite For Accessibility Of Electronic Text Information For Persons Who Cannot Use Printed Material, IEEE Transactions on Rehabilitation Engineering, Vol. 3, 1 (March 1995).
CAPSNEWS DTD, anonymous ftp from ftp.esat.kuleuven.ac.be (subdirectory: pub/CAPS).
ENGELEN, J.J., BALDEWINJS, J. (1992). Digital Newspapers for the Print Disabled, State of the Art. p. 198, ISBN 90-6831-27X-X. InfoVisie, Leuven, (will be updated in spring 1996).
ENGELEN, J.J. and BALDEWINJS, J. (1992). Digital information distribution for the reading impaired: from daily newspapers to whole libraries, Schriftenreihe der Oester-reichische Computer Gesellschaft, Band 48, pp. 155-162. Vienna, 7-9 July 1992.
GILL, J.M. (1993). Access to Graphical User Interfaces by Blind People. TIDE-GUIB Consortium, ISBN 1 85878 004 7.
KERSCHER, G.W. (1994). "Best Index/Hypertext READER?", e-mail message to Blind-L, 1994. The author can be contacted via e-mail.
MAUNDER, C. (1994). Documentation on Tap, IEEE Spectrum, Sept. 1994, p. 52.
SETEXT. "setext-concepts" document. It can be obtained via e-mail by contacting Ian Feldman.
TIDE-CAPS136, (1994). Final report and the SGML State of the Art report, both available from Jan Engelen or by anonymous ftp from ftp.esat.kuleuven.ac.be (subdirectory pub/CAPS).