DISABLED ASSISTIVE TECHNOLOGY



Recognition for Disabled Assistive Technology Devices

  • Summary:
The U.S. Government is placing increasing importance on the development of assistive technology for disabled persons, both as a matter of civil rights and out of a desire to help these persons become more independent and productive members of society.  The Americans with Disabilities Act (ADA) requires that public and private entities make all reasonable accommodations for disabled persons.  This has led to design changes and technological advances that have been very useful to persons with disabilities.

One assistive technology development area that has lagged is Voice Recognition.  Useful and effective Voice Recognition (VR) technology would allow many disabled persons to lead more independent and productive lives.  But the voices of some disabled persons are unrecognizable by currently available VR systems.  These systems rely on the recognition of separate parts of speech -- phonemes -- to determine the words spoken.  Non-standard speech cannot be processed effectively by these systems.

The General Accounting Office, in a 1996 report, "People with Disabilities: Federal Programs Could Work Together More Efficiently to Promote Employment," noted that Congress has worked to find means of reducing the $60 billion spent each year on disability assistance payments.  One of the most effective ways of making disabled persons more self-sufficient is to develop more advanced assistive technologies, and the GAO specifically cited voice recognition for computers as a promising technology.

But many persons receiving disability payments have impaired speech.  The group of persons with diseases or injuries involving impaired speech is large.  Approximate numbers for some are:

  • 700,000 with cerebral palsy

  • 350,000 with multiple sclerosis

  • 500,000 with Parkinson's Disease

  • 2 million new cases annually of Acquired Brain Injury

  • 4 million with Alzheimer's Disease

  • 4 million with epilepsy and

  • 4 million stroke survivors

Other important groups for which numbers were not immediately available are:
  • Throat and mouth cancer and

  • Other muscular diseases

For some groups, only some persons suffer from impaired speech.  For others, such as cerebral palsy and Parkinson's Disease, virtually all members suffer from impaired speech.

Integrated Wave Technologies, Inc. (IWT), a Silicon Valley company, has developed an advanced VR system based on a fundamentally different approach that recognizes all non-standard speech.  This system is based upon frequency analysis research performed in the former Soviet Union for the purpose of VR, speaker identification and the identification of non-speech sounds such as submarine noises.  The IWT VR can recognize virtually any sound a person can repeat, making it ideal for use with persons with severely distorted speech.

For disabled persons with clearer speech, the system contains a significant advantage.  Because the IWT VR system recognizes speech with much greater precision than existing systems, there are virtually no errors.  Persons with limited mobility have a difficult time getting a computer to recover after a recognition error.  The IWT VR system will allow these people to use computers and other devices with much greater ease, less frustration and less fatigue.

Another advantage of the IWT system is that it was designed to work on very simple microprocessors, even down to the equivalent of a 286.  This means that it can be "hard wired" into devices such as telephones, televisions, wheelchairs, beds and other products needing voice control.

The IWT VR system was chosen in 1996 by the Justice Department to form the basis of a belt-mounted, voice-prompted translator.  The system has passed its initial developmental test and will be field tested by the Oakland Police Department in mid-1997.

  • Background:
Integrated Wave Technologies, Inc. (IWT), is a high technology company that focuses exclusively on sound analysis-related R&D, marketing and production.  The company is based in Fremont, California, and it has research facilities in Oregon, and Moscow, Russia.  It has acquired highly advanced sound analysis technology from sources in the former Soviet Union, and has adapted and advanced this technology in its own research facilities.  These facilities are capable of software development, prototype development, and custom-designed chip fabrication.

The company's Chairman and Chief Executive Officer is John H. Hall.  Hall has over 30 years of advanced electronics research experience and has made significant contributions to advancements in the state of the art.  Among his innovations is the first low-voltage CMOS chip, a design that Seiko purchased and mass produced beginning in 1970.  Hall has continued to make advancements in CMOS and other electronics technology. (See attached resume).  He designed and currently manufactures the most effective hearing aid semiconductor chip in use.

  1. Description of the Disabled Assistive Technology Challenge:

    The Americans with Disabilities Act and the commissions it has helped to create have increased interest in applying voice recognition command and control technology to provide homes and workplaces with tools that allow disabled people to become more self-sufficient and productive.

    The passage of the ADA has led to the establishment of task forces and other groups having the charter of finding new technologies to assist disabled persons.  One of the most significant, the National Task Force on Disability, heard testimony that Microsoft Corp. was slow to adapt its products for use by disabled persons.  World Institute on Disability Vice President and Director of Technology Policy Deborah Kaplan told a 1995 task force meeting that Microsoft had issued a corporate policy statement promising to ensure accessibility for people with a wide range of disabilities after the firm had been criticized by blind computer users and other persons.  Kaplan urged task force members to keep up the pressure on high-tech manufacturers to make their products accessible.

    In other areas, current technology has not kept pace with public policy requirements.  A 1996 settlement between the State of California and the Justice Department requires that the state make its 911 emergency services as accessible to people with speech impairments as they are to others.  Yet existing VR technology will not recognize many persons with impaired speech, and these persons often also suffer from impaired mobility.

    IWT has identified an application of this sound analysis technology that will provide a solution to the challenge of producing an effective and reliable voice recognition (VR) device for people unable to speak clearly identifiable words.

  2. What are the limitations of existing voice recognition devices?

    Existing VR systems have a great deal of difficulty recognizing non-standard speech.  This means that they cannot be used effectively by persons with impaired voices.  Existing systems also have an error-substitution rate that is high compared with the IWT VR system.  This leads to computer miscues that are often difficult for a disabled person to correct.  These VR errors lead to low productivity, high fatigue and frustration.

    Existing systems also require the use of a Pentium-level computer, which greatly increases cost and prevents the miniaturization of VR devices using current software.

  3. What technology would address this requirement?

    Voice recognition software capable of accurately recognizing voice commands, including non-standard speech, in noisy environments using sub-Pentium processors is needed to meet the requirements of disabled persons.

    IWT's work with the Soviet-conceived sound analysis technology can serve as the basis for such an assistive system.  Scientists and engineers at Soviet laboratories, in seeking to develop speech identification and other sound analysis programs, took an approach that is fundamentally different from that used in voice recognition systems developed in the United States and other Western countries.

The main features of this Soviet-based system are:

  • That it uses advanced algorithms that use a small fraction of the software required by U.S.-style systems;

  • That these algorithms are able to filter out stray noises and thus have error/substitution rates substantially lower than U.S.-style systems; and

  • That its microphone doesn't have to be turned off to avoid erroneously recognizing commands, making it truly hands free.

Differentiation/Specifications/Criteria: A successful VR assist technology would have the following characteristics. It would:

  • Operate in a completely hands-free manner

  • Recognize non-standard speech

  • Recognize only intended commands

  • Adapt easily to laptops and other computers in inventories

  • Be affordable

Hands-free operation / Recognize only intended commands.  The IWT voice recognition software accepts only the commands it has been "trained" to recognize.  All other noises -- even human voices -- are rejected as extraneous noise.  This allows the system microphone to be active at all times without the danger that the device will erroneously accept a false command.
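
The accept-or-reject behavior described above can be sketched as a nearest-template search with a distance threshold.  This is only an illustration under stated assumptions: IWT has not published its method, so the fixed-length feature vectors, the Euclidean distance, the command names and the `reject_threshold` value are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize(features, templates, reject_threshold):
    """Return the closest trained command, or None if nothing is close enough.

    templates: dict mapping a command name to its trained feature vector.
    reject_threshold: hypothetical tuning value; a larger value accepts more sounds.
    """
    best_name, best_dist = None, float("inf")
    for name, template in templates.items():
        dist = euclidean(features, template)
        if dist < best_dist:
            best_name, best_dist = name, dist
    # A sound that is far from every trained template is treated as noise.
    return best_name if best_dist <= reject_threshold else None

# Hypothetical two-command vocabulary with made-up feature values.
templates = {"lights on": [1.0, 0.0, 0.5], "call home": [0.0, 1.0, 0.2]}
accepted = recognize([0.9, 0.1, 0.5], templates, reject_threshold=0.5)
rejected = recognize([5.0, 5.0, 5.0], templates, reject_threshold=0.5)
print(accepted, rejected)
```

In this sketch the first input lies near the "lights on" template and is accepted, while the second is far from both templates and is rejected, which is what lets the microphone stay open at all times.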

Recognize non-standard speech.  The IWT VR software will match virtually any sound with a recorded sample.  This means that speech that does not resemble the standard phonemes of English speech that form the basis of other VR systems is still recognized.

Easy adaptability for use in laptops and other computers in inventories.  The IWT voice recognition software works well on microprocessors from 286s upward.  The voice recognition algorithm can also be "hardwired" into a series of integrated circuits for specialized applications such as activation and operation of systems such as radios and lights/sirens.

Affordability.  The ability of the IWT software to work on relatively simple hardware systems and virtually all microprocessors using DOS means that it does not drive computer costs beyond current levels.

  • Description of Use of IWT VR Assist Approach:
Recent tests of IWT's voice recognition, which is based on non-traditional sound analysis techniques, have shown that the shortcomings of the current generation of VR software being marketed for assisting disabled persons can be overcome.  Currently marketed software relies on recognition of standard parts of speech -- phonemes -- and artificial intelligence to determine the words spoken.  This results in errors when non-standard speech and background noise are heard by the system.  This is a fundamental design problem and cannot be corrected by hardware or software enhancements.

IWT's approach is based upon technology developed by top scientists of the former Soviet Union to analyze sound of any kind, not merely voice.  The demanding requirements for this software included identifying specific persons by words spoken and specific submarines by the noise of their propellers.  This means that the IWT software can recognize with great precision speech even if it differs greatly from standard speech.

Because the Soviets designed this technology using computers that are primitive by Western standards, it will run on highly compact devices.  While currently marketed VR software is recommended for use on Pentium computers, the IWT software will work well on processors as simple as the 286.

IWT has achieved impressive results using the sound-based VR technology.  Because it is based on sound rather than the spoken word, it can recognize the sounds made by persons with cerebral palsy.  With a false-command rejection rate better than 99.75% and very high resistance to background noise, this technology can virtually eliminate false commands.

  • Conclusion:
The Integrated Wave Technologies, Inc., Voice Recognition system will allow public and private entities in the U.S. to realize more fully the public policy requirements articulated by the Americans with Disabilities Act.  It provides the only means of assisting disabled persons effectively.  The system will allow persons with non-standard speech to operate computers and virtually all electronic devices.  Disabled persons with clear speech will have significantly more effective control of computers and other devices.  And the IWT system will allow the miniaturization and integration of VR into devices that currently have no such capability.

Technical Discussion of IWT Voice Recognition Algorithms:

  • Introduction:
IWT voice recognition technology performs in a robust manner using novel signal processing methods.  The accuracy of the system exceeds 95% in adverse conditions using different communication channels and in the presence of background noise.  The core technology is very efficient and inexpensive to implement: only a standard 8-bit analog-digital converter and a 286 or faster central processing unit are required.

  • Background:
The core technology was developed in the former Soviet Union, in an atmosphere where expensive and complicated resources were limited.  Russian scientists were forced to use inferior (by Western standards) computing machinery.  They had to rely on elegant, yet parsimonious, algorithms to achieve results comparable to those being accomplished in the West with more powerful computers.

In the 1960s, Vintsyuk first proposed the use of dynamic programming methods for time-aligning a pair of speech utterances.  Although the essence of the concepts of dynamic time warping, as well as rudimentary versions of the algorithms for connected-word recognition, were embodied in Vintsyuk's work, it was largely unknown in the West and did not come to light until the early 1980s -- long after more formal methods were proposed and implemented by others.

A significant milestone in voice recognition work was achieved in the 1970s by Velichko and Zagoruyko.  They created perhaps the first viable and useful voice recognition system.  These Russian studies helped advance the use of pattern-recognition ideas in speech recognition.  It should be noted that these studies predated those by Sakoe and Chiba in Japan and Itakura in the U.S.

The work in the Soviet Union continued with an emphasis on robust voice recognition and voice identification for use in military and covert operations.  A wealth of research with commercial potential became available after the fall of the Soviet system.  IWT secured the commercial rights to the most significant and applicable research.  The technical details have not been published so as to protect these rights.

  • IWT Approach to Voice Recognition
Broadly speaking, there are three approaches to speech recognition:

  • The acoustic-phonetic approach

  • The artificial intelligence approach

  • The pattern recognition approach, which is used by IWT

The acoustic-phonetic approach is straightforward.  The machine attempts to decode the speech signal in a sequential manner based on the observed acoustic features of the signal and the known relations between acoustic features and phonetic symbols.  It is a viable approach and has been studied in great depth for more than 40 years.

However, for a variety of reasons, the acoustic-phonetic approach has not achieved the same success in practical systems as other approaches.  The central problem is the extreme difficulty in getting reliable definitions of phonemes, i.e., segmenting the speech into discrete regions where the acoustic properties of the signal are representative of one (or possibly several) phonetic units (or classes) and then attaching one or more phonetic labels to each segmented region according to acoustic properties.

A second problem is that, once the labels have been defined, a valid word must be determined from the sequence of phonetic labels (usually in the form of a phoneme lattice) that can have many permutations for a given word or phrase.

The artificial intelligence (AI) approach attempts to combine the above phonetic approach with the power of an expert system that integrates phonemic, lexical, syntactic, semantic and pragmatic knowledge.  Although some of the limitations of the acoustic-phonetic approach can be overcome using AI, the complexity of the task makes it unsuitable for small, portable applications, or for applications where costs must be kept low.

The pattern-recognition approach is the basis for the IWT speech recognizer.  It has three qualities that lead to superior performance in applications such as the one that is the subject of this grant proposal:

  1. Simplicity of use.  The method is easy to understand, rich in mathematical and communication theory, and is widely used and understood.

  2. It is robust and invariant to different speech vocabularies, users, languages, word vocabularies, talker populations, background environments, and transmission conditions.

  3. Proven high performance.  The pattern-recognition approach to speech recognition consistently provides high performance on any task that is within its technological parameters and provides a clear path for extending the technology in a wide range of directions.

The pattern-recognition approach is better suited for the conditions to which a hand-held law enforcement device will be subjected for the following reasons:

  1. The signal processing front end provides a set of unique filter bank parameters that are consistent over a wide range of speakers and communication channels.

  2. The Filter Bank parameters are transformed into a set of Principal Features (PF) that are statistically determined to remove redundant data across the vocabulary.

  3. The PF is transformed into frame pairs that model the statistical correlation between nearby speech frames.

  4. The system employs a modified dynamic-time-warping (DTW) process in which all templates are scanned continuously.  The system then relaxes end-point constraints of the input utterance and updates allowable paths of the utterance.

  5. The algorithm works for speaker-dependent and speaker-independent recognition.

  6. The system works in a fast and efficient manner.

  • Algorithm Overview
Acoustic waves are converted with an 8-bit analog-digital converter (ADC) at a sample rate of 12.8 K/sec.  The PCM data is placed into a circular buffer that is continuously updated.  The input data is converted into a stream of parameters in the preprocessor.  This secondary stream of data is converted into 8-dimension (8-D) feature vectors every 20 ms.
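
The figures above imply 12,800 × 0.020 = 256 samples per 20 ms frame.  The following minimal sketch shows that arithmetic together with a continuously updated circular buffer.  The class and function names are hypothetical, and non-overlapping frames are an assumption, since the text does not say whether frames overlap.

```python
from collections import deque

SAMPLE_RATE_HZ = 12_800   # from the text: 12.8 K samples per second
FRAME_MS = 20             # from the text: one feature vector every 20 ms
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000   # 256 samples per frame

class RingBuffer:
    """Fixed-capacity buffer; the oldest samples are dropped when it is full."""

    def __init__(self, capacity):
        self._buf = deque(maxlen=capacity)

    def write(self, samples):
        """Append new PCM samples, discarding the oldest ones if needed."""
        self._buf.extend(samples)

    def latest(self, n):
        """Return the n most recent samples (fewer if not yet available)."""
        return list(self._buf)[-n:] if n > 0 else []

def split_frames(samples, frame_len=FRAME_LEN):
    """Split a sample list into consecutive, non-overlapping frames (assumed)."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, frame_len)]

# One second of placeholder 8-bit samples.
one_second = [i % 256 for i in range(SAMPLE_RATE_HZ)]
ring = RingBuffer(capacity=SAMPLE_RATE_HZ)
ring.write(one_second)
frames_per_second = len(split_frames(ring.latest(SAMPLE_RATE_HZ)))
print(FRAME_LEN, frames_per_second)
```

Under these assumptions one second of audio yields 50 frames, each of which the preprocessor would turn into one feature vector.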

A word to be recognized is compared against a "template" that is initially recorded during the "training" process.  Templates are stored in external memory.  A resident set of templates in memory defines the vocabulary.

Consider the input "utterance" as a set of feature parameters that stream in continuously.  To consider this utterance as a candidate for recognition, a front-end processor is needed to "grab" the utterance.

Once the utterance is captured, it is compared against the templates in memory using a comparison technique known as a dynamic time warping (DTW) algorithm.  The DTW provides the best time alignment of two utterances (unknown and template).
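
For readers unfamiliar with DTW, the textbook form of the algorithm can be sketched as follows.  This is the common formulation, not IWT's modified version, which has not been published; the Euclidean frame distance and the unweighted step pattern are assumptions.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(utterance, template):
    """Accumulated cost of the best time alignment of two frame sequences."""
    n, m = len(utterance), len(template)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i utterance frames
    # with the first j template frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(utterance[i - 1], template[j - 1])
            # Repeat a template frame, repeat an utterance frame, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Toy 1-D features: the same word spoken slowly still aligns at zero cost.
template = [[0.0], [1.0], [2.0]]
slow = [[0.0], [0.0], [1.0], [1.0], [2.0]]
other = [[5.0], [5.0], [5.0]]
print(dtw_distance(slow, template), dtw_distance(other, template))
```

The slow utterance scores 0.0 against the template because warping absorbs the difference in speaking rate, while the unrelated utterance scores 12.0.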

However, unlike the common DTW algorithm, the comparison here is performed continuously.  This means that the input is evaluated every time a feature vector arrives from the preprocessor, i.e., every 20 ms.
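
One way to picture a continuously updated comparison is a streaming variant of DTW in which a match may start at any frame.  The sketch below is a generic "open start" formulation offered as an assumption about the idea, not IWT's unpublished algorithm; the class name and toy data are hypothetical.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class StreamingMatcher:
    """Scores one template against an endless frame stream, one frame at a time."""

    def __init__(self, template):
        self.template = template
        # Accumulated cost per template position after the previous frame.
        self.prev = [float("inf")] * len(template)

    def push(self, frame):
        """Consume one frame and return the cost of the best alignment of the
        whole template that ends exactly at this frame."""
        d = [frame_distance(frame, t) for t in self.template]
        cur = [0.0] * len(d)
        # Relaxed start point: a match may begin at any frame at no extra cost.
        cur[0] = d[0]
        for j in range(1, len(d)):
            cur[j] = d[j] + min(self.prev[j], self.prev[j - 1], cur[j - 1])
        self.prev = cur
        return cur[-1]

# Toy 1-D stream: noise, then the template's pattern, then noise again.
template = [[0.0], [1.0], [2.0]]
stream = [[9.0], [9.0], [0.0], [1.0], [2.0], [9.0]]
matcher = StreamingMatcher(template)
costs = [matcher.push(frame) for frame in stream]
print(costs)
```

The cost drops to zero at the frame where the embedded word ends, so no separate end-point detector is needed to locate it in this toy case.  A practical system would also normalize the cost by path length before comparing templates of different lengths.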

Accurate end-point detection is crucial for accurate voice recognition.  Tests have shown that small variations in end-point detection, such as +/- 40ms, can reduce accuracy by 3%.  The method used in this algorithm reduces these end-point errors, quickly sorts out unlikely templates, and shows promise for continuous speech recognition.

  • Inside the Pre-Processor
The pre-processor part of the speech recognition algorithm converts the input signal waveform into a stream of feature vectors.  There are two stages to this: a primary transformation from the time domain to the spectral domain, and a statistically based method to obtain a more compressed and reliable feature vector.  The first is realized by means of a 17-band filter bank that is quasi-synchronized with F0, the fundamental (glottal) frequency.  The second is a frame-pair conversion using a Karhunen-Loeve transformation (KLT, or principal feature method).
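
The second stage can be illustrated with a small sketch that joins neighboring frames into pairs and projects them onto their principal axes.  Everything here is an assumption made for illustration: the 1-D toy frames stand in for filter-bank outputs, power iteration with deflation stands in for whatever eigenvector routine was actually used, and the number of axes kept is arbitrary.

```python
import math

def frame_pairs(frames):
    """Join each frame with its successor so local correlation is visible."""
    return [frames[i] + frames[i + 1] for i in range(len(frames) - 1)]

def mean_and_covariance(vectors):
    """Sample mean and covariance matrix of a list of equal-length vectors."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[k] for v in vectors) / n for k in range(dim)]
    centered = [[v[k] - mean[k] for k in range(dim)] for v in vectors]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(dim)]
           for a in range(dim)]
    return mean, cov

def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0.0 else v

def principal_axes(cov, count, iterations=200):
    """Leading eigenvectors of a symmetric matrix (power iteration + deflation)."""
    dim = len(cov)
    work = [row[:] for row in cov]
    axes = []
    for _ in range(count):
        v = _normalize([1.0 / (k + 1) for k in range(dim)])
        for _ in range(iterations):
            w = [sum(work[a][b] * v[b] for b in range(dim)) for a in range(dim)]
            if not any(w):
                break
            v = _normalize(w)
        mv = [sum(work[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        eigenvalue = sum(x * y for x, y in zip(v, mv))
        axes.append(v)
        # Deflate: subtract the axis just found so the next pass finds another.
        for a in range(dim):
            for b in range(dim):
                work[a][b] -= eigenvalue * v[a] * v[b]
    return axes

def project(vector, mean, axes):
    """Coordinates of a vector along the retained principal axes."""
    centered = [x - m for x, m in zip(vector, mean)]
    return [sum(c * a for c, a in zip(centered, axis)) for axis in axes]

# Toy 1-D frames; real input would be one filter-bank vector per frame.
frames = [[0.0], [1.0], [2.0], [3.0], [4.0]]
pairs = frame_pairs(frames)
mean, cov = mean_and_covariance(pairs)
axes = principal_axes(cov, count=1)
compressed = [project(p, mean, axes) for p in pairs]
print(compressed)
```

Here each 2-D frame pair is reduced to a single number without losing information, because the two halves of every pair are perfectly correlated in the toy data.  Real speech is less redundant, so more axes would be kept.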

  • Conclusion
The speech recognition algorithm used by IWT is very accurate and fast.  It encompasses many of the "proven" techniques used in commercial speech recognizers, along with many novel techniques that have been added to improve system performance.  It uses low cost hardware (8-bit analog-digital converter) and low computational overhead, typically well under 5% total on a 486-33 PC.

It should be noted that other methods, such as using LPC for the preprocessor, have been investigated thoroughly, but shown to have lower performance due to added complexity.  In addition, the use of "hidden Markov models" (HMM) has also been investigated.  HMMs are widely used for large vocabulary systems and for some speaker-independent systems.  However, HMMs do not perform as reliably as template-based approaches for robust command-and-control voice recognition.

T.K. Vintsyuk
"Speech Discrimination by Dynamic Programming"
Kibernetika, 4(2): 81-88, Jan./Feb. 1968.

V.M. Velichko and N.G. Zagoruyko
"Automatic Recognition of 200 Words"
International Journal of Man-Machine Studies, 2:223, June 1970.

H. Sakoe and S. Chiba
"Dynamic Programming Algorithm Optimization for Spoken Word Recognition"
IEEE Trans. Acoustics, Speech, Signal Proc., ASSP-26(1): 43-49, February 1978.

F. Itakura
"Minimum Prediction Residual Applied to Speech Recognition"
IEEE Trans. Acoustics, Speech, Signal Proc., ASSP-23(1): 67-72, February 1975.

L. Rabiner and B. Juang
"Fundamentals of Speech Recognition"
Prentice Hall Signal Processing Series, 1993.

J.G. Wilpon, L.R. Rabiner, and T.B. Martin
"An improved word-detection algorithm for telephone-quality speech incorporating both syntactic and semantic constraints"
AT&T Tech. J., 63(3): 479-498, March 1984.

E.L. Bocchieri and G.R. Doddington
"Frame-Specific statistical features for speaker independent speech recognition"
IEEE Trans. on Acoustics, Speech & Signal Processing, 34(4), August 1986.

The following legislation was introduced by Sen. Christopher Bond, R-MO, after he became familiar with IWT's work with persons with cerebral palsy.  IWT is working with the United Cerebral Palsy Association of Greater St. Louis, as well as the national organization, to implement a larger-scale testing program.

[Page: S6407]

By Mr. BOND:

S. 2173. A bill to amend the Rehabilitation Act of 1973 to provide for research and development of assistive technology and universally designed technology, and for other purposes; to the Committee on Labor and Human Resources.

ASSISTIVE AND UNIVERSALLY DESIGNED TECHNOLOGY IMPROVEMENT ACT FOR INDIVIDUALS WITH DISABILITIES.

Mr. BOND.

Mr. President, today I am introducing a bill which will improve assistive and universally designed technology research and development and increase access to this technology for all Americans with disabilities.

Assistive and universally designed technology provides a disabled individual the means to function better in the workplace or the home.  Assistive and universally designed technology is technology that aids the millions of Americans with physical or mental disabilities.  For example, assistive technology can mean a computer that can be used by an individual with Cerebral Palsy, a hearing aid for an aging individual or enhanced voice recognition for someone with Multiple Sclerosis, while universally designed technology can mean closed captioning for the deaf or for patrons in crowded restaurants and accessibility ramps for individuals in wheelchairs or mothers with strollers.

A year ago my office was approached by a small business owner and Missouri's United Cerebral Palsy asking for support for testing of a breakthrough in Voice Recognition technology.  During my search to find an appropriate place for funding for this voice recognition technology, my staff and I became familiar with the overall government efforts in this area.

There are many significant problems in the federal government's efforts in assistive technology research and development.  My findings were validated by a recent report from the National Academy of Sciences' Institute of Medicine, Enabling America: Assessing the Role of Rehabilitation Science and Engineering, which stressed that the federal government's efforts in this area are lacking awareness, funding, and coordination.

My distinguished colleague in the House, Congresswoman Connie Morella, Chairwoman of the House Science Committee's Subcommittee on Technology, joins me today in introducing the Assistive and Universally Designed Technology Improvement Act for Individuals with Disabilities.

The Act provides federally supported incentives in all areas of assistive and universally designed technology, including need identification, research and development, product evaluation, technology transfer, and commercialization.  These incentives achieve the goal of improving the quality, functional capability, distribution, and affordability of this essential technology.

This legislation does several things.

First, the bill includes an improved peer review process at the National Institute on Disability and Rehabilitation Research (NIDRR) at the Department of Education.  This provision requires standing peer review panels and clarifies the evaluation of applications for funding of assistive and universally designed technology.  These improvements provide more assistive and universally designed technology products to the marketplace, increase small business involvement in research and development, and assure research and development efforts cover all disability groups including persons with physical and mental disabilities as well as the aging and rural technology users.

Second, the legislation augments technology transfer through improving the role of the Interagency Committee on Disability Research (ICDR) by increasing its authority, accountability and ability to coordinate.  Provisions are included for increased usage of the Federal labs to improve coordination with all Federal agencies involved in assistive and universally designed technology research and development and for providing public and private sector partnerships for assistive and universally designed technology research and development.

Third, to increase the market for assistive technology, the bill clarifies Title III of the Tech Act for the Microloan program.  This microloan program assists disabled persons in obtaining assistive and universally designed technology.

Fourth, funds are authorized for the Interagency Committee on Disability Research to hire staff and for operating costs associated with issuing surveys and reports.  Additionally, $10 million in funds are authorized for the National Institute on Disability and Rehabilitation Research to provide for assistive and universally designed technology research and development.

Finally, to increase access to assistive and universally designed technology, tax incentives are included to provide businesses a tax credit for the development of assistive technology, to expand the architectural and transportation barrier removal deduction to include communication barriers, and to expand the work opportunity credit to include expenses incurred in the acquisition of technology to facilitate the employment of any individual with a disability.

These tax incentives and micro loans will assist individuals with disabilities to obtain assistive and universally designed technology in order to improve their quality of life, to secure and maintain employment, and to assist small businesses in complying with Americans with Disabilities Act requirements, which in effect, results in lessened financial burdens on society.

As technology increasingly plays a role in the lives of all persons in the United States, in the conduct of business, in the functioning of government, in the fostering of communication, in the transforming of employment, and in the provision of education, it also greatly impacts the lives of the more than 50 million individuals with disabilities in the United States.

An agenda, including support for universal design, represents the only effective means for guaranteeing the benefits of technology to all persons in the United States, regardless of disability or age, in addition to assuring for United States industry the continued growth in markets that will warrant continued high levels of innovation and research.  This legislation has the support of many organizations, including: The Missouri Assistive Technology Advisory Council, the United Cerebral Palsy Association, the Rehabilitation Engineering and Assistive Technology Society of North America, the National Easter Seal Society, and the Association of Tech Act Projects.

The bill also has broad bipartisan and bicameral support.  My colleagues, Senator Jeffords, Senator Harkin, Senator Grassley, and Congresswoman Connie Morella have been very helpful in my efforts to improve the role of the federal government in assistive and universally designed technology.

Let me conclude by taking special note of the help of the National and Missouri United Cerebral Palsy, as well as the Missouri Assistive Technology Project, the Federal Laboratory Consortium, and the numerous assistive and universally designed technology and disability community advocate organizations, for their assistance in developing and advocating this legislation.