RemedySpot.com

voice recognition software

Guest guest

Here is a very interesting article. Its web site is

http://www.usc.edu/ext-relations/news_service/releases/stories/36013.html

Machine Demonstrates Superhuman Speech Recognition Abilities

University of Southern California biomedical engineers

have created the world's first machine

system that can recognize spoken words better than

humans can. A fundamental rethinking of a

long-underperforming computer architecture led to their

achievement.

The system might soon facilitate voice control of

computers and other machines, help the deaf, aid

air traffic controllers and others who must understand

speech in noisy environments, and instantly

produce clean transcripts of conversations, identifying

each of the speakers. The U.S. Navy, which

listens for the sounds of submarines in the hubbub of

the open seas, is another possible user.

Potentially, the system's novel underlying principles

could have applications in such medical areas

as patient monitoring and the reading of

electrocardiograms.

In benchmark testing using just a few spoken words,

USC's Berger-Liaw Neural Network Speaker

Independent Speech Recognition System not only bested

all existing computer speech recognition

systems but outperformed the keenest human ears.

Neural nets are computing devices that mimic the way

brains process information.

Speaker-independent systems can recognize a word no

matter who or what pronounces it.

No previous speaker-independent computer system has

ever outperformed humans in recognizing

spoken language, even in very small test bases, says

system co-designer Theodore W. Berger,

Ph.D., a professor of biomedical engineering in the USC

School of Engineering.

The system can distinguish words in vast amounts of

random "white" noise, that is, noise with

amplitude 1,000 times the strength of the target

auditory signal. Human listeners can deal with only

a fraction as much.

And the system can pluck words from the background

clutter of other voices — the hubbub heard

in bus stations, theater lobbies and cocktail parties,

for example.

Even the best existing systems fail completely when as

little as 10 percent of hubbub masks a

speaker’s voice. At slightly higher noise levels, the

likelihood that a human listener can identify

spoken test words is mere chance. By contrast, Berger

and Liaw’s system functions at 60 percent

recognition with a hubbub level 560 times the strength

of the target stimulus.
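
The noise ratios quoted above can be put on the decibel scale that audio engineers usually use. The sketch below is only an illustration: the function name `snr_db` is hypothetical, and it assumes that "1,000 times" and "560 times the strength" are amplitude ratios. The article says "amplitude" for the white-noise figure but does not define "strength" for the hubbub figure, so the power-ratio reading is shown as well.

```python
import math


def snr_db(signal, noise, quantity="amplitude"):
    """Signal-to-noise ratio in decibels.

    quantity="amplitude" uses 20*log10(signal/noise);
    quantity="power" uses 10*log10(signal/noise).
    """
    if signal <= 0 or noise <= 0:
        raise ValueError("signal and noise must be positive")
    if quantity not in ("amplitude", "power"):
        raise ValueError("quantity must be 'amplitude' or 'power'")
    factor = 20.0 if quantity == "amplitude" else 10.0
    return factor * math.log10(signal / noise)


# Assumption: the quoted figures are amplitude ratios of noise to signal.
white_noise_db = snr_db(1, 1000)   # about -60 dB
hubbub_db = snr_db(1, 560)         # about -55 dB

# If "strength" meant power instead, the same ratios would be less extreme.
white_noise_power_db = snr_db(1, 1000, quantity="power")  # about -30 dB

print(round(white_noise_db, 1), round(hubbub_db, 1), round(white_noise_power_db, 1))
```

Either way, the target word is far weaker than the noise surrounding it, which is the point of the comparison with human listeners.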

With just a minor adjustment, the system can identify

different speakers of the same word with

superhuman acuity.

Berger and system co-designer Jim-Shih Liaw, Ph.D.,

achieved this improved performance by

paying closer attention to the signal characteristics

used by real flesh-and-blood brains in

processing information.

First proposed in the 1940s and the subject of

intensive research in the '80s and early '90s, neural

nets are computers configured to imitate the brain's

system of information processing, wherein data

are structured not by a central processing unit but by

an interlinked network of simple units called

neurons. Rather than being programmed, neural nets

learn to do tasks through a training regimen

in which desired responses to stimuli are reinforced

and unwanted ones are not.
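
As a rough illustration of learning from examples rather than being programmed, here is a minimal sketch of the classic textbook perceptron rule. It is not the Berger-Liaw training procedure, which the article does not spell out. The function names and the AND example are assumptions chosen for illustration.

```python
def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Train a two-input perceptron with the classic error-correction rule."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            # error is 0 when the response was the desired one, so nothing changes;
            # otherwise the weights are nudged toward the desired response.
            error = target - output
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Return 1 if the weighted sum exceeds zero, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


# Hypothetical training set: the logical AND of two binary inputs.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_SAMPLES)
predictions = [predict(weights, bias, inputs) for inputs, _ in AND_SAMPLES]
print(predictions)
```

Nothing in the code states the AND rule directly; the unit arrives at it only through repeated correction on the examples.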

"Though mathematical theorists demonstrated that nets

should be highly effective for certain kinds

of computation (particularly pattern recognition), it

has been difficult for artificial neural networks

even to approach the power of biological systems," said

Liaw, director of the Laboratory for Neural

Dynamics and a research assistant professor of

biomedical engineering at the USC School of

Engineering.

"Even large nets with more than 1,000 neurons and

10,000 interconnections have shown lackluster

results compared with theoretical capabilities.

Deficiencies were often laid to the fact that even

1,000-neuron networks are tiny, compared with the

millions or billions of neurons in biological

systems."

Remarkably, USC's neural net system uses an

architecture consisting of just 11 neurons connected

by a mere 30 links.

According to Berger, who has spent years studying

biological data-processing systems, previous

computer neural nets went wrong by oversimplifying

their biological models, omitting a crucial

dimension.

"Neurons process information structured in time," he

explained. "They communicate with one

another in a 'language' whereby the 'meaning' imparted

to the receiving neuron is coded into the

signal's timing. A pair of pulses separated by a

certain time interval excites a certain neuron, while a

pair of pulses separated by a shorter or longer

interval inhibits it.

"So far," Berger continued, "efforts to create neural

networks have had silicon neurons transmitting

only discrete signals of varying intensity, all clocked

the way a computer is clocked, in beats of

unvarying duration. But in living cells, the temporal

dimension, both in the exciting signal and in

the response, is as important as the intensity."
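
The contrast Berger draws can be sketched in a few lines. This is a toy model, not the USC neuron: the function names, the 10-millisecond preferred gap and the 2-millisecond tolerance are all assumptions made up for illustration. The first function responds to the timing between two pulses; the second, like a conventional clocked unit, responds only to intensity.

```python
def interval_tuned_response(first_pulse_ms, second_pulse_ms,
                            preferred_gap_ms, tolerance_ms):
    """Return +1 (excite) when the gap between two pulses is close to the
    preferred gap, and -1 (inhibit) when it is shorter or longer."""
    gap = abs(second_pulse_ms - first_pulse_ms)
    if abs(gap - preferred_gap_ms) <= tolerance_ms:
        return 1
    return -1


def clocked_response(intensity, threshold):
    """Conventional unit for comparison: only signal intensity matters."""
    return 1 if intensity >= threshold else -1


# Hypothetical neuron tuned to a 10 ms gap, give or take 2 ms.
responses = [
    interval_tuned_response(0, gap, preferred_gap_ms=10, tolerance_ms=2)
    for gap in (5, 10, 20)
]
print(responses)  # [-1, 1, -1]
```

The same pair of pulses, at the same intensity, excites or inhibits the first unit depending only on when the second pulse arrives.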

Berger and Liaw created computer chip neurons that

closely mimic the signaling behavior of living

cells — those of the hippocampus, the brain structure

involved in associative learning.

"You might say, we let our cells hear the music,"

Berger said.

Berger and Liaw’s computer chip neurons were combined

into a small neural network using

standard architecture. While all the neurons shared the

same hippocampus-mimicking general

characteristics, each was randomly given slightly

different individual characteristics, in much the

same way that individual hippocampus neurons would have

slightly different individual

characteristics.

The network created was then trained, using a procedure

as unique as the neurons — again taken

from the biological model, a learning rule that allows

the temporal properties of the net connections

to change.
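
The two ideas above, neurons that share general characteristics but differ slightly at random, and a learning rule that changes temporal properties, can be sketched as follows. This is a loose illustration under stated assumptions, not the patented learning rule, which the article does not describe. Each neuron is reduced to a single hypothetical "preferred interval" in milliseconds, and `make_neurons` and `adapt_interval` are invented names. Only the count of 11 neurons comes from the article.

```python
import random


def make_neurons(count, base_interval_ms, jitter_ms, seed=0):
    """Create preferred intervals that share a common baseline but differ
    slightly, at random, from neuron to neuron."""
    rng = random.Random(seed)
    return [base_interval_ms + rng.uniform(-jitter_ms, jitter_ms)
            for _ in range(count)]


def adapt_interval(preferred_ms, observed_ms, rate=0.2):
    """Assumed toy rule: move the preferred interval a fraction of the way
    toward the interval observed during training."""
    return preferred_ms + rate * (observed_ms - preferred_ms)


# 11 neurons, as in the article; the baseline and jitter are made up.
neurons = make_neurons(count=11, base_interval_ms=10.0, jitter_ms=1.0)

# Train every neuron on a hypothetical target interval of 14 ms.
trained = list(neurons)
for _ in range(50):
    trained = [adapt_interval(p, observed_ms=14.0) for p in trained]

print([round(p, 2) for p in neurons])
print([round(p, 2) for p in trained])
```

What changes during training here is a timing parameter rather than a connection weight, which is the distinction the article emphasizes.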

The USC research was funded by the Office of Naval

Research; the Defense Department’s

Advanced Research Projects Agency; the National Centers

for Research Resources, and the

National Institute of Mental Health. The university has

applied for a patent on the system and the

architectural concepts on which it is based.

A demonstration of the Berger-Liaw Neural Network

Speaker-Independent Speech Recognition

System can be found online at

http://www.usc.edu/ext-relations/news_service/real/real_video.html

EM.BERGER99

University of Southern California News Service

3620 South Vermont Avenue, Los Angeles, CA 90089-2538

Tel: Fax:

Email: news_service@...

WWW: http://uscnews.usc.edu
