Speak on the Net

Voice browsing technology
subject of research grant

By Sheeta Leung

The Human-Computer Communications Laboratory

 

For many people, working with computers has not been a pleasant experience. Computers seem cold, robotic and devoid of feeling.

If computers could talk to us, these people might feel more at ease.

Dr. Helen Meng Mei-ling, associate professor in the Department of Systems Engineering and Engineering Management at The Chinese University of Hong Kong, has been working on speech and language technologies with her research team since 1999.

In December 2001, the team was awarded a $5.6 million grant from the government's Innovation and Technology Fund.

Dr. Meng will use the grant to develop universal access to Chinese Web content. "The idea was from the newly advanced wireless communications system," she said.

She explained that although portable computers let people go online wirelessly, their small screens limit how much can be displayed.

With speech and language technologies, people can get as much information from the Web as they want, regardless of screen size.

Developing voice browsing technology is another of Dr. Meng's goals.

A voice browsing system is an application of speech generation technologies. It works like a spoken version of an Internet search engine.

With a voice browsing system, users speak their enquiries and hear the search results from the Web read back to them.

She said that a voice browsing system should be able to understand a user's speech in its various forms.

Accents, and fillers such as "ahs" and "ums", are challenging for the system.

Even more challenging, the computer system needs to "speak" the retrieved results accurately.

Dr. Meng said that, as an application of speech generation technology, a voice browsing system must be user-friendly and easy to understand.

"As you know, robotic speech easily loses the listener's attention. Natural speech with rhythms and flexible sentence phrasing is important."

Although voice browsing technology is not yet mature, the research team's CU Forex hotline already applies some speech generation technologies.

CU Forex is a trilingual hotline for real-time foreign exchange enquiries. It has been operating since 1999.

Dr. Meng hopes her voice browsing system will benefit local industries. "Also it would be most useful to blind people," she said.

Dr. Meng is also working on a related project, an audio search engine, which applies speech recognition technology.

She said, "We want to develop a search engine for searching audio information on the Internet in the multi-media age."

In fact, her team was the first in Hong Kong to develop an audio-based Cantonese search engine. Similar research abroad focuses mainly on English.

According to Dr. Meng, users can "ask" computers for audio files. The audio search engine then searches the audio tracks of videos using speech recognition techniques.

One of the team's sponsors, Television Broadcasts Ltd., has provided 146 news video broadcasts for the experiment.

Dr. Meng views the future of speech and language technologies as promising. "After all, Hong Kong has a rich linguistic environment," she said.



