Talking PCs? Talk to the hand

ANALYSIS

Being able to chat with a computer in plain English has been standard fare in science fiction for decades, and yet, despite many promises from forecasters and other experts, we're still a long way from turning fantasy into fact.

Voice synthesis has been around for a long time. Bell Labs demonstrated a computer-based speech synthesis system running on an IBM 704 in 1961. The author Arthur C. Clarke saw the demonstration, and it gave him the inspiration for the talking computer HAL 9000 in his book and film "2001: A Space Odyssey".

Forty-five years later, voice synthesis technology can be found in products as diverse as talking dolls, car information systems and various text-to-speech conversion services such as the one recently launched by BT. Many of these modern systems can convert text into a computer-synthesised voice of quite respectable quality.

However, the main problem facing voice technology developers lies not in getting a computer to talk, but in getting it to listen. Voice recognition has turned out to be a much harder task than researchers realised when work began on the problem more than forty years ago. Even so, limited voice recognition applications are starting to creep into everyday use: voice-input telephone menu systems are now commonplace, speech-to-text dictaphones are increasingly used for note-taking by doctors and lawyers, and voice input has started to appear in computer games systems.

The success of some of these limited-application voice recognition systems has recently prompted the software heavyweights Microsoft and IBM to make further investments. IBM has hired more than a hundred extra speech technology researchers, with the aim of developing a system capable of matching the human level of speech recognition by 2010. And Bill Gates recently said that "we [Microsoft] aim to have computer systems capable of matching a human level of speech recognition by 2011".

If these predictions prove accurate, then within five years we could see the science fiction writers' vision of speech interaction with computers become a reality. However, there are still a lot of technological hurdles to overcome; to understand what these are, we need to delve further into the technology.

Speech synthesis
Speech synthesis, or text-to-speech (TTS), systems all consist of two parts: the front end, which converts the text file into a "symbolic linguistic representation", and the back end, which takes this symbolic representation and converts it into a speech waveform.

The front end first converts things like numbers and abbreviations into their written word equivalents to produce a normalised text. The next step is to phonetically transcribe each word, and divide the text into prosodic units such as phrases, clauses and sentences. The trouble is that text is full of words that are pronounced differently depending upon the context in which they are used, and this has required the development of sophisticated heuristic techniques that look at neighbouring words and statistics of frequency of occurrence in order to guess the proper pronunciation. The sequence of phonemes is then produced using either a dictionary or a rule-based approach.
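The front-end steps described above can be sketched in a few lines of Python. This is only a toy illustration, not how any commercial TTS product works: the abbreviation table, the tiny lexicon, the ARPAbet-style phoneme symbols, the "cue word" heuristic for the homograph "read" and the crude letter-to-sound fallback are all assumptions made up for the example.

```python
import re

# Hypothetical lookup tables, invented for illustration only.
ABBREVIATIONS = {"dr": "doctor", "mr": "mister", "st": "street"}
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def number_to_words(n):
    """Spell out an integer from 0 to 999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" and " + number_to_words(rest) if rest else "")

def normalise(text):
    """Step 1: expand abbreviations and numbers below 1000 into words."""
    words = []
    for token in re.findall(r"[a-z]+|\d+", text.lower()):
        if token.isdigit() and int(token) < 1000:
            words.extend(number_to_words(int(token)).replace("-", " ").split())
        else:
            # Larger numbers are left untouched in this sketch.
            words.append(ABBREVIATIONS.get(token, token))
    return words

# Toy pronouncing dictionary (assumed entries, ARPAbet-style symbols).
LEXICON = {
    "doctor": ["D", "AA", "K", "T", "ER"],
    "has": ["HH", "AE", "Z"],
    "two": ["T", "UW"],
}

# Homograph heuristic: pick a pronunciation from the preceding word.
HOMOGRAPHS = {
    "read": {
        "cues": {"has", "have", "had", "was"},
        "after_cue": ["R", "EH", "D"],   # past participle, as in "has read"
        "default": ["R", "IY", "D"],     # present tense, as in "I read"
    },
}

# Crude rule-based fallback: one sound per letter.
LETTER_SOUNDS = dict(zip(
    "abcdefghijklmnopqrstuvwxyz",
    "AE B K D EH F G HH IH JH K L M N AA P K R S T AH V W K Y Z".split()))

def transcribe(words):
    """Step 2: map each word to phonemes (context, then dictionary, then rules)."""
    result = []
    for i, word in enumerate(words):
        previous = words[i - 1] if i > 0 else None
        if word in HOMOGRAPHS:
            entry = HOMOGRAPHS[word]
            use_cue = previous in entry["cues"]
            phonemes = entry["after_cue"] if use_cue else entry["default"]
        elif word in LEXICON:
            phonemes = LEXICON[word]
        else:
            phonemes = [LETTER_SOUNDS[c] for c in word if c in LETTER_SOUNDS]
        result.append((word, phonemes))
    return result

if __name__ == "__main__":
    for word, phonemes in transcribe(normalise("Dr Smith has read 42 books")):
        print(word, " ".join(phonemes))
```

Even this toy version shows why the front end is hard: the single-word lookback gets "has read" right but would fail on "has never read", which is why real systems rely on richer context and frequency statistics.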

The development of the front end speech synthesis system has been...
