Voice recognition software and solutions – L&H Voice Xpress (2000)
For decades people have fantasised about talking to machines. Television and film have played their part in this, with the talking computer now very much a science fiction cliché. It is here that we find computers listening, thinking and even translating from one Martian dialect to another. If the reality is rather different, with machines needing a regular reboot to keep them sane, it’s still easy to underestimate how fast the technology is progressing.
What brings this idea to the fore is the release of version 4.0 of Voice Xpress, a brand of voice recognition software I am learning to use. As I talk as fast and as clearly as I can, the machine types the words and lets me correct them by voice control, all in a way that is entirely hands-free. The prospect of another fantasy, where you put your feet up and talk to the computer, seems almost realisable.
Voice Xpress comes from Lernout & Hauspie and can recognise continuous speech after five to ten minutes of training. Working at speeds of 130 wpm, more than the fastest typists can manage, it seduces almost without fail. Unlike discrete voice recognition packages, which get by on a less powerful computer because they process one word at a time, Voice Xpress uses the likelihood of one word following another to generate impressive results. Its contextual ‘thinking’ lets one say ‘you need something close to a 200 MHz Pentium but faster is much better’ and discover that the number and unit ‘200 MHz’ are understood and typed correctly. It takes a little more experience to stop it being just a bit too clever and interpreting ‘five to ten’ as ‘9.55’. Nevertheless, here is a tool that works well enough after a day’s use. Given the persistence to learn its canny ways, it promises to be invaluable. It bodes well for the kinds of services that will evolve from the voice recognition systems already in use in telephone banking, switchboard services and, increasingly, consumer call centres.
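To give a feel for what ‘the likelihood of one word following another’ means, here is a minimal sketch in Python. It is not L&H’s method, which the company does not spell out; it is a toy word-pair (bigram) counter, and the function names, the three-sentence corpus and the candidate words are all invented for illustration. Given two words that sound alike, it picks the one seen more often after the previous word.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each other word (hypothetical helper)."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def pick_word(counts, previous_word, candidates):
    """Choose the candidate seen most often after previous_word."""
    following = counts[previous_word]
    # A Counter returns 0 for words it has never seen, so unseen candidates lose.
    return max(candidates, key=lambda word: following[word])


# Toy training text, invented for this example.
corpus = [
    "i need to recognise speech",
    "we recognise speech every day",
    "they wreck a nice beach",
]
model = train_bigrams(corpus)

# 'speech' and 'beach' sound similar; the preceding word settles it.
best = pick_word(model, "recognise", ["speech", "beach"])
print(best)
```

A real recogniser would weigh such word-pair likelihoods against how well each candidate matches the sound itself, and would learn from far more text than three sentences.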
Lernout & Hauspie’s product is but a shop window onto a store of language technology solutions for the most diverse of markets. It is here we find a slew of tools for automating proofreading, summarising and even translation.
At Yahoo.com, the website most famous for its search features, L&H has signed a deal to use human and machine translation to localise the services into languages such as Swedish, Japanese and Korean. As well as a search engine, the Yahoo site also offers mail, auctions and finance, and it’s significant that language technology is now at work helping to roll out a service on a global scale with great speed.
The Belgium-based company developed ‘iTranslator’, machine translation software that enables on-the-fly translation of text on web pages. Its claim is to be able to publish information on a web site in multiple languages, and to translate web pages as you surf. As a corporate solution it also deals with tricky stuff such as multilingual searching and summarisation. With specialised lexicons for law, finance, medicine and many other industries, here is some software that clearly likes a challenge. For the professional, there’s the shrink-wrapped Power Translator Pro, which turns document text into draft-quality translations between English and French, German or Italian. What’s more, it translates as fast as seventeen words a second.
For the television, animation and games industries L&H offers SpeechSynchToolkit, an offline development tool that matches the sounds of spoken language with movements of the mouth. Using speech recognition technology based on phonemes, the individual elements of speech, it provides a graphical editing environment that aligns mouth positions with spoken text. Another developer’s solution is TTS 3000/M, which turns computer text into intelligible, human-sounding synthetic speech. Based on algorithms that store actual human voice segments, and using L&H’s language analysis, it offers intelligent pronunciation of text without the nasal, machine-like reading we’ve become accustomed to.
Yet more products, for consumers and developers, focus on the voice control of computer applications. A good measure of the fun of this can be found in the professional version of Voice Xpress, which has countless commands that allow one to open, save and print documents and also to control Microsoft PowerPoint, Excel, Internet Explorer and even Outlook, the email, address book and diary application. All sorts of formatting commands, like ‘bold that’, ‘make it red’ or ‘make it 16 point Arial’, can actually be spoken. A macro language allows the creation of single commands that call up a particular Web site or open Microsoft Outlook with a ready-addressed e-mail message. With practice, patience and a reasonably quiet environment, some of those science fiction fantasies are excitingly close. Were I leaving school just now, I’d think carefully about training as a stenographer or an interplanetary translator.
Lernout & Hauspie (Ireland) Ltd, 71-73 Rock Road, Blackrock, Co Dublin.
Product: Voice Xpress comes in Standard, Advanced, Professional and Mobile editions. Common features include a 260,000 word UK vocabulary; XpressStart training in around five minutes; integration with Microsoft Office. Recommended systems include Pentium II with 96 MB memory and 200 MB hard disc space. Prices start at UK £39.99. Web: www.lhsl.com