I’m working this evening, despite it being a holiday. The Cop’s gone to work, so it’s not that bad. Just finished reading Technofutures for the literature review we’re doing at work. Let me tell you, dictating a book report is a hell of a lot easier than typing one. I was dreading the documentation all afternoon, procrastinating as best I could, but now that I’m dictating, well, it’s actually kind of fun. The author of Technofutures mentions that he dictated portions of the book using the same voice recognition software that I use. Very amusing to read that. I’d be tempted to buy stock in the voice recognition company if the user interface were a little more sophisticated, a little more aesthetically appealing. It’s actually kind of retro, and I don’t mean that in a good way.


  1. I miss my speech recognition days. It was the most fun job I’ve ever had.

  2. Is there a lack of money/interest in that technology, Vanessa? When I went looking for software, I didn’t find many companies producing it, and when I did get the stuff I’m now using, I was rather surprised at 1) how inexpensive it is, and 2) how clunky the interface is.

    I would have expected it to be expensive software with a slick interface (a sign to me that a particular technology is “hot”).

  3. Yes and no 🙂

    There are many different things you can do with automatic processing of speech that all fall under the umbrella name of “speech recognition”.

    What you are using is dictation software. This field is NOT hot at all for several reasons:

    – people feel shy talking into a microphone
    – editing the text that has been transcribed from your voice proves really difficult. What happens if you say “blah blah blah errrr no that was a mistake errr hold on, DELETE, yeah that’s right, blah blah blah and then he told her to delete the last paragraph, NO DON’T DO THAT”?
    – people type fast these days so there isn’t a lot of need to dictate stuff anymore. Email has a lot to do with this too.
    – if you have a cold, it won’t work
    – if there is background noise, it won’t work
    – if you yell or talk faster than usual, most likely it won’t work
    – you need to train the software, which a lot of people find tedious

    I never worked in the dictation systems field, thank God.

    So what else can speech recognition be used for?

    – call centres. You get the system to play Big Brother and listen in to make sure that the employees follow a script, or that they don’t swear, or you combine the transcription software with some clever information manager software that, Google-style, will display the relevant information on their screens as they talk to the clients

    – live transcription of news, used in airports, bars, hospitals and all sorts of public spaces where you want to cater for the hearing impaired but can’t afford to have a sign language interpreter

    – lip synchronisation of animated films. Matching a cartoon character’s lip movements to real actors’ voices can be a pain. Speech processing software can find the points in time when the lips are meant to be closed, or the mouth wide open, or there is a plosive… stuff like that. We did work like this for a blockbuster once.

    – identity recognition. But in general, these systems haven’t proved that successful

    – security. Obviously I can’t go into this!

    The company I used to work for did all of the above, but we belonged to a parent company that paid the research division (us) peanuts even though some of the aforementioned stuff made them millions of pounds. So eventually I got burned and got out.

  4. Thanks for the insights! I always assume anything to do with language is cool, so I am excited about even the simplest technology for dictation. I guess it helps, too, that I love the sound of my own voice. 😉

