Increasingly, our voices are not only used to communicate with one another, but also with technological devices - like your mobile phone or computer. This is something that no doubt many of you do on a daily basis - you may ask your computer to search for something online, or speak to your phone to get directions. Many homes too now have some form of voice assistant, like the Amazon Alexa or the Google Assistant. These devices can assist us with everything from scheduling meetings, to controlling security measures in our homes. So it’s really important that these devices can understand what we’re saying. But, if our voices are so diverse, and constantly subject to variation, how can these devices understand us at all?
According to a recent study by Rachael Tatman, the technology used by YouTube to generate captions of what people say in videos can sometimes really struggle. It seems that this speech recognition system finds it much more difficult to accurately recognise what someone is saying when dealing with regional varieties of English. What’s more, this system also performs worse on female speech. In other words, there are broad groups of people whose speech is not being accurately interpreted by the system. Perhaps this is something you yourself have experienced in other areas of your life - for instance, searching for one thing online only to be directed to something else entirely. Research findings like these have serious implications wherever automatic speech recognition technology is used.
It’s now increasingly common for banks to ‘validate’ your identity over the phone. They do this by comparing a sample of your voice that you’ve provided earlier to the voice of whoever tries to access your online banking. If the system is not fully accurate, someone might struggle to prove their own identity - or, even more worryingly, someone else could gain access to their private information. For our purposes in this course, another thing to consider is what these findings tell us about the attitudes of the designers of voice technology. It’s possible that the makers of this technology have not considered just how much language varies from person to person or across social groups.
Perhaps these devices were only ever tested on a fairly restricted group of speakers? In future, there is a need for sociolinguists to work with the designers of voice technology products, to better understand the attitudes which might ultimately affect the usefulness or trustworthiness of these devices.
Impact of language attitudes on the design of voice technology
How do views on language and accent impact the design of voice technology?
Watch Dr Dominic Watt explain more - feel free to share your thoughts below.
© University of York