Mozilla's Common Voice: The immediate future of human-machine interaction lies in voice control, with smart speakers, home appliances and phones listening for spoken commands and carrying them out.
However, voice assistants such as Amazon's Alexa and Apple's Siri are built by overwhelmingly white, male development teams, and they reflect those teams' biases.
For example, if you speak with an unfamiliar accent, or English is not your native language, chances are the assistant will never understand what you are asking for.
To solve this issue, Mozilla, a free software community, launched "Common Voice" in 2017, a tool that gathers voice recordings into datasets to build a different kind of AI, one that represents the global population, not just the West.
Common Voice works by publicly releasing an ever-growing dataset, so any company can use the data to research, build and train its own voice applications, improving voice recognition for everyone, regardless of language, gender, age or accent.
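To give a rough idea of how such a release can be consumed, here is a minimal sketch of filtering clip metadata by accent. The tab-separated layout and the column names (path, sentence, age, gender, accent) are assumptions modeled on how Common Voice has shipped its metadata; check the actual release for the exact schema.

```python
import csv
import io

# Hypothetical sample mimicking a Common Voice-style metadata TSV.
# Real releases ship files like this alongside the audio clips;
# the exact columns and values here are illustrative assumptions.
SAMPLE_TSV = """path\tsentence\tage\tgender\taccent
clip_0001.mp3\tThe quick brown fox.\ttwenties\tfemale\tindia
clip_0002.mp3\tHello world.\tthirties\tmale\tus
clip_0003.mp3\tGood morning.\tfourties\tfemale\tindia
"""

def clips_by_accent(tsv_text, accent):
    """Return (audio path, transcript) pairs for clips tagged with an accent."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [(row["path"], row["sentence"])
            for row in reader
            if row["accent"] == accent]

# Select only the clips recorded with an Indian accent, e.g. to balance
# a training set that is otherwise dominated by Western speakers.
indian_clips = clips_by_accent(SAMPLE_TSV, "india")
```

A team training a speech-to-text engine could use this kind of filter to oversample under-represented accents before training.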
Currently, the dataset holds more than 2,400 hours of voice data in 29 languages, including English, French, German, Chinese and Kabyle.
"Existing speech recognition services are only available in languages that are economically profitable," Kelly Davis, head of Mozilla's Machine Learning, told TNW.
Speech is becoming a preferred way to interact with technology, helped along by the growth of services such as Amazon's Alexa and the Google Assistant.
"These voice assistants have transformed the way we communicate with technology. However, the innovative potential of this technology is largely untapped, because developers, researchers and startups around the world working on voice recognition face one problem: the lack of voice data in many languages for training speech-to-text engines," Davis explains.
Although Davis believes that voice AIs are slowly beginning to improve, they are far from where they need to be. At the end of 2017, Amazon added an Indian-English accent to Alexa, allowing her to pronounce Indian phrases and understand some nuances of Indian speech.
But the voice assistant is still firmly Western: six of the seven languages it supports are European.
At the beginning of 2018, Google added Hindi support to its voice assistant, but the capability was limited to a few queries. A few months after the initial release, Google updated the feature, and the Assistant can now converse in Hindi, the third most spoken language in the world.
"To a large extent, efforts to address the AI gap have fallen into non-corporate hands," Davis said.
For example, Black in AI, a project looking for ways to integrate non-Western voices into AI, was started by former Google employees in 2017.
However, it did not begin as a formal extension of the company's work; rather, it began as an effort to address what they saw as a pressing need in the community.
Davis argues that too few people benefit from voice recognition technology right now.
"Think of how speech recognition could be used by minority language speakers to allow more people to access the technology and services that the Internet can offer, even if they have never learned to read."
"The same is true for people with visual impairments or the disabled, but today's market doesn't seem to be able to help them."
The Common Voice project hopes to accelerate data collection in all languages and around the world, regardless of accent, gender or age.
"By making this data available - and by developing a speech recognition mechanism (the Deep Speech project) we can empower entrepreneurs and communities to tackle existing gaps," Davis added.
If you want to help diversify the Common Voice project's voice recognition, make a recording by reading the suggested sentences, or listen to other people's recordings and verify that they are accurate.