Google's annual Summer of Code.
Following Lydia's recommendation on the mailing list, I've decided to use this blog, for the remainder of the application period, to showcase some ideas for simon that have not yet been taken by any student. If you'd like to implement one of these ideas, please feel free to send me a mail at grasch ate simon-listens ° org.
The first idea that is still up for grabs is simon's Voxforge integration. Voxforge is an ambitious project to create free (GPL) speech models for everyone. With the current Voxforge models, simon can already be used without any training at all. Just download simon and the appropriate model for your language from the Voxforge website and start talking to your computer.
This works because the Voxforge models have been trained with lots and lots of voice recordings from people around the world. The resulting model is speaker-independent and works quite well for most people. If you need even more accuracy, just adapt the general model to your voice with a couple of training sessions and you are ready to go.
The current Voxforge model for English is quite good for command and control but nowhere near powerful enough for dictation. The models for other languages consist of even fewer samples. In the last five years, 624 identified users submitted voice recordings for the English model. Only 50 identified people submitted recordings for the German Voxforge model.
I think this is primarily because voice donations (made through the Java applet on the Voxforge homepage) only come from those who are actively searching for ways to improve open source speech recognition. There is also no immediate payoff for the donors.
simon, on the other hand, reaches a wide array of people interested in open source speech recognition: more than 24,000 in the past 12 months.
Many of those users train simon to get the most out of their system. But those training samples never get submitted to Voxforge to improve the general model, because there is no easy way to do that.
I propose to implement an easy-to-use uploading system that allows users to submit their training samples directly to the Voxforge corpus with the press of a button.
Together with an automatic download of the Voxforge model for a selected language when simon is launched for the first time, this means that simon users can:
1. Get started with the general model even more easily, because they don't have to download it manually.
2. Train their model locally if the recognition rate is too low (and in our experience they often will).
By submitting the samples recorded for local training back to Voxforge, they not only contribute valuable recordings; more often than not, they contribute exactly those recordings that train words the previous Voxforge model couldn't recognize.
And because users can immediately see whether their samples are helping or hurting (by checking whether the recognition rate improves locally), the generated submissions should be of fairly high quality. There is even an immediate advantage for the end user: their own recognition rate improves.
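As a rough illustration only, the packaging half of such an upload button might look like the following Python sketch. The function name, the archive layout, and the format of the prompts file are assumptions made for this example, not simon's or Voxforge's actual interface, and the upload step itself is left out because the submission endpoint is not specified here.

```python
import io
import tarfile
from pathlib import Path


def package_samples(samples, archive_path):
    """Bundle recordings and their transcripts into one .tar.gz archive.

    `samples` maps the path of each recording to the text that was spoken.
    Hypothetical helper: the layout below is an assumption, not a spec.
    """
    paths = sorted(str(p) for p in samples)
    by_path = {str(p): text for p, text in samples.items()}

    # One line per recording: "<file name without extension> <transcript>"
    lines = [f"{Path(p).stem} {by_path[p]}" for p in paths]
    prompts = ("\n".join(lines) + "\n").encode("utf-8")

    with tarfile.open(archive_path, "w:gz") as archive:
        # Write the transcript list from memory, without a temporary file
        info = tarfile.TarInfo(name="prompts")
        info.size = len(prompts)
        archive.addfile(info, io.BytesIO(prompts))
        # Store every recording under a common wav/ directory
        for p in paths:
            archive.add(p, arcname=f"wav/{Path(p).name}")
    return archive_path
```

A single archive per submission keeps each recording together with the sentence it contains, which is what a corpus needs to make use of the audio.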
If you are interested in working on this proposal, please contact me at grasch ate simon-listens ° org.