Monday, 13 September 2010

Application centric speech recognition for your desktop: simon 0.3.0 released

Version 0.3.0 of the open source speech recognition system simon has been released. It introduces the all-new scenario system, which lets you build your own customized speech recognition system with just a few mouse clicks.

With simon you can control your computer with your voice. You can open programs, URLs, type configurable text snippets, simulate shortcuts, control the mouse and keyboard and much more.

simon's architecture means it is not bound to a specific language and can be used with any dialect. It is also specifically designed to handle speech impairments, which makes simon a viable alternative to conventional input methods, especially for physically disabled people and senior citizens.

simon is based on the open source large vocabulary continuous speech recognition engine Julius.

New in simon 0.3

simon 0.3 introduces an application centric approach to speech recognition by using packaged use cases of the speech recognition called "scenarios". Scenarios contain the complete configuration for one specific task like controlling Firefox or using the voice controlled on screen keyboard. These scenarios can then be shared with other simon users and are collected in a central online repository which can be accessed directly from within the application.

Besides the scenario system, the new version lets you not only create your own model through training but also use an existing acoustic model (base model) to get started even quicker - entirely without training. If you want more control or would like to improve recognition accuracy, personalized training is possible through the optional HTK (not included in simon due to license restrictions). simon then offers to adapt the base model in use to your own voice or to create a new model entirely from scratch.

Additionally, we have been working hard to make simon even easier to use. Some of the more notable results of these efforts are the new introductory wizard that guides you through the initial setup and the speech model generation adapter that automatically fixes a wide variety of common beginners' mistakes for you.

Furthermore, simon 0.3 introduces three new applications to the suite. Sam, an acoustic modeling tool, is geared towards professionals who want to tinker with their speech model and get the best recognition out of it. It is also a great tool to create and test large models which can then be distributed as base models for other simon users. Creating base models also requires a lot of speech data, which can be easily collected through the newly introduced combination of ssc and sscd. ssc stands for simon sample collector and is the client to the sscd server. Together they provide a powerful, cross-platform tool to collect samples from lots of different speakers - even allowing you to record with multiple microphones and/or sound cards simultaneously.

Demonstration




Readers of the RSS feed: Watch it on YouTube

Download

You can download simon 0.3 as a source archive, but there are also packages available for Windows, openSUSE and Ubuntu on our SourceForge page. Up-to-date installation instructions are available on the simon listens wiki.

Comments:

Anonymous said…

Great!
Thanks for this fantastic app.

Edulix said…

Hello, I just want to say that Simon seems to be a very nice and useful application. When I get the time I'll dig into the code =)

Anonymous said…

wow, nice to see how simon is growing. The video is really great and simon seems pretty useful in its current version.

marco said…

I have a big problem with simon: it can't recognize the audio input/output device. I'm using KDE 4.5.1.

Peter Grasch said…

@all: Thanks for the encouragement. I really appreciate it.

@marco: Are you Mar91 from kde-apps? I'm gonna assume you are.

If the device doesn't show up in either the Phonon configuration or in simon (which are two very different things), this is likely an ALSA problem.

Does the device show up in arecord -L?
If it does: Please file a bug on http://bugreports.qt.nokia.com/secure/Dashboard.jspa against QtMultimedia (used by simon for the low-level audio I/O).
If it doesn't: You might want to contact your distribution / ALSA about your hardware.
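
For anyone who prefers to script that check, here is a minimal sketch in Python. It assumes the usual `arecord -L` layout, where each device name starts in the first column and its description lines are indented. The function names and the sample text are made up for illustration and are not part of simon.

```python
import subprocess


def parse_arecord_devices(output):
    """Return the device names found in `arecord -L` style output."""
    devices = []
    for line in output.splitlines():
        # Device names start in column 0; description lines are indented.
        if line and not line[0].isspace():
            devices.append(line.strip())
    return devices


def list_capture_devices():
    """Run `arecord -L` (requires alsa-utils) and return the device names."""
    result = subprocess.run(
        ["arecord", "-L"], capture_output=True, text=True, check=True
    )
    return parse_arecord_devices(result.stdout)


# Illustrative sample only; real output depends on your hardware.
SAMPLE = """null
    Discard all samples (playback) or generate zero samples (capture)
default:CARD=Intel
    HDA Intel, ALC269 Analog
    Default Audio Device
"""

if __name__ == "__main__":
    print(parse_arecord_devices(SAMPLE))
```

If your microphone is missing from the list returned by `list_capture_devices()`, the problem most likely sits below simon, in ALSA or the driver.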

If you need further help, please contact me at grasch ate simon-listens.org.

Best regards,
Peter

ffejery said…

I haven't tried this yet, but I have to say that the "Scenario" concept is one of the best uses of GHNS I have yet seen. This really shows the power of the Open Source community.
*downloading now*

Anonymous said…

I really like where Simon is going. Even better that it will become part of the KDE SC, hopefully already with 4.6.

What Simon needs, in my eyes, is more testing and publicity. Being part of KDE will ensure this.

It's a great program. Cheers to all its contributors!

Anonymous said…

Hi there,

I think this application looks fantastic.

If I use this with my Home Theatre PC, will Simon filter out the sound being output by the PC through my speakers?

For example, will Simon work while I am listening to music?

Peter Grasch said…

> I think this application looks fantastic.
Thanks :)

> For example, will Simon work while I am listening to music?
Well, simon has no special pre-processing noise-canceling filters, if that is what you mean. However, we have had several reports that specifically pointed out how well Julius copes with background noise. This is also in line with our personal experience.

Julius also provides noise removal using spectral subtraction (see sscalc here: http://julius.sourceforge.jp/refman/julius.html.en). You can change the Julius options simply by editing your .jconf file in ~/.kde/share/apps/simond/models/default/.
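
If you would rather script the change than edit the file by hand, the following Python sketch shows one way to do it. It assumes the .jconf file holds one option per line and uses `#` for comments. The helper name and the sample content are hypothetical, and you should check the Julius manual to see which input types each spectral subtraction option applies to before relying on it.

```python
def set_jconf_option(text, option, value=None):
    """Return jconf text with `option` set, replacing an existing line or appending.

    Assumes one option per line; pass value=None for flags without an argument.
    """
    new_line = option if value is None else f"{option} {value}"
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # Ignore anything after '#', then compare the first token on the line.
        tokens = line.split("#", 1)[0].split()
        if tokens and tokens[0] == option:
            lines[i] = new_line
            break
    else:
        # Option not present yet: add it at the end of the file.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical excerpt; your generated .jconf will look different.
SAMPLE = "# hypothetical excerpt\n-input mic\n-ssalpha 2.0\n"

if __name__ == "__main__":
    print(set_jconf_option(SAMPLE, "-sscalc"))
```

The function only transforms text, so you can read the file, inspect the result, and write it back yourself after making a backup copy.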

If you require further help, you can always contact us by mail at support ate simon-listens dot org.

Best regards,
Peter