As an Ambassador, I had the opportunity to apply for a loaned Nokia N950 to develop / port applications to MeeGo/Harmattan. I took Nokia up on their offer and the result is simone - a trimmed down, mobile version of simon. In other words: "simon embedded" or "simone".
The client features push to talk or automatic voice activity detection (configurable) and, because of simon's client / server architecture, uses little power on the device itself. Even with voice activity detection running, you should get many hours of continuous speech recognition out of a single charge.
simone can replace the headset of a "full" simon installation, but it also includes a couple of default actions on the device itself. For example, you can use a voice-controlled quick dial feature or start / stop turn-by-turn navigation.
For more information and a live demo, have a look at the YouTube demonstration:
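To illustrate what automatic voice activity detection means in practice, here is a minimal Python sketch of a level-based detector. It is not simone's actual implementation (simone itself is written in Qt); the function names `frame_rms` and `detect_speech`, the RMS threshold and the `hangover` parameter are all assumptions chosen for illustration.

```python
import math


def frame_rms(samples):
    """Root-mean-square level of one frame of samples (floats in [-1, 1])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def detect_speech(frames, threshold=0.1, hangover=2):
    """Return (start, end) frame index pairs, end exclusive, judged to be speech.

    Hypothetical scheme: a segment opens on the first frame whose RMS level
    reaches `threshold` and closes once more than `hangover` consecutive
    quiet frames follow, so short pauses do not split an utterance.
    """
    segments = []
    start = None  # index of the first frame of the open segment, if any
    quiet = 0     # consecutive quiet frames seen inside the open segment
    for i, frame in enumerate(frames):
        if frame_rms(frame) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover:
                # Close the segment right after its last loud frame.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Input ended while a segment was still open.
        segments.append((start, len(frames) - quiet))
    return segments
```

With the default parameters, three quiet frames followed by four loud ones, five quiet ones and two loud ones yield the segments (3, 7) and (12, 14). In a client / server setup like the one described above, only such segments would need to be sent to the server, which is one plausible reason the client can stay light on power.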
If you can't see the embedded video, try this direct link.



15 comments:
Very nice!
I assume it is written in Qt? Would it be hard to port it to other Nokia devices, like the N8? I am sure many users would appreciate simone.
Thanks
Yes, it's written in Qt (the interface is QML).
It should be fairly trivial to port it to a Symbian device...
You can find the code on our repository:
http://speech2text.git.sourceforge.net/git/gitweb.cgi?p=speech2text/speech2text;a=tree;f=simone;h=721c40801c714bc1f5c600031ee73709655f9680;hb=refs/heads/sound
Best regards,
Peter
Great work as always! Thumbs up Peter
Cool, I'll try to check out the code and install it on my laptop today.
Also, I'll discuss using Simon with Plasma Active on tablets with the Plasma Active team :)
@Shantanu: Yes, I heard someone talking about simon on Plasma Active at the Desktop Summit as well (although that was mostly joking, I guess). If you do bring it up, please keep me in the loop, I'm quite interested in this!
Hi Peter,
I don't suppose you know how this compares to Siri, built into the iPhone 4S?
Is it potentially as sophisticated, or is it miles from reaching parity?
I noticed the project's been around for a few years now.
Thanks for any time you can spare.
All the best.
Wow, this is great ... can I get a .deb somewhere, or do you have a repo? I would love to try it :)
@Jed: Siri and simon pursue different goals. The commands on the phone are really just a little extra for the client - mostly it's going to be used as an input node for a larger simon setup.
@Anonymous: There is no released deb yet as there are still some bugs left and I'm swamped with University right now. But if you want to try it out, I can simply send you a test build to use at your own risk. Just e-Mail us at support at simon-listens.org
Hi Peter,
Thanks for the feed-back!
I don't understand your explanation about the differences.
Could you possibly explain in more detail?
Sorry for the delay in my response.
Unfortunately I didn't get an email once you replied.
All the best.
@Jed: The main difference is that simon allows (and to a certain extent expects) the user to train / adapt the model for his own use. As such, simon is a more personal solution and, because of this, can't recognize such a vast array of words (the more generalized a model is, the more training data is available for it, which makes it easier to recognize more words).
Because the recognition cannot handle "free" text (or at least a vocabulary large enough that it looks like free text), we can't offer the same natural interaction patterns that Siri can.
We can, however, offer language- and dialect-independent recognition.
Keep in mind that most of our target audience has speech impairments and / or pronunciation that differs from the norm (often the case for e.g. elderly people).
I hope that answers your question. If you'd like to find out more about simon, feel free to also get in touch with us by mail at support simon-listens°org.
Best regards,
Peter
@Peter,
Thank-you so much for the in-depth explanation!
That's unfortunate...
I was very excited when I saw the beta version for Maemo/MeeGo.
I was really hoping we might be getting something like Siri.
Do you know of any F/OSS projects that are more like Siri?
I searched everywhere, but all I could find was Simon.
Also....
When do you expect to finish Simon for Maemo6x?*
Do you intend to follow the SwipeUX guidelines?
Will it be as functional as the desktop version?
Thanks again!
*meego-harmattan
On Saturday 03 December 2011 17:55:52 you wrote:
> Do you know of any F/OSS projects that are more like Siri?
No, sorry. Creating the general, large-vocabulary speech model that would be required for such a project is very, very time consuming and costly.
You can use Google's API (and their internal model) but that's hardly F/OSS...
> When do you expect to finish Simon for Maemo6x?
It really just requires some bug fixing and polishing, but as it depends on the git version of simon, it's not really "ready" until the next simon version is released. That's also the reason why it's not in the store (not even marked as beta or something).
So really it's a matter of asking when the next simon version is going to be ready and that's hard to answer. As soon as the features that are currently being developed (context dependence mostly) are in, I really, really want to get a release out the door. But that requires a lot of testing, documenting, etc. so it'll take a while, I'm afraid.
If you want to try it right now, I can provide you with a deb, though. But be warned: it requires experimental software and is still fairly experimental itself :)
> Do you intend to follow the SwipeUX guidelines?
I've read and tried to adhere to Nokia's UI guidelines, yes (for example the switch / checkbox distinction). If you spot something that's inconsistent, please let me know!
> Will it be as functional as the desktop version?
No. What you saw in the video is pretty much all that's going to be available on the device. I don't think it makes a lot of sense to provide model training and grammar configuration on a smartphone...
Best regards,
Peter
["No, sorry. Creating a general, large vocabulary speech model that'd be required for such a project is very, very time consuming and costly."]
Aw, bummer man! ;-P
["You can use Google's API (and their internal model) but that's hardly F/OSS…"]
What's it called? I'll look into it a bit more…
["So really it's a matter of asking when the next simon version is going to be ready and that's hard to answer. As soon as the features that are currently being developed (context dependence mostly) are in, I really, really want to get a release out the door. But that requires a lot of testing, documenting, etc. so it'll take a while, I'm afraid.
if you want to try it right now, I can provide you a deb, though."]
I will gladly test it, just as soon as the White N9 is available in Australia.
It's like some kind of damn rare unicorn right now!
It's trickling out to stores in Finland now, so hopefully other countries will follow very soon.
It better be here by xmas, or I won't be having a white one (I'm in Qld Australia, so that's rare).
["I've read and tried to adhere to Nokia's UI guidelines, yes (for example the switch / checkbox distinction).
If you spot something that's inconsistent, please let me know!"]
I will, just as soon as I can compare a White 64GB to a Black 64GB, then I'll buy, & then the testing/hacking will begin!
["No. What you saw in the video is pretty much all that's going to be available on the device."]
Plus the "context dependence" you're currently finishing for Simon?
["I don't think it makes a lot of sense to provide model training & grammar configuration on a smartphone…"]
LOL, I don't even know what that is, so I guess it shouldn't bother me!
Thanks mate!
Hi Jed!
It looks like the voice API of Google's Android isn't really public or intended for general use. Sorry for getting your hopes up. I only had a quick look, though, so I might have missed something - and even then I found a couple of unofficial hacks to use it through the HTML5 element in Chrome. Still, probably not the best way to go...
If you get your N9 and we still haven't had time to release it by then, feel free to ask for a deb through support simon-listens.org!
The context dependence is intended for the desktop version only at this point. As the grammar won't be large enough, I don't think it really makes sense on the device. If we start to see larger applications on the phone that would like to take advantage of that (speech recognition as a service), it would of course be entirely possible to add that later on.
Best regards,
Peter
Thanks Peter, I will be in touch!
Could be a while....
The White N9 is like a unicorn in Australia :(
Season's well wishes.