- Dec 31, 2001
So, some of y'all may remember me showing off a little video of song selection in Winamp by way of a dialogue system back from 2 years ago or so...
Well, I thought I'd give ATOT people a sneak peek of what's to come (I started a company off of this work). The current version of the application is 10 times faster than the version you all saw two years ago, we can now do backtracking, and best of all, it uses 1/25th the RAM of the old version and runs 24/7, whereas the old version used to crash within an hour.
The new demo, talking to your stereo
Oh, and as for these Gracenote people who seem to have announced that they're trying to do the same thing... well, one of my buddies knows one of the higher-ups at Gracenote... and we'll be talking soon.
Oh, and this will be working for your iPod and car navigation system soon too.
----------------------
Features you'll see in the video if you look closely:
1. Resolving confusion. There are a couple of times I ask for a song by the wrong artist, so the system prompts me with the song I asked for PLUS all the songs by the artist I named. There is another example where it prompts me because I have two songs with the same title by different artists (yes, I know Roger Waters is ex-Pink Floyd, but that is a live version by him).
2. Dealing with lots of noise... Some parts of the video are really noisy. When I ask for the Beatles song I do have to repeat myself once, but the system doesn't get a single utterance wrong! This is on a database of over 1000 songs. I can't stand that text-to-speech voice for too long either; thankfully we can tell it to shut up. There WILL be better text-to-speech voices in the commercial product.
3. The system can tutor you on how to use it when it launches. A "dialogue" can also be used on launch to set up user preferences.
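For the curious, here is a rough Python sketch of the "resolving confusion" behavior from item 1. This is NOT the actual system, just a toy illustration; the `Song` class, the `resolve` function, and the tiny `LIBRARY` are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Song:
    title: str
    artist: str


# Made-up sample library, only for illustration.
LIBRARY = [
    Song("Comfortably Numb", "Pink Floyd"),
    Song("Comfortably Numb", "Roger Waters"),
    Song("Yesterday", "The Beatles"),
    Song("Money", "Pink Floyd"),
]


def resolve(library, title, artist=None):
    """Return (action, candidates): "play", "ask", or "not_found"."""
    by_title = [s for s in library if s.title.lower() == title.lower()]

    if artist is None:
        if len(by_title) == 1:
            return "play", by_title
        # Same title by several artists: ask the user which one.
        return ("ask", by_title) if by_title else ("not_found", [])

    exact = [s for s in by_title if s.artist.lower() == artist.lower()]
    if len(exact) == 1:
        return "play", exact
    if exact:
        return "ask", exact

    # Title and artist don't go together: offer the song that was asked
    # for PLUS everything by the artist that was named.
    by_artist = [s for s in library if s.artist.lower() == artist.lower()]
    candidates = by_title + by_artist
    return ("ask", candidates) if candidates else ("not_found", [])
```

So `resolve(LIBRARY, "Yesterday", "Pink Floyd")` comes back with an "ask" and a list containing the Beatles song plus both Pink Floyd tracks, which is the kind of prompt you see in the video.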
------------------
Other features we have now, but not shown in this video:
1. Nesting of queries. If I say "Play foxtrot" and the response comes back with a lot of results, I can then say "Frank Sinatra" and it will narrow the query to "foxtrots by Frank Sinatra."
2. Backtracking. You can say "scratch that" or "I didn't mean that..." or other phrases of that type to undo an action. Backtracking isn't included in music selection due to the simple nature of the task (as compared to car navigation).
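If it helps to picture those two features, here is a toy Python sketch of narrowing a query turn by turn and undoing the last turn. Everything in it (the `Conversation` class, the `say` and `undo` methods, the sample tracks) is hypothetical and only for illustration. It also puts backtracking on top of music selection, which the real product doesn't do, simply because that keeps the example short.

```python
class Conversation:
    """Keeps a stack of query filters so each turn can narrow or undo."""

    def __init__(self, library):
        self.library = library
        self.history = [{}]  # stack of filter dicts; the last one is active

    def say(self, **constraint):
        # Nesting: merge the new constraint into the active filters.
        self.history.append({**self.history[-1], **constraint})
        return self.results()

    def undo(self):
        # Backtracking ("scratch that"): drop the most recent constraint.
        if len(self.history) > 1:
            self.history.pop()
        return self.results()

    def results(self):
        filters = self.history[-1]
        return [
            track for track in self.library
            if all(track.get(key) == value for key, value in filters.items())
        ]


# Made-up tracks, only for illustration.
TRACKS = [
    {"title": "Track 1", "artist": "Frank Sinatra", "genre": "foxtrot"},
    {"title": "Track 2", "artist": "Some Other Artist", "genre": "foxtrot"},
    {"title": "Track 3", "artist": "Frank Sinatra", "genre": "ballad"},
]
```

With this, `say(genre="foxtrot")` returns two tracks, a follow-up `say(artist="Frank Sinatra")` narrows that to one, and `undo()` goes back to the two foxtrots.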
-----------------
How's it work? It does lots of really complex semantic parsing to determine your sentence structure, and it keeps track of what you said before, too. We are the parser, not the speech recognizer.
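To give a flavor of "parse the sentence and remember the previous turn", here is a deliberately tiny Python sketch. The real parsing is far more involved than a few regular expressions; the patterns, the `parse` function, and the frame layout below are all assumptions made up for this example. It takes already-recognized text as input, since the speech recognizer is a separate piece.

```python
import re

# Hypothetical patterns: each maps recognized text to an intent.
PATTERNS = [
    (re.compile(r"^play (?P<title>.+) by (?P<artist>.+)$", re.I), "play"),
    (re.compile(r"^play (?P<title>.+)$", re.I), "play"),
    (re.compile(r"^(scratch that|i didn't mean that)$", re.I), "undo"),
]


def parse(utterance, context=None):
    """Turn recognized text into a frame, using the previous frame if given."""
    text = utterance.strip().rstrip(".!?")
    for pattern, intent in PATTERNS:
        match = pattern.match(text)
        if match:
            return {"intent": intent, **match.groupdict()}
    # A bare fragment such as "Frank Sinatra" only makes sense relative to
    # the previous turn, so treat it as an artist for the pending request.
    if context and context.get("intent") == "play":
        return {**context, "artist": text}
    return {"intent": "unknown", "text": text}
```

For example, parsing "Play foxtrot" and then passing that frame as `context` when parsing "Frank Sinatra" yields a single frame with both the title and the artist filled in.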
Cliff notes of above
The system really rocks because it uses semantic parsing and keeps track of the state of the conversation.
Go download the video and let me know if you want to be a beta tester in the near future.