Announcing what's coming to your car stereo (and iPod, and car navigation system) in 2007...


ZeroEffect

Senior member
Apr 25, 2000
916
1
0
the voice is hard to understand at times...

keeping track of songs she listed was hard too...

she sounds like the female version of the voice on Radiohead's
song Fitter Happier ;)

But the execution of the commands is flawless.

good job though, you're on to something here! Good Luck!!
 

ajayjuneja

Golden Member
Dec 31, 2001
1,260
0
76
Originally posted by: logic1485
big :thumbsup: on the software, but one question, or rather comment/quirk: are you going to be using the LH voice modules on your piece of software? i wouldn't mind if you do, but it's just that it's hard to understand it sometimes...also, rather annoyingly robotic, but i guess there's nothing you can do about that, is there?

So, to answer some of your questions: no, that text-to-speech voice won't be used in commercial releases. There is another startup from my school called Cepstral (www.cepstral) that specializes in making text-to-speech voices, and we'll probably be using several of theirs, which sound better than "LH Michelle."

As for the voice automatically shutting up while I talk, that's what that "don't talk" button is for. I routinely tell it to shut up in normal use; I let it keep talking on purpose for this video.

Since the recognizer was in an "always listen" mode rather than a "push to talk" mode (this is a user setting; I've noticed many people like push to talk because it reminds them that they are talking to a computer), I can't automatically lower the music volume when you're talking. In push-to-talk mode I certainly can, and that's one of the advantages of that mode.
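
A minimal sketch of how push-to-talk ducking could work, assuming the media player exposes a settable volume. `FakePlayer`, `push_to_talk`, and the 0-100 volume scale are hypothetical names and values chosen for illustration; they are not Winamp's API or the code behind the demo.

```python
from contextlib import contextmanager


class FakePlayer:
    """Hypothetical stand-in for a media player with a settable volume (0-100)."""

    def __init__(self, volume=80):
        self.volume = volume


@contextmanager
def push_to_talk(player, duck_to=20):
    """Lower the music while the talk button is held, then restore it."""
    previous = player.volume
    # Never raise the volume: only duck if the music is louder than duck_to.
    player.volume = min(previous, duck_to)
    try:
        yield player
    finally:
        # Restore the old level even if recognition raises an error.
        player.volume = previous


player = FakePlayer(volume=80)
with push_to_talk(player):
    ducked_volume = player.volume  # music is quiet while the user speaks
restored_volume = player.volume
```

The button press gives a clear start and end for the command, which is exactly what an "always listen" recognizer lacks.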

You could easily select by album too, but since I didn't have Album information for all of my songs, I left that category out of the database.

As for keeping track of songs, I have a debug window that lists all of the songs too... but that is not shown in the video, because it also shows how the dialogue system works with too much detail. So that's how I kept track of them.
 

ajayjuneja

Golden Member
Dec 31, 2001
1,260
0
76
Originally posted by: daniel1113
I am very impressed, but I have a few questions:

How well does this work in a driving scenario? For example, combine the music with wind noise, road noise, etc.

Also, how does the system handle odd spellings? For example, you asked the computer to play a song by "Caine." What if the artist's name was "Kaine?" Would the system still find the song?

You can drive about 50 mph with the windows open before you have to close them for the system to still work (this is on non-rainy days; rain adds noise). The system handles odd spellings just fine. It even gets "Gigi D'agostino" correct.
 

RaynorWolfcastle

Diamond Member
Feb 8, 2001
8,968
16
81
IMHO, there should at least be an option for the music volume level to drop when it recognizes that you're starting a command. Like I said, if I'm listening to something loud, I don't want to have to yell over it to finish my command. Great work, BTW :)
 

ajayjuneja

Golden Member
Dec 31, 2001
1,260
0
76
Originally posted by: RaynorWolfcastle
IMHO, there should at least be an option for the music volume level to drop when it recognizes that you're starting a command. Like I said, if I'm listening to something loud, I don't want to have to yell over it to finish my command. Great work, BTW :)

To do that requires more intimate relations with the media player than I have with Winamp. With a commercial system this will definitely be the case!
 

elbosco

Senior member
Jul 17, 2004
907
0
71
That's pretty cool. All it needs now is a copy of Demi Moore's voice and I'd be set.
 

PowerMacG5

Diamond Member
Apr 14, 2002
7,701
0
0
Originally posted by: ajayjuneja
Originally posted by: CheapArse
:thumbsup: That is badass

Thank you :) I quit my day job at BeVocal so I could commercialize this.

Actually I wasn't at BeVocal very long, because I TOLD them I would continue working on this project 4 months before I started working there, they said they were fine with that, then 6 weeks AFTER I started they decided they weren't fine with that, so I said "buh-bye."

It was a sign from god letting me know that I should spend all my time on doing what I really wanted to spend all of my time doing. And poof, all cylinders are a firing now.

System specs to run it:

Pentium Pro 200 MHz, 64 MB RAM
Windows NT/2000/XP/2003, or Linux.
I am debating a mac mini port (would y'all buy it for the mac mini?)
And I am working on an Intel XScale port for Windows CE & Linux.

Very very impressive. A port to Mac shouldn't be terribly hard if the parser is written in a popular language for which cross-platform compilers are available. The only major difference would be the GUI coding, and if you don't feel like learning Objective-C/Cocoa you can find some Mac programmers to do it (like me :p). A lot of people I know are flocking to get Mac Minis to hook up in their cars (then again, this could be because I go to a tech school). But a Mac port would not be Mini-specific; it would work great for all Macs.

I would also be interested in beta testing for you.
 

Kelemvor

Lifer
May 23, 2002
16,928
8
81
Kind of neat but really hard to hear whatever you are saying in the demo. I did find out that my speakers are reversed though. heh heh.
 

ajayjuneja

Golden Member
Dec 31, 2001
1,260
0
76
Originally posted by: FrankyJunior
Kind of neat but really hard to hear whatever you are saying in the demo. I did find out that my speakers are reversed though. heh heh.

I've been noticing that people with crappy speakers (including when I listen to it on my laptop) have problems figuring out what I'm saying.

While on my home system I had no problems hearing everything, except that damn text to speech voice annoys me. I may mix the audio a little differently to attenuate the music some. The problem is, the TTS voice and the music are recorded on the same track.
 

maziwanka

Lifer
Jul 4, 2000
10,415
1
0
as long as the voice is completely different in the commercial release and doesn't keep talking while the music is playing, i'll be happy.

nice job!
 

AmigaMan

Diamond Member
Oct 12, 1999
3,644
1
0
damn that thing is SWEEET!!! You're definitely on to something here and hopefully you'll share your millions with all your ATOT peeps who gave you props during your beginning days... Well hopefully you'll at least make a couple grand off it.

Great job!
 

notfred

Lifer
Feb 12, 2001
38,241
4
0
Does it do anything besides single songs? Can you say "Green Day - American Idiot" and have it play the whole album? What about the fact that the name of the album is also the name of a track on the album? Can you distinguish between the two? How about creating playlists with it? Have you used the "smart playlists" feature of iTunes? Could it do something like that?
 

ajayjuneja

Golden Member
Dec 31, 2001
1,260
0
76
Originally posted by: notfred
Does it do anything besides single songs? Can you say "Green Day - American Idiot" and have it play the whole album? What about the fact that the name of the album is also the name of a track on the album? Can you distinguish between the two? How about creating playlists with it? Have you used the "smart playlists" feature of iTunes? Could it do something like that?



Wow notfred, you really covered all the bases :)

The answers to your questions:

Does it do anything besides single songs?

Yes. If you noticed, I said "Play all my Uri Caine" as one of my queries.

Can you say "Green Day - American Idiot" and have it play the whole album?

Yes, if that information was in the database (I didn't have album as a field for this demo, I can easily add that in).

What about the fact that the name of the album is also the name of a track on the album?

If I made "Album" more important than "Track Name", it would select the album, and it would select the track if "Track Name" were more important than "Album". This is a user preference. If the two fields had the same priority, the system would ask you whether you just wanted to play the song or if you wanted to play the whole album.
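
A rough sketch of the priority rule described above, assuming each spoken phrase yields a list of (field, value) matches and the user preference is a number per field. The `resolve` function and its data layout are hypothetical, written only to illustrate the idea; they are not the dialogue system's actual code.

```python
def resolve(matches, priority):
    """Pick a match by field priority, or return the tied candidates.

    matches: list of (field, value) pairs that the spoken phrase matched.
    priority: dict mapping field name -> number; higher wins (user preference).
    Returns (choice, None) when one field wins outright, or (None, tied)
    when the top-priority fields tie and the system should ask the user.
    """
    if not matches:
        return None, []
    best = max(priority.get(field, 0) for field, _ in matches)
    top = [m for m in matches if priority.get(m[0], 0) == best]
    if len(top) == 1:
        return top[0], None
    return None, top


# "American Idiot" is both an album and a track on that album.
matches = [("album", "American Idiot"), ("track", "American Idiot")]

album_first = resolve(matches, {"album": 2, "track": 1})
track_first = resolve(matches, {"album": 1, "track": 2})
tied = resolve(matches, {"album": 1, "track": 1})
```

With equal priorities the function returns both candidates instead of guessing, which corresponds to the system asking whether you want the song or the whole album.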

How about creating playlists with it? Have you used the "smart playlists" feature of iTunes? Could it do something like that?

Yes, you can use it to create playlists once I finish writing that feature (gimme a few more weeks). As for creating "smart playlists" like iTunes, that is more complicated and would require better hooks into the media player to tap into what you've chosen in the past. I have plans for having the dialogue system more seamlessly choose what you tend to like, but that work is still at a concept stage. But we'll be doing that too :)


Like I said, to do some of the more complicated features, I need to finish my dealmaking first, so I can access more of the media players' functions than I have access to in their public APIs.
 

KingNothing

Diamond Member
Apr 6, 2002
7,141
1
0
Good suggestions in this thread, I would add that when the system is talking, it should automatically reduce the volume of the music that's playing. Not a complete mute, just so that the computer voice is easily distinguishable. I had trouble understanding it sometimes when it would talk while the music was playing.
 

ajayjuneja

Golden Member
Dec 31, 2001
1,260
0
76
Originally posted by: KingNothing
Good suggestions in this thread, I would add that when the system is talking, it should automatically reduce the volume of the music that's playing. Not a complete mute, just so that the computer voice is easily distinguishable. I had trouble understanding it sometimes when it would talk while the music was playing.

That's a limitation I have to put up with for now. The text-to-speech volume and Winamp's volume both go through wave out, so lowering one lowers the other. This won't be the case for a commercial system of course :)

 

Argo

Lifer
Apr 8, 2000
10,045
0
0
You gotta do something about the annoying voice. If I had to listen to it while driving the car I would drive it off the road.
 

WolverineGator

Golden Member
Mar 20, 2001
1,011
0
76
Impressive. I want in on the IPO!

A suggestion:
For multiple copies of the same song (one live the other not, etc.)... When it asks "Which one would you like to hear," I should be able to say "the second one" or "the third one" or whatever instead of saying the entire track name.

Also, is it possible to change the speed of the speech? It's tiringly slow if you're already familiar with the system.
 

ajayjuneja

Golden Member
Dec 31, 2001
1,260
0
76
Originally posted by: WolverineGator
Impressive. I want in on the IPO!

A suggestion:
For multiple copies of the same song (one live the other not, etc.)... When it asks "Which one would you like to hear," I should be able to say "the second one" or "the third one" or whatever instead of saying the entire track name.

Also, is it possible to change the speed of the speech? It's tiringly slow if you're already familiar with the system.

As for your first suggestion, you can already do that :) :) :) :)
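
A minimal sketch of how a reply like "the second one" could be mapped onto the list the system just read out. The `ORDINALS` table, the `pick_by_ordinal` function, and the example titles are all hypothetical and only illustrate the idea; the real system's handling is not shown in this thread.

```python
import re

# Spoken ordinal words mapped to list positions; "last" counts from the end.
ORDINALS = {"first": 0, "second": 1, "third": 2, "fourth": 3, "fifth": 4, "last": -1}


def pick_by_ordinal(utterance, candidates):
    """Return the candidate named by an ordinal word, or None if there is none."""
    for word in re.findall(r"[a-z]+", utterance.lower()):
        if word in ORDINALS:
            index = ORDINALS[word]
            # Reject ordinals beyond the list, e.g. "the third one" of two.
            if -len(candidates) <= index < len(candidates):
                return candidates[index]
            return None
    return None


versions = ["Song A (studio)", "Song A (live)"]
choice = pick_by_ordinal("the second one", versions)
```

Returning `None` for an out-of-range or missing ordinal lets the caller fall back to matching the full track name.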

Change the TTS speed... hmm I could make that a user preference.

BTW, thanks to all of you for suggestions on making it better / polishing it. Everything you say is being taken into consideration as I truly want this to be the best system possible.
 

nsafreak

Diamond Member
Oct 16, 2001
7,093
3
81
Ajay,

I know you said that this software will work with multiple devices like CD players, iPods, etc. Do you know if it will work with Sirius satellite radio receivers? That's the service I subscribe to and I'd really like to see this kind of functionality available for it. For example, it'd be really handy if, when one of the songs in my memory bank comes up and my radio does a stream alert, this software told me about the alert and which song it is for. Do you know if you'll be working with the folks that produce Sirius receivers at all and if this kind of integration will be available/possible?