Just a few tidbits about what I do (demo video inside) - G35Driver

Feb 7, 2005 | 03:43 AM

ajayjuneja

From: Mountain View, CA

Hey all NorCal G/Z people,

So I told some of you at dinner yesterday evening about what I do -- natural language interfaces for car navigation systems and media players (those will be the first two products; there will be others later on).

Video demo (Quicktime)

Video demo (Windows Media Player)

-----------------
The company (Speak With Me -- www.speakwithme.com) is based on research I did with a grad student at CMU -- I've worked on this stuff since 2001. It's a semantic parser, not a speech recognizer. We do use a speech recognizer, but fundamentally we are the parser that extracts the meaning of your phrase. We can adapt to many speech recognition errors, which also helps our system respond intelligently when it doesn't understand something.
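
To make "adapting to recognition errors" a bit more concrete, here is a toy sketch. It is NOT the actual Speak With Me parser. It swaps in plain fuzzy string matching from Python's standard `difflib` module as a stand-in, and the names `match_title` and `KNOWN_TITLES` are made up for illustration. The idea it shows: a layer sitting after the recognizer can map a slightly misheard phrase onto something it knows, and ask again instead of guessing when nothing is close.

```python
import difflib

# Hypothetical catalogue; a real system would load this from the media library.
KNOWN_TITLES = ["Hey Jude", "Yesterday", "Let It Be"]

def match_title(heard, known_titles, cutoff=0.6):
    """Map recognizer output onto the closest known title, or None.

    Assumption: fuzzy string similarity stands in for real semantic parsing.
    """
    # Compare case-insensitively, but return the title as stored.
    lowered = {title.lower(): title for title in known_titles}
    close = difflib.get_close_matches(heard.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[close[0]] if close else None

if __name__ == "__main__":
    for heard in ["hey dude", "xyzzy"]:
        title = match_title(heard, KNOWN_TITLES)
        # None means nothing was close enough, so prompt instead of guessing.
        print(heard, "->", title if title else "Sorry, which song did you mean?")
```

Returning `None` rather than the least-bad guess is what lets the dialogue side respond sensibly when it doesn't understand.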

----------------------
Features you'll see in the video if you look closely:

1. Resolving confusion. A couple of times I ask for a song by the wrong artist, so the system prompts me with the song I asked for PLUS all the songs by the artist I named. There is another example where it prompts me because I have two songs with the same title by different artists (yes, I know Roger Waters is ex-Pink Floyd, but that one is a live version by him).

2. Dealing with lots of noise. Some parts are really noisy, like when I ask for the Beatles song. I do have to repeat myself once, but the system doesn't get a single utterance wrong! This is on a database of over 1000 songs. I can't stand that text-to-speech voice for too long either; thankfully we can tell it to shut up. There WILL be better text-to-speech voices in the commercial product.

3. The system can tutor you on how to use it when it launches. A "dialogue" can also be used on launch to set up user preferences.
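
The "resolving confusion" behavior in item 1 can be sketched roughly like this. It is a guess at the logic, not the real implementation: `Song`, `LIBRARY`, and `resolve_request` are hypothetical names, and the tiny library is just sample data. When title and artist agree, play the song; when they don't, or when a title is ambiguous, hand back a list of candidates to prompt with.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Song:
    title: str
    artist: str

# Hypothetical sample library for illustration only.
LIBRARY = [
    Song("Wish You Were Here", "Pink Floyd"),
    Song("Wish You Were Here", "Roger Waters"),
    Song("Yesterday", "The Beatles"),
    Song("Money", "Pink Floyd"),
]

def resolve_request(library, title, artist=None):
    """Return ("play", song) or ("prompt", candidate_songs)."""
    title_matches = [s for s in library if s.title.lower() == title.lower()]
    if artist is None:
        # Title only: play if unambiguous, otherwise ask which one.
        if len(title_matches) == 1:
            return ("play", title_matches[0])
        return ("prompt", title_matches)
    exact = [s for s in title_matches if s.artist.lower() == artist.lower()]
    if len(exact) == 1:
        return ("play", exact[0])
    # Title and artist disagree: offer the song that was asked for
    # plus everything by the artist that was named.
    by_artist = [s for s in library if s.artist.lower() == artist.lower()]
    extras = [s for s in by_artist if s not in title_matches]
    return ("prompt", title_matches + extras)
```

For example, `resolve_request(LIBRARY, "Yesterday", "Pink Floyd")` yields a prompt listing "Yesterday" followed by the two Pink Floyd songs, while `resolve_request(LIBRARY, "Wish You Were Here")` prompts with both versions.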

------------------
Other features we have now, but not shown in this video:

1. Nesting of queries. If I say "Play foxtrot" and the response comes back with a lot of results, I can then say "Frank Sinatra" and it will narrow the query to "foxtrots by Frank Sinatra."

2. Backtracking. You can say "scratch that" or "I didn't mean that..." or other phrases of that type to undo an action. Backtracking isn't included in music selection because the task is so simple (compared to car navigation).
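
Both ideas above (narrowing a query and undoing the last step) boil down to remembering earlier turns. A minimal sketch, assuming the state is a stack of filters: `QuerySession`, `refine`, `undo`, and the `TRACKS` placeholder data are all hypothetical names, not anything from the product. It uses songs only to keep the toy small, even though backtracking in the real system applies to navigation rather than music selection.

```python
# Placeholder data; titles and the second artist are invented for illustration.
TRACKS = [
    {"title": "Track A", "artist": "Frank Sinatra", "genre": "foxtrot"},
    {"title": "Track B", "artist": "Artist X", "genre": "foxtrot"},
    {"title": "Track C", "artist": "Frank Sinatra", "genre": "ballad"},
]

class QuerySession:
    def __init__(self, library):
        self.library = library
        # Stack of filter dicts; the last entry is the current query.
        self.history = [{}]

    def refine(self, **filters):
        """Nest a new constraint inside the current query."""
        self.history.append({**self.history[-1], **filters})
        return self.results()

    def undo(self):
        """'Scratch that': drop the most recent constraint, if there is one."""
        if len(self.history) > 1:
            self.history.pop()
        return self.results()

    def results(self):
        current = self.history[-1]
        return [t for t in self.library
                if all(t.get(key) == value for key, value in current.items())]

if __name__ == "__main__":
    session = QuerySession(TRACKS)
    session.refine(genre="foxtrot")                 # "Play foxtrot"
    print(session.refine(artist="Frank Sinatra"))   # "Frank Sinatra"
    print(session.undo())                           # "scratch that"
```

Keeping each turn as its own stack entry means an undo never has to work out which constraint came from which utterance.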
-----------------

How does it work? Lots of really complex semantic parsing to determine your sentence structure, and it keeps track of what you said before, too.
------------------

Cliff Notes
Go download the video and see what is coming to your car stereo in 2007.

P.S. If there is a car stereo shop in Cali that would like to sponsor my car so I can afford to put this into my own car faster... let me know -- I will be attending the shows in NorCal and some in SoCal too.

Last edited by ajayjuneja; May 13, 2007 at 03:19 AM.