Speaker Localization for Teleconferencing

DrStu · Aug 3, 2010

Hi All,

I'm starting on this topic for my thesis and would like to get your views, opinions, and suggestions on it.

Objective
Basically, what I aim to achieve is to locate the position of a speaker within a room (from his speech).

Approach
From what I've read so far, my proposed approach is to use microphone arrays. Since a simple array can estimate the direction of a speech source, I was thinking that if I placed multiple microphone arrays along the walls of the room, then by triangulation (finding the intersection of the beams from each array) I would get the location of the speaker.
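To make the triangulation idea concrete, here is a minimal Python sketch. It assumes a flat 2-D room, exactly two arrays at known positions, and that each array reports a single bearing angle; the function name `triangulate` and all coordinates are hypothetical, chosen only for illustration.

```python
import math

def triangulate(p1, theta1, p2, theta2):
    """Intersect two bearing lines in 2-D.

    p1, p2: (x, y) positions of the two arrays in metres (assumed known).
    theta1, theta2: bearings in radians, measured from the +x axis.
    Returns the (x, y) intersection, or None if the bearings are parallel.
    """
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    # Solve p1 + t*d1 = p2 + s*d2 for t using the 2-D cross product.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# Hypothetical layout: arrays 4 m apart on one wall, both hearing the talker.
position = triangulate((0.0, 0.0), math.radians(45), (4.0, 0.0), math.radians(135))
```

With more than two arrays the bearing lines will rarely meet in a single point, so a least-squares fit over all of them would probably be needed instead of a single intersection.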

Improvement
If that is all done, I would then implement some kind of "sensor localization", so that users can arbitrarily place the microphone arrays around the room and the computer automatically works out the location of each array.

This seems to be a tough DSP topic, and my DSP knowledge is pretty limited (only one basic DSP course so far). I am not aiming to develop a new localization algorithm, just to implement an existing one (one that would work well under my conditions).

Any ideas/feedback on this? Thanks a great bunch~

crutschow · Aug 3, 2010

It would seem that you can locate the speaker simply by the intensity of the speech in each microphone. He would be closest to the mic with the highest intensity.
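As a rough illustration of this intensity idea, the Python sketch below picks the channel with the highest RMS level. It assumes each mic's samples are already available as a list of floats with matched gains; the function names are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square level of one channel."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def loudest_mic(channels):
    """Return the index of the channel with the highest RMS level."""
    levels = [rms(ch) for ch in channels]
    return max(range(len(levels)), key=levels.__getitem__)

# Three made-up channels; the second one is the loudest.
nearest = loudest_mic([[0.1, -0.1], [0.5, -0.4], [0.2, 0.2]])
```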

audioguru · Aug 3, 2010

Room reverberation stops localization by sound intensity from working properly, even at distances that are not far.
You would need a mic for every person, mounted on the table directly in front of them, not on the walls or the ceiling.

DrStu · Aug 4, 2010

crutschow said:
It would seem that you can locate the speaker simply by the intensity of the speech in each microphone. He would be closest to the mic with the highest intensity.

Thanks

But that would only tell me which mic is closest to the speaker; I am aiming for the position of the speaker in the room.

DrStu · Aug 4, 2010

audioguru said:
Room reverberation stops localization by sound intensity from working properly, even at distances that are not far.
You would need a mic for every person, mounted on the table directly in front of them, not on the walls or the ceiling.

That's where the microphone array comes in. Supposedly (from what I've read), with multiple microphones the speech signal arrives at each microphone at a different time. Various algorithms (MUSIC, cross-correlation, etc.) can use those time differences to determine the direction of arrival of the speech signal.
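A minimal Python sketch of the cross-correlation variant, for a single pair of mics, might look like this. It assumes a far-field source, a known mic spacing, and an echo-free signal; the function names, the 343 m/s speed of sound, and the example numbers are illustrative assumptions. It is a brute-force time-domain search, not MUSIC.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature

def estimate_delay(a, b, max_lag):
    """Return the lag in samples (within +/- max_lag) at which b best matches a.

    A positive result means the sound reached mic b later than mic a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(a)):
            j = i + lag
            if 0 <= j < len(b):
                score += a[i] * b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def direction_of_arrival(lag, sample_rate, mic_spacing):
    """Convert a lag to a far-field arrival angle in radians (0 = broadside)."""
    tau = lag / sample_rate
    # Clamp to the physically possible range before taking the arcsine.
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * tau / mic_spacing))
    return math.asin(s)

# Hypothetical pair: 10 cm spacing, 16 kHz sampling, 3-sample delay.
angle = direction_of_arrival(3, 16000, 0.1)
```

At 16 kHz and 10 cm spacing the delay can only span about plus or minus 4 samples, so the angular resolution of a sketch like this is coarse; a higher sample rate or interpolation around the correlation peak would be the usual remedies.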

Is it feasible?

PS: Just for reference, Voice Tracker Array Microphone - Acoustic Magic. This product should have the "direction detection" part working well.

audioguru · Aug 4, 2010

At a fairly short distance from a microphone, the intensity of the echoes equals or exceeds the intensity at the mic that is the closest to the person speaking.

A circuit that analyses the arrival time of sounds and then guesses that the "first" arrival comes from the direction of the person speaking is also messed up by room echoes, because it does not know that the person has finished speaking and that the "new" arrivals are echoes of previous speech.

Did you hear the demo of the VoiceTracker? Can you understand what the guy is saying? His voice and the room echoes are very chopped and "digitized". The teleconference systems I made sounded much, much better.

Teleconference systems use a digital echo canceller to reduce the loudness of long distance echoes. They make a "model" of the room then cancel received sounds from being transmitted back to the originating end. They mess up voices when they are working at their limit.
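The "model of the room" idea can be sketched with a normalized LMS (NLMS) adaptive filter, which is one common way to build such a canceller; whether any particular product uses it is an assumption here. The Python below is a toy with a made-up three-tap echo path and no near-end speech, and all names and parameter values are hypothetical.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of far_end from mic.

    far_end: samples played through the loudspeaker.
    mic: samples picked up by the microphone (same length).
    Returns (residual, weights), where weights approximate the echo path.
    """
    w = [0.0] * taps        # current estimate of the room's impulse response
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(wi * hi for wi, hi in zip(w, history))
        e = d - echo_estimate  # what would be sent back to the far end
        norm = sum(h * h for h in history) + eps
        w = [wi + mu * e * hi / norm for wi, hi in zip(w, history)]
        residual.append(e)
    return residual, w

def demo(num_samples=2000, seed=0):
    """Simulate a short, made-up echo path and let the filter learn it."""
    rng = random.Random(seed)
    room = [0.6, 0.0, 0.3]  # hypothetical echo path, not measured data
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    mic = [
        sum(room[k] * far_end[n - k] for k in range(len(room)) if n - k >= 0)
        for n in range(num_samples)
    ]
    return nlms_echo_cancel(far_end, mic)
```

A real room response is thousands of taps long and people at both ends talk at once, which is where a canceller like this starts working "at its limit".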

DrStu · Aug 4, 2010

Thanks for your reply audioguru. Oh so you've actually made a teleconference system! Must be experienced in this~

From what you said, the main problem would be the reverberation/echoes, which would mess up the detection of direction. Would the use of multiple microphone arrays (NB: each array consists of multiple mics) situated around the room help improve this?

Also stumbled across this https://www.youtube.com/watch?v=w2u10xUDzKY&feature=related which is what I aim to achieve (of course, a less sophisticated model of that).

audioguru · Aug 4, 2010

The "speaking person direction detector" worked extremely well. But the sound quality was horrible because of the room reverberation (echoes), since the microphone was too far away, and because of the very strong accent of the person speaking.

Company and bank head offices spent a lot of money on teleconference systems that did not work due to room reverberation.
I fixed them.
One had a Shure automatic microphone mixer that was supposed to turn on only the mic that was nearest the person speaking. It did not work properly. Frequently the mic on the other side of the table turned on. Between words it would switch off the mic near the person speaking and turn on a mic where somebody was making sounds. It only detected vowels so the first spoken word of each paragraph was not intelligible.

DrStu · Aug 4, 2010

Oh I see,

Guess my thread title was misleading, but it's just because my thesis is part of a larger project for improving teleconferencing. Basically, I am just trying to achieve the speaker localization part, and nothing to do with the quality of the speech acquired etc. (those are being worked on separately).

So for my part, would the localization work despite the nasty room reverberation, or is there anything I should look out for, any good algorithms to suggest, etc.?

audioguru · Aug 4, 2010

The "speaking person direction detector" works perfectly. Make one if you can.
It is too bad that the designers do not know how to make the speech sound good.

DrStu · Aug 6, 2010

Sure, I will try it out. I appreciate your feedback, audioguru.

Also, although it may seem like a dumb question, what are the methods of acquiring input from multiple mics (say, 8 mics) for further processing? I'm familiar with using the sound card to acquire signals from a single mic, but how do I do it with multiple mics simultaneously?

audioguru · Aug 6, 2010

Multiple mics pick up multiple background noises and multiple echoes.
It is best to switch on only one mic that is close to and is in front of the person speaking.

DrStu · Aug 6, 2010

I understand that.

But say I want to develop a microphone array for testing (which naturally would have more than one mic), collect the signals from the mics, and process them on a PC using MATLAB/LabVIEW. How would I connect (and be able to read) this many mics from the PC?

audioguru · Aug 6, 2010

Some personal computers have only two mic inputs for stereo. Some have only one mic input.

DrStu · Aug 6, 2010

Yeah, that's why I was wondering about this. Is there a way to get around it? Would a data acquisition card or something like that work?
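On the software side, whatever hardware ends up capturing the mics, the samples usually arrive interleaved (mic 1, mic 2, ..., mic N, mic 1, ...). The Python sketch below shows one way to split them apart, assuming the capture was saved as a plain 16-bit PCM WAV file with one channel per mic; the function names are hypothetical and the hardware question itself is left open.

```python
import struct
import wave

def deinterleave_pcm16(frames, num_channels):
    """Split interleaved little-endian 16-bit PCM bytes into per-channel lists."""
    count = len(frames) // 2
    samples = struct.unpack("<%dh" % count, frames[:count * 2])
    return [list(samples[ch::num_channels]) for ch in range(num_channels)]

def read_multichannel_wav(path):
    """Return (sample_rate, channels) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wf.readframes(wf.getnframes())
        return wf.getframerate(), deinterleave_pcm16(frames, wf.getnchannels())
```

Each returned list then holds one mic's signal, sample-aligned with the others, which is what any time-difference processing would need.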
