Welcome to our site!

Electro Tech is an online community (with over 170,000 members) who enjoy talking about and building electronic circuits, projects and gadgets. To participate you need to register. Registration is free. Click here to register now.

Speaker Localization for Teleconferencing

Status
Not open for further replies.

DrStu

New Member
Hi All,

I'm starting on this topic for my thesis and would like to get your views, opinions and suggestions on it.

Objective
Basically, what I aim to achieve is to locate the position of a speaker within a room (from his speech).

Approach
From what I've read so far, my proposed approach is to use microphone arrays. Since a simple array can estimate the direction of a speech source, I was thinking that if I placed multiple microphone arrays along the walls of the room, then by triangulation (finding the intersection of the beams from the microphone arrays) I would get the location of the speaker.
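The triangulation step described above can be sketched as follows. This is only a minimal illustration, assuming a 2D room, known array positions and one bearing angle per array (measured from the x-axis); the function name `triangulate` and the input format are made up for this example. With noisy bearings the beams will not meet in a single point, so the sketch returns the point with the smallest total squared perpendicular distance to all beams.

```python
import math

def triangulate(arrays):
    """Least-squares intersection of bearing lines in 2D.

    arrays: list of ((x, y), bearing_rad), one entry per microphone array.
    Returns the (x, y) point closest to all bearing lines.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), theta in arrays:
        dx, dy = math.cos(theta), math.sin(theta)
        # I - d*d^T keeps only the component perpendicular to the beam.
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * px + m12 * py
        b2 += m12 * px + m22 * py
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bearings are parallel; no unique intersection")
    # Solve the 2x2 normal equations by Cramer's rule.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
```

Note that the sketch treats each beam as an infinite line rather than a ray, so it does not check that the estimated point lies in front of each array.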

Improvement
If that is all done, I would then implement some kind of "sensor localization", so that users can place the microphone arrays arbitrarily around the room and the computer can automatically work out the location of each array.


This seems to be a tough DSP topic, and my DSP knowledge is pretty limited (only one basic DSP course so far). I am not aiming to develop a new localization algorithm, just to implement an existing one that would work well under my conditions.

Any ideas/feedback on this? Thanks a great bunch~ :)
 
It would seem that you can locate the speaker simply by the intensity of the speech in each microphone. He would be closest to the mic with the highest intensity.
 
Room reverberation stops localization by sound intensity from working properly. The distance is not far.
You would need a mic for every person mounted on the table directly in front of him, not on the walls nor on the ceiling.
 
It would seem that you can locate the speaker simply by the intensity of the speech in each microphone. He would be closest to the mic with the highest intensity.

Thanks :) But that would only tell me which mic is closest to the speaker; I am aiming for the position of the speaker in the room.
 
Room reverberation stops localization by sound intensity from working properly. The distance is not far.
You would need a mic for every person mounted on the table directly in front of him, not on the walls nor on the ceiling.

That's where the microphone array comes in. Supposedly (from what I've read), with multiple microphones the speech signal arrives at each microphone at a different time. Various algorithms (MUSIC, cross-correlation, etc.) can exploit this to determine the direction of arrival of the speech signal.
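To make the arrival-time idea concrete, here is a minimal sketch of the plain cross-correlation approach for one pair of mics (not MUSIC). The function names, the far-field assumption and the speed-of-sound constant are assumptions for this example. It finds the lag at which the two signals line up best, then converts that delay to an angle from broadside using sin(angle) = c * delay / spacing. It does nothing about reverberation, which is the concern raised in this thread.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature

def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag in samples by which sig_b trails sig_a.

    Brute-force cross-correlation over lags in [-max_lag, max_lag].
    Both signals are assumed to have the same length.
    """
    n = len(sig_a)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Only sum over indices where both sig_a[i] and sig_b[i + lag] exist.
        start, stop = max(0, -lag), min(n, n - lag)
        score = sum(sig_a[i] * sig_b[i + lag] for i in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def bearing_from_delay(lag_samples, sample_rate, mic_spacing):
    """Far-field angle (radians from broadside) for a two-mic pair."""
    tau = lag_samples / sample_rate
    # Clamp so measurement noise cannot push asin() out of its domain.
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * tau / mic_spacing))
    return math.asin(s)
```

For example, `bearing_from_delay(estimate_delay(a, b, 20), 48000, 0.2)` would give the bearing for two equal-length recordings `a` and `b` sampled at 48 kHz from mics 0.2 m apart.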

Is it feasible?

PS: Just for reference: Voice Tracker Array Microphone - Acoustic Magic. This product should have the "direction detection" part working well.
 
At a fairly short distance from a microphone, the intensity of the echoes equals or exceeds the intensity at the mic that is closest to the person speaking.

A circuit that analyses the arrival time of sounds and then guesses that the "first" arrival comes from the direction of the person speaking is also messed up by room echoes, because it does not know that the person has finished speaking and that the "new" arrivals are echoes of earlier speech.

Did you hear the demo of the VoiceTracker? Can you understand what the guy is saying? His voice and the room echoes are very chopped and "digitized". The teleconference systems I made sounded much, much better.

Teleconference systems use a digital echo canceller to reduce the loudness of long distance echoes. They make a "model" of the room, then use it to cancel received sounds so they are not transmitted back to the originating end. They mess up voices when they are working at their limit.
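One common way to build such a room "model" is an adaptive FIR filter updated with the normalized LMS (NLMS) rule. The post does not say which algorithm those systems used, so the sketch below is an illustration under that assumption, with made-up names and parameter values. The filter learns how the received (far-end) signal shows up in the local mic, and subtracts that estimate before the mic signal is sent back.

```python
def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the far-end echo from the mic signal.

    far_end: samples received from the remote site (played by the loudspeaker).
    mic:     samples picked up by the local microphone (echo plus local talk).
    Returns (residual, weights), where residual is what would be sent back.
    """
    w = [0.0] * taps    # current estimate of the room's impulse response
    buf = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        echo_estimate = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - echo_estimate
        # Normalizing by input power keeps the step size stable across levels.
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return residual, w
```

A real room response is far longer than 8 taps, and a practical canceller also has to stop adapting while the local person is talking; both are left out of this sketch.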
 
Thanks for your reply audioguru. Oh so you've actually made a teleconference system! Must be experienced in this~

From what you said, the main problem would be the reverberation/echoes, which would mess up the detection of direction. Would the use of multiple microphone arrays (NB: each array consists of multiple mics) situated around the room help improve this?

Also stumbled across this: https://www.youtube.com/watch?v=w2u10xUDzKY&feature=related which is what I aim to achieve (of course, a less sophisticated model of that :)
 
The "speaking person direction detector" worked extremely well. But the sound quality was horrible because of the room reverberation (echoes), since the microphone was too far away, and because of the very strong accent of the person speaking.

Company and bank head offices spent a lot of money on teleconference systems that did not work due to room reverberation.
I fixed them.
One had a Shure automatic microphone mixer that was supposed to turn on only the mic that was nearest the person speaking. It did not work properly. Frequently the mic on the other side of the table turned on. Between words it would switch off the mic near the person speaking and turn on a mic where somebody was making sounds. It only detected vowels so the first spoken word of each paragraph was not intelligible.
 
Oh I see,

Guess my thread title was misleading, but that's just because my thesis is part of a larger project for improving teleconferencing. Basically I am just trying to achieve the speaker localization part, and nothing to do with the quality of the speech acquired etc. (those are being worked on separately).

So for my part, would the localization work given the effects of the nasty room reverberation? Is there anything I should look out for, or any good algorithms you could suggest?
 
The "speaking person direction detector" works perfectly. Make one if you can.
It is too bad that the designers do not know how to make the speech sound good.
 
Sure, I will try it out. I appreciate your feedback, audioguru.

Also, although it may seem like a dumb question, what are the methods of acquiring the input from multiple mics (say, 8 mics) for further processing? I'm familiar with using the sound card to acquire signals from a single mic, but how do I do it with multiple mics simultaneously?
 
Multiple mics pick up multiple background noises and multiple echoes.
It is best to switch on only one mic that is close to and is in front of the person speaking.
 
I understand that. :)

But say I want to develop a microphone array for testing (which naturally would have more than one mic), collect the signals from the mics and process them with MATLAB/LabVIEW on a PC. How would I connect (and be able to read) this many mics from the PC?
 
Some personal computers have only two mic inputs for stereo. Some have only one mic input.
 
Yeah, that's why I was wondering about this. Is there a way to get around it? Would a data acquisition card or something like that work?
 