Multi Speaker Detection and Tracking using Audio and Video Sensors with Gesture Analysis
Video conferencing plays an important role in many corporate and educational settings. E-learning uses video conferencing for interaction between students and tutors in different locations: the tutor is physically present in a real classroom, and students view the tutor through video in a virtual classroom. Wireless microphones and video sensors facilitate interaction between students and tutors, but this can become inefficient when multiple speakers are active. In that case it helps to identify, using audio and video sensors, the student who asks a question first, in either the virtual or the real classroom. To make an E-learning classroom resemble a real classroom, we propose a system that uses the professor's gestures to decide who may ask a question. This is particularly useful when several speakers are present in an E-learning classroom. The student who asks a question first is located using audio and video sensors in either the virtual or the real classroom; a raised hand together with the student's voice is used for localization. This method gives both the professor and the student the experience of being in a real classroom.
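The fusion step described above can be sketched in a few lines. The code below is an illustrative sketch only, not the system's actual implementation: the `StudentEvent` record, the `first_questioner` function, and the `max_gap` parameter are all assumptions. It presumes an upstream video pipeline has already timestamped each hand-raise gesture and an upstream audio pipeline has timestamped each voice onset; the sketch then picks the student whose hand-raise and voice cues agree and occur earliest.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass
class StudentEvent:
    """Cue timestamps for one student, in seconds on a shared clock.

    hand_raise_time comes from the video/gesture pipeline;
    voice_onset_time comes from the audio localization pipeline.
    Either may be None if the corresponding cue was not detected.
    """
    student_id: str
    hand_raise_time: Optional[float]
    voice_onset_time: Optional[float]

def first_questioner(events: Iterable[StudentEvent],
                     max_gap: float = 2.0) -> Optional[str]:
    """Return the id of the student who first both raised a hand and spoke.

    A student counts as a questioner only when both cues are present and
    lie within `max_gap` seconds of each other (so a stray hand-raise or
    background voice alone does not trigger). Candidates are ranked by
    the earlier of their two cue times; None means no valid questioner.
    """
    candidates = []
    for e in events:
        if e.hand_raise_time is None or e.voice_onset_time is None:
            continue  # need both modalities to confirm a question
        if abs(e.hand_raise_time - e.voice_onset_time) <= max_gap:
            candidates.append((min(e.hand_raise_time, e.voice_onset_time),
                               e.student_id))
    if not candidates:
        return None
    return min(candidates)[1]
```

In a real deployment the timestamps would come from continuous streams (e.g. gesture detection on video frames and sound-source localization on microphone arrays) rather than a precomputed list, but the selection rule is the same.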
Figure: Localization of a speaker in a classroom using audio and video sensors; the localized speaker is projected onto the screen at the other end.