Generating Audio from Lip Movements Visual Input: A Survey

Publication Type : Book Chapter

Publisher : Springer, Singapore

Source : Springer, Singapore

Campus : Amritapuri

School : School of Engineering

Center : Research & Projects

Department : Computer Science

Verified : Yes

Year : 2021

Abstract : Generating audio from a visual scene is an extremely challenging yet useful task, with applications in remote surveillance, speech comprehension for hearing-impaired people, and silent speech interfaces (SSI). Owing to recent advances in deep neural network techniques, there has been considerable research effort toward speech reconstruction from silent videos or visual speech. In this survey paper, we review several recent papers in this area and compare them in terms of their model architectures and the accuracy achieved.
