Publication Type : Conference Paper
Publisher : IEEE
Source : Scopus
Url : https://doi.org/10.1109/ICITEICS61368.2024.10625333
Keywords : Audition; Deep neural networks; Caption generation; CSMN; Google+; Google-TTS; Inception v3; LSTM; Neural-networks; Text to speech; Visual impairment; Visually impaired
Campus : Bengaluru
School : School of Computing
Department : Computer Science and Engineering
Year : 2024
Abstract : The integration of deep learning and computer vision has enabled notable innovations in Image-to-Audio technologies. This paradigm aims to provide comprehensible audio descriptions of visual content, offering greater independence and inclusivity for individuals with visual impairments. The approach translates visual data into auditory representations, and the paper examines the technical aspects of Image-to-Audio conversion, emphasizing its potential applications in digital media, education, and everyday environments. The technology is presented as a societal catalyst, empowering individuals with visual impairments and fostering self-sufficiency and autonomy.
Cite this Research Publication : Gogineni Ashrith Sai, Kondareddy Balaji Reddy, Kothuru Gurunadh, Thanakanti Ganesh Madhav, S. Santhanalakshmi, "Visio-Voice: Transforming Images into Sound for the Visually Impaired", IEEE, 2024, https://doi.org/10.1109/ICITEICS61368.2024.10625333