
Hierarchical Language Modeling for Dense Video Captioning

Publication Type : Conference Paper

Publisher : Springer Nature Singapore

Source : Lecture Notes in Networks and Systems

Url : https://doi.org/10.1007/978-981-16-6723-7_32

Campus : Coimbatore

School : School of Computing

Department : Computer Science and Engineering

Year : 2022

Abstract :

The objective of the video description, or dense video captioning, task is to generate a description of the video content. The task consists of identifying and describing distinct temporal segments called events. Existing methods utilize relative context to obtain better sentences. In this paper, we propose a hierarchical captioning model that follows an encoder-decoder scheme and consists of two LSTMs for sentence generation. The visual and language information is encoded as context using a bi-directional alteration of the single-stream temporal action proposal network, and this context is utilized in the next stage to produce coherent and contextually aware sentences. The proposed system is tested on the ActivityNet captioning dataset and performs relatively better than other existing approaches.

Cite this Research Publication : Jaivik Dave, S. Padmavathi, Hierarchical Language Modeling for Dense Video Captioning, Lecture Notes in Networks and Systems, Springer Nature Singapore, 2022, https://doi.org/10.1007/978-981-16-6723-7_32
