Publication Type : Conference Paper
Publisher : Springer Nature
Source : In 5th International Conference on Micro-Electronics and Telecommunication Engineering, Springer book series on “Lecture Notes in Networks and Systems”.
Url : https://link.springer.com/chapter/10.1007/978-981-16-8721-1_22
Campus : Chennai
School : School of Computing
Year : 2021
Abstract : Topic modeling is a statistical technique for discovering hidden topics or keywords in a collection of documents. It is also regarded as a probabilistic model for learning, analyzing, and discovering topics from a document collection. The most popular techniques for topic modeling are latent semantic analysis (LSA), probabilistic latent semantic analysis (pLSA), latent Dirichlet allocation (LDA), and the recent deep learning-based lda2vec. LDA is most commonly used in extractive multi-document summarization to determine whether an extracted sentence reflects the concept of the input document. In this paper, we explore various multi-document summarization techniques that use LDA as the topic modeling method to improve final summary coverage and reduce redundancy. Finally, we compare LDA and LSA using the Gensim toolkit, and our experimental results show that LDA outperforms LSA when the number of features considered for sentence selection is increased.
Cite this Research Publication : Bharathi Mohan, G. and Prasanna Kumar, R., 2021. "A comprehensive survey on topic modeling in text summarization". In 5th International Conference on Micro-Electronics and Telecommunication Engineering, Springer book series on “Lecture Notes in Networks and Systems”.