Publication Type : Conference Proceedings
Publisher : Journal of Intelligent Systems
Source : Journal of Intelligent Systems, De Gruyter (2019)
Url : https://www.scopus.com/inward/record.uri?eid=2-s2.0-85063469828&doi=10.1515%2fjisys-2019-2510&partnerID=40&md5=99f185b0d38d0367536686614ccfd004
Keywords : bidirectional RNN, Computational linguistics, Computer aided language translation, Deep neural networks, Human evaluation, Long short-term memory, LSTM, Machine translations, Natural language processing systems, Network architecture, Parallel corpora
Campus : Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Department : Computer Science, Electronics and Communication
Year : 2019
Abstract : The introduction of deep neural networks into machine translation research has improved on conventional machine translation systems in several ways, most notably in translation quality. The ability of deep neural networks to learn a sensible representation of words is one of the major reasons for this improvement. Although machine translation using deep neural architectures shows state-of-the-art results in translating European languages, we cannot directly apply these algorithms to Indian languages, mainly for two reasons: good corpora are unavailable, and Indian languages are morphologically rich. In this paper, we propose a neural machine translation (NMT) system for four language pairs: English-Malayalam, English-Hindi, English-Tamil, and English-Punjabi. We also collected sentences from different sources and cleaned them to build four parallel corpora, one for each language pair, and then used them to model the translation system. The encoder network in the NMT architecture was designed with long short-term memory (LSTM) networks and bi-directional recurrent neural networks (Bi-RNN). The obtained models were evaluated both automatically and manually. For automatic evaluation, the bilingual evaluation understudy (BLEU) score was used; for manual evaluation, three metrics were used: adequacy, fluency, and overall ranking. Analysis of the results showed that the presence of lengthy sentences in the English-Malayalam and English-Hindi corpora affected the translation. An attention mechanism was employed to address the problem of translating lengthy sentences (sentences containing more than 50 words), and the system was able to perceive long-term contexts in the sentences. © 2019 Walter de Gruyter GmbH, Berlin/Boston.
Cite this Research Publication : B. Premjith, Anand Kumar M., and Dr. Soman K. P., “Neural Machine Translation System for English to Indian Language Translation Using MTIL Parallel Corpus: Special Issue on Natural Language Processing”, Journal of Intelligent Systems, 2019.