Publication Type:

Journal Article

Source:

Research Journal of Applied Sciences, Engineering and Technology, Volume 7, Number 6, p.1001-1012 (2014)

URL:

https://www.scopus.com/inward/record.url?eid=2-s2.0-84893379360&partnerID=40&md5=480f2b83253fa10d8268b48bee1b33e7

Abstract:

Creating parallel corpora and aligning them efficiently at the sentence level remains a challenge for structurally distinct languages with a relatively low degree of correlation. This work emphasizes the importance of domain-biased parallel data collection and presents a structured methodology for obtaining such data for the English-Hindi language pair. Because the two languages are structurally distinct, sentence-level alignment of the collected data has also been undertaken. In essence, the study has two aspects: collecting parallel corpora from different domains and aligning the extracted parallel corpus at the sentence level. The proposal is intended to help researchers in Natural Language Processing improve the accuracy, precision and robustness of their own work, which is possible only when abundant parallel corpora are available, and more so when those corpora are organized by domain and aligned at least at the sentence level. The language pair considered for the development of the algorithm is English-Hindi. Because the algorithm is generic, the approach can be scaled to other similarly structured language pairs. © Maxwell Scientific Organization, 2014.

Notes:

Cited by 0

Cite this Research Publication

D. Gupta, Raveendran, V., and Yadav, R. K., “Domain biased bilingual parallel data extraction and its sentence level alignment for English-Hindi pair”, Research Journal of Applied Sciences, Engineering and Technology, vol. 7, no. 6, pp. 1001-1012, 2014.
