Publication Type:

Book Chapter

Source:

Advances in Natural Language Processing, Springer Berlin Heidelberg, pp. 368–379 (2006)

URL:

http://link.springer.com/chapter/10.1007/11816508_38

Abstract:

This paper presents a wide range of statistical word alignment experiments that incorporate morphosyntactic information. By transforming the parallel corpus using POS tags, lemmas or stems, we explore which linguistic information helps reduce the alignment error rate. Alignments are evaluated against a human word alignment reference, with the aim of an improved machine translation training scheme that eventually leads to better SMT performance. Experiments are carried out on a Spanish–English European Parliament Proceedings parallel corpus, in both a large and a small data track. As expected, the improvements from introducing morphosyntactic information are larger when data is scarce, but a significant improvement is also achieved in the large data task, which indicates that certain linguistic knowledge remains relevant even when large amounts of data are available.
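
The abstract does not spell out how alignment error rate is computed. As a minimal sketch, assuming the commonly used definition with sure (S) and possible (P) reference links, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), the evaluation could look like the following. The function name, the link representation as (source index, target index) pairs, and the toy data are illustrative assumptions, not taken from the paper.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """Compute AER for one set of alignment links.

    Each link is a (source_index, target_index) pair.
    Assumption: sure links also count as possible links, so S is a subset of P.
    """
    a = set(hypothesis)
    s = set(sure)
    p = set(possible) | s  # enforce S ⊆ P

    if not a and not s:
        # Nothing predicted and nothing required: treat as zero error.
        return 0.0

    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Toy example (made-up links, for illustration only).
reference_sure = {(0, 0), (1, 1)}
reference_possible = {(2, 2)}
system_links = {(0, 0), (1, 1), (2, 3)}

aer = alignment_error_rate(system_links, reference_sure, reference_possible)
print(f"AER = {aer:.3f}")
```

A lower value means closer agreement with the human reference. A system that outputs exactly the sure links scores 0.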

Cite this Research Publication

De Gispert, A., Gupta, D., Popović, M., Lambert, P., Mariño, J. B., Federico, M., Ney, H., and Banchs, R., “Improving statistical word alignments with morpho-syntactic transformations”, in Advances in Natural Language Processing, Springer Berlin Heidelberg, 2006, pp. 368–379.
