Publication Type:

Conference Paper

Source:

CEUR Workshop Proceedings, CEUR-WS, Volume 1737, p.131-134 (2016)

URL:

https://www.scopus.com/inward/record.uri?eid=2-s2.0-85006093214&partnerID=40&md5=84981f88d3c88d2b137e48a6115fe9c9

Keywords:

Embeddings, Information Retrieval, Mixed-Script, Mother tongues, Multiple languages, Semantics, Social media, Social media platforms, Social networking (online), Subtask, Vector space models, Vector spaces

Abstract:

Information retrieval from social media platforms is one of the major challenges today. Most of the information on these platforms is informal and noisy, which makes the retrieval task more challenging. The task is even more difficult for Twitter because of its per-tweet character limit, which forces users to express themselves in a condensed set of words. In the Indian context, the scenario is somewhat more complicated: users prefer to type in their mother tongue, but the lack of input tools forces them to use Roman script with English embeddings. This combination of multiple languages written in Roman script makes the information retrieval task even harder. Query processing for such CodeMixed content is difficult because a query can be in either language and needs to be matched against documents written in either language. In this work, we addressed this problem using Vector Space Models, which gave significantly better results than the other participants. The Mean Average Precision (MAP) for our system was 0.0315, which was the second-best performance for the subtask.
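As an illustration of the general technique named in the abstract, the following is a minimal sketch of vector space retrieval with TF-IDF weighting and cosine similarity, together with the MAP metric. It is not the authors' system: the tokenizer, the smoothed IDF formula, the function names, and the sample tweets are all assumptions made for this example.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokens. Real code-mixed text would
    # also need handling of Roman-script spelling variants.
    return text.lower().split()

def tfidf(tokens, idf):
    # Weight each known term by raw term frequency times IDF.
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}

def build_index(docs):
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF (one common variant, chosen here for illustration).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return idf, [tfidf(toks, idf) for toks in tokenized]

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, idf, vectors):
    # Return document indices ordered by decreasing cosine similarity.
    q = tfidf(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]

def average_precision(ranking, relevant):
    # Mean of precision values at the rank of each relevant document.
    hits, total = 0, 0.0
    for pos, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevant_sets):
    # MAP is the mean of average precision over all queries.
    aps = [average_precision(r, s) for r, s in zip(rankings, relevant_sets)]
    return sum(aps) / len(aps) if aps else 0.0

# Hypothetical code-mixed tweets, invented for this sketch.
docs = [
    "aaj match dekhne ka plan hai",
    "new phone ka camera bahut accha hai",
    "match ke tickets sold out",
]
idf, vectors = build_index(docs)
ranking = rank("match tickets", idf, vectors)
```

Here `ranking` orders the three sample tweets for the query "match tickets". Passing one ranking per query, along with the set of relevant document indices for each, to `mean_average_precision` yields a score of the same kind as the MAP figure reported above.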

Notes:

Cited by 0; Conference: 2016 Forum for Information Retrieval Evaluation, FIRE 2016; Conference date: 7 December 2016 through 10 December 2016; Conference code: 125007

Cite this Research Publication

S. Singh, Dr. M. Anand Kumar, and Dr. Soman K. P., “CEN@Amrita: Information retrieval on CodeMixed Hindi English tweets using vector space models”, in CEUR Workshop Proceedings, 2016, vol. 1737, pp. 131-134.
