
Recognising the English Language using Context Free Grammar with PyFormlang

Publication Type : Conference Paper

Publisher : IEEE

Source : 2022 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT)

Url : https://doi.org/10.1109/conecct55679.2022.9865855

Campus : Bengaluru

School : School of Computing

Year : 2022

Abstract : Natural language recognition is a sub-field of Natural Language Processing (NLP), a popular research area at the intersection of linguistics, computer science, artificial intelligence and machine learning. The purpose of NLP is to design a computer that can "understand" the contents of documents, including the contextual nuances of the language. The first step, however, is to pre-process and parse a given statement to check whether it is a legitimate sentence of the language. This is where automata theory comes into the picture. Using the Python PyFormlang and nltk libraries, we develop an English language recognizer based on context-free grammar (CFG) representations of the English language, with parts-of-speech (POS) tags making up the constituents of the CFG. Syntactically accurate sentences are accepted if they parse without errors; otherwise they are deemed invalid. We also lay out the set of productions for the English language, which has proven to work well on most sentences, both simple and complex, with an accuracy of 84.90%.
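
The pipeline described in the abstract (map each word to a POS tag, then test whether the tag sequence is derivable from a CFG) can be sketched as follows. This is a minimal, self-contained illustration and not the paper's implementation: it uses neither PyFormlang nor nltk, and swaps in a hand-written CYK recogniser and a tiny hypothetical lexicon instead. The grammar, the tag set, and the names `LEXICON`, `UNARY_RULES`, `BINARY_RULES`, `recognise_tags` and `recognise` are all assumptions made for the example; the paper's actual productions are not reproduced here.

```python
from itertools import product

# Hypothetical toy lexicon: word -> POS tag (stands in for an nltk tagger).
LEXICON = {
    "the": "dt", "a": "dt",
    "dog": "nn", "cat": "nn", "park": "nn",
    "chased": "vbd", "slept": "vbd",
    "in": "in",
}

# Toy grammar in Chomsky normal form (an assumption, not the paper's grammar).
# UNARY_RULES: POS tag (terminal) -> nonterminals that derive it directly.
UNARY_RULES = {
    "dt": {"Det"},
    "nn": {"N"},
    "vbd": {"V", "VP"},  # a lone verb may also act as a full verb phrase
    "in": {"P"},
}
# BINARY_RULES: (B, C) -> nonterminals A such that A -> B C.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
    ("VP", "PP"): {"VP"},
    ("P", "NP"): {"PP"},
}


def recognise_tags(tags, start="S"):
    """CYK recognition: is the POS-tag sequence derivable from `start`?"""
    n = len(tags)
    if n == 0:
        return False
    # table[i][k] holds the nonterminals deriving tags[i : i + k + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tag in enumerate(tags):
        table[i][0] = set(UNARY_RULES.get(tag, ()))
    for span in range(2, n + 1):              # length of the substring
        for i in range(n - span + 1):         # start of the substring
            for split in range(1, span):      # length of the left part
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for pair in product(left, right):
                    table[i][span - 1] |= BINARY_RULES.get(pair, set())
    return start in table[0][n - 1]


def recognise(sentence):
    """Tag each word via the toy lexicon, then run the CYK recogniser."""
    words = sentence.lower().rstrip(".").split()
    try:
        tags = [LEXICON[word] for word in words]
    except KeyError:
        return False  # unknown word: the toy lexicon cannot tag it
    return recognise_tags(tags)


if __name__ == "__main__":
    for s in ["The dog chased a cat.", "Dog the chased."]:
        print(s, "->", "accepted" if recognise(s) else "rejected")
```

The terminals of the grammar are POS tags rather than words, so the production set stays small and independent of vocabulary; only the tagging step needs to know individual words. In a setup closer to the one the abstract describes, the lexicon lookup would be replaced by an nltk tagger and the CYK routine by PyFormlang's CFG membership test.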

Cite this Research Publication : Harshitha Nagarajan, Punitha Vancha, Supriya M, Recognising the English Language using Context Free Grammar with PyFormlang, 2022 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), IEEE, 2022, https://doi.org/10.1109/conecct55679.2022.9865855
