Unit I
Computational linguistics: Introduction, syntax, semantics, morphology, collocation and other NLP problems. Word representation: One-hot encoding, Bag-of-Words (BoW), Dictionary: Term Frequency – Inverse Document Frequency (TF-IDF), Embedding: Word2vec, GloVe and fastText.
Unit II
Language model: n-gram. Sequences and sequential data: Part-of-Speech tagging (HMM and CRF), Named Entity Recognition, Dependency parsing. Evaluation metrics for NLP models: Precision, Recall, F-score, ROUGE and BLEU scores, and Visualization.
Unit III
Machine learning and deep learning for NLP, Sequence-to-sequence modelling (Encoder-decoder), Attention mechanism, Transformer Networks – BERT, a brief introduction to Reinforcement learning for NLP. Introduction to NLP applications: Sentiment Analysis, Machine translation, Question Answering, Text summarization.
