Syllabus

UNIT 1

Python Programming and Text Processing Foundations

Python string manipulation, regular expressions, text cleaning, tokenisation, file handling, CSV and JSON processing, PDF text extraction, API handling, and Python libraries including re, json, requests, pdfplumber, and pandas.

UNIT 2

Natural Language Processing and MongoDB for Text Understanding

Stopword removal, stemming, lemmatisation, Named Entity Recognition (NER), keyword extraction, sentiment analysis, text vectorisation, document preprocessing, MongoDB collections, CRUD operations, querying, and MongoDB integration with Python.

UNIT 3

Text Generation and Summarisation for Multilingual Systems

Pretrained language models, extractive and abstractive summarisation, multilingual translation, response generation, content creation, and Python libraries including transformers, deep-translator, and torch.

UNIT 4

Semantic Search, Retrieval-Augmented Generation (RAG), and Intelligent Document Systems

Web text extraction, semantic embeddings, cosine similarity, vector databases, document indexing, semantic search pipelines, FAQ systems, and Python libraries including sentence-transformers, langchain, chromadb, and faiss.

Objectives and Outcomes

Nature of Course

Theory & Lab

Course Objectives

  • The course covers the practical application of Python programming in scientific research, with a focus on developing data analysis and literature synthesis skills for exploring Indian textbooks in the context of Indian Knowledge Systems (IKS).
  • The course provides an overview of computational methodologies and how they relate data analysis, traditional philosophical frameworks, and scholarly communication.
  • The course focuses on applying programming tools to improve research efficiency, critical thinking, and advanced academic analysis of ancient texts and knowledge systems (IKS).

Course Outcomes

After successful completion of the course, students will be able to:

CO Course Outcomes
CO1 Apply Python and NLP techniques to process and analyse Indian Knowledge Systems (IKS) texts and multilingual cultural data.
CO2 Develop intelligent text-processing applications for extracting, organising and retrieving knowledge from Indian traditional sources.
CO3 Build multilingual AI systems for translation, summarisation and content generation related to Indian knowledge traditions.
CO4 Design semantic search and RAG-based document systems for efficient access to Indian philosophical, cultural, and heritage knowledge bases.

POs Programme Outcomes

  • PO1: Engineering Knowledge
  • PO2: Problem Analysis
  • PO3: Design/Development of Solutions
  • PO4: Conduct Investigations of Complex Problems
  • PO5: Modern Tool Usage
  • PO6: Engineer and Society
  • PO7: Environment and Sustainability
  • PO8: Ethics
  • PO9: Individual and Teamwork
  • PO10: Communication
  • PO11: Project Management and Finance
  • PO12: Lifelong Learning

CO-PO Mapping
[Affinity: 3 – high; 2 – moderate; 1 – slight]

COs PO1 PO2 PO3 PO4 PO5 PO6 PO7
CO1 3 2 2 2 1 1
CO2 3 2 2 2 2
CO3 2 2 3 2 2 2
CO4 3 3 2 2 1

Lecture and Lab Hours

Lecture and Lab Hours | Topics | Subtopics | CO | PO
1–5 | Data Structures and String Processing | Lists, tuples, dictionaries, string manipulation | CO1 | PO1
6–10 | Regular Expressions and File Processing | Regular expressions, text cleaning, tokenisation, file handling | CO1 | PO2
11–15 | Document and API Processing | CSV, JSON, PDF text extraction, API handling | CO1 | PO4
16–18 | Python Libraries for Text Processing | re, json, requests, pdfplumber, pandas | CO1 | PO4
19–21 | NLP Fundamentals | Tokenisation, stopword removal, stemming, lemmatisation | CO2 | PO1
22–24 | Information Extraction Techniques | Named Entity Recognition, keyword extraction, sentiment analysis | CO2 | PO3
MIDTERM EXAMINATION
25–27 | Text Vectorisation and MongoDB | Text vectorisation, document preprocessing, MongoDB collections | CO2 | PO4
28–30 | MongoDB Operations with Python | CRUD operations, querying, JSON handling, MongoDB integration with Python | CO2 | PO2
31–33 | Text Generation and Translation | Prompt engineering, pretrained language models, summarisation, multilingual translation | CO3 | PO1
34–37 | Transformer Libraries | transformers, deep-translator, torch, pandas | CO3 | PO4
38–39 | Semantic Search and Retrieval | Embeddings, cosine similarity, vector databases, document indexing | CO4 | PO2
40–41 | RAG and Intelligent Document Systems | Semantic search pipelines, chatbot systems, FAQ systems, langchain, chromadb, faiss | CO4 | PO4
END SEMESTER EXAM

Evaluation Pattern

Course Category | L-T-P | Internal : External | Internal (%) | External (%) | Mid-Term (%) | Continuous Evaluation – Theory (%) | Continuous Evaluation – Lab (%)
Theory with Lab Component | 2-0-2 | 70 : 30 | 70 | 30 | 20 | 10 | 40

Total: 100

  • Continuous Assessment (Theory): 10% – Assignments and class participation
  • Mid-Term Examination: 20% – Lab examination covering theory topics
  • Continuous Assessment (Lab): 40% – Project and practical performance
  • End Semester Theory Examination: 30% – Lab examination covering the complete syllabus

Faculty Information

Name: Dr. Sooraj Rajendran
Designation: Assistant Professor
Email: soorajrajendran@am.amrita.edu

Reference Books

  • Sweigart, A. (2019). Automate the boring stuff with Python: Practical programming for total beginners (2nd ed.). No Starch Press.
  • Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python. O’Reilly Media.
  • Jurafsky, D., & Martin, J. H. (2025). Speech and language processing (3rd ed. draft). Stanford University.
  • Banker, K. (2011). MongoDB in action. Manning Publications.
  • Bradshaw, S., Brazil, E., & Chodorow, K. (2019). MongoDB: The definitive guide (3rd ed.). O’Reilly Media.
