B. Tech. (4th Year)
(Common for CSE with specialization in Data Science, CSE with specialization in Cloud Technology & Information Security)
BCSE-560 (Natural Language Processing - NLP)
Course Objectives:
To familiarize students with the fundamental concepts and techniques of natural language processing (NLP).
To develop an in-depth understanding of the computational properties of natural languages and the commonly used algorithms for processing linguistic information.
To understand the basic processes and representations used in syntax, semantics and other components of natural language processing.
To relate mathematical foundations and probability theory to linguistic essentials such as syntactic and semantic analysis of text.
To extract information from text automatically using concepts and methods from natural language processing.
Unit 1:
Python Text and NLP Basics: Introduction to Python Text Basics, Working with Text Files with Python, Working with PDFs, Introduction to Regular Expressions, Finding Patterns in Text, Substituting Patterns in Text, Shorthand Character Classes, Character Ranges, Text Preprocessing using Regex. Introduction to Natural Language Processing, NLP applications, the challenge of variety and ambiguity of language, the role of machine (deep) learning in NLP, spaCy Basics, Tokenization, Stemming, Lemmatization, Stop Words, Phrase Matching, and Vocabulary.
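As an illustration of the regular-expression topics in this unit (finding patterns, substituting patterns, shorthand character classes, character ranges, and regex-based preprocessing), the following is a minimal sketch using only Python's standard `re` module. The sample sentence and the `<PHONE>` placeholder are assumptions made for the example and are not part of the syllabus.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Call 555-1234 or 555-9876 before 10 AM."

# Finding patterns: \d is the shorthand character class for a digit.
phones = re.findall(r"\d{3}-\d{4}", text)

# Substituting patterns: replace every phone number with a placeholder.
masked = re.sub(r"\d{3}-\d{4}", "<PHONE>", text)

# Character ranges: after lowercasing, drop everything except a-z and spaces.
cleaned = re.sub(r"[^a-z ]", "", text.lower())
# Collapse the runs of whitespace left behind by the removal.
cleaned = re.sub(r"\s+", " ", cleaned).strip()

# Naive whitespace tokenization of the cleaned text.
tokens = cleaned.split()

print(phones)
print(masked)
print(tokens)
```

Whitespace splitting is the simplest possible tokenizer; the spaCy tokenizer covered later in the unit also handles punctuation and contractions.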
Unit 2:
Part of Speech Tagging and Named Entity Recognition: Introduction to POS and NER, Parts of Speech Tagging, POS Tag Meanings, Named Entity Recognition, Sentence Segmentation, Text Modelling using the Bag of Words Model, Building the BOW Model, Text Modelling using the TF-IDF Model, Building the TF-IDF Model, Understanding the N-Gram Model, Building Character N-Gram Model, Building Word N-Gram Model, Understanding Latent Semantic Analysis, LSA in Python, Word Synonyms and Antonyms using NLTK, Word Negation Tracking in Python.
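The bag-of-words, TF-IDF and n-gram models listed in this unit can be sketched in plain Python as follows. This is an illustrative sketch rather than the course's reference implementation: the three toy documents and the helper names (`tf_idf`, `word_ngrams`, `char_ngrams`) are assumptions, and the weighting uses the plain `log(N / df)` form, which is only one of several common TF-IDF variants.

```python
import math
from collections import Counter

# Hypothetical toy corpus for illustration.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]
tokenized = [d.split() for d in docs]

# Bag of words: raw term counts per document.
bow = [Counter(tokens) for tokens in tokenized]

# Document frequency: number of documents containing each term.
df = Counter(term for tokens in tokenized for term in set(tokens))


def tf_idf(term, doc_index):
    """TF-IDF with tf = count / document length and idf = log(N / df)."""
    tf = bow[doc_index][term] / len(tokenized[doc_index])
    idf = math.log(len(docs) / df[term])
    return tf * idf


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return all contiguous character n-grams as strings."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(bow[0])
print(tf_idf("cat", 0), tf_idf("the", 0))
print(word_ngrams(tokenized[0], 2))
print(char_ngrams("text", 3))
```

In the first document, "cat" receives a higher weight than "the" even though "the" occurs more often, because "the" appears in two of the three documents.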
Unit 3:
Text Classification and Text Summarization: Getting the data for Text Classification, Importing the dataset, Persisting the dataset, Preprocessing the data, Transforming data into BOW Model, Transforming BOW Model into TF-IDF Model, Creating training and test sets, Understanding Logistic Regression, Training classifier, Testing Model performance, Saving and Importing Model. Understanding Text Summarization, Fetching article data from the web, Parsing the data using Beautiful Soup, Preprocessing the data, Tokenizing Article into sentences, Building the histogram, Calculating the sentence scores.
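The summarization steps named at the end of this unit (tokenizing an article into sentences, building a word histogram, and calculating sentence scores) might look roughly like the sketch below. It assumes a frequency-based extractive approach; the tiny `STOP_WORDS` set, the regex sentence splitter and the `summarize` function are hypothetical stand-ins for the NLTK and Beautiful Soup tooling the unit covers.

```python
import re
from collections import Counter

# Tiny hypothetical stop-word list; a real one would be much larger.
STOP_WORDS = {"the", "a", "is", "of", "and", "to", "in", "it"}


def summarize(article, top_n=1):
    """Return the top_n highest-scoring sentences in their original order."""
    # Tokenize into sentences: split on whitespace that follows . ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())

    # Build the histogram of non-stop-word frequencies.
    words = re.findall(r"[a-z]+", article.lower())
    histogram = Counter(w for w in words if w not in STOP_WORDS)
    if not histogram:
        return []

    # Normalize counts by the most frequent word.
    max_count = max(histogram.values())
    weights = {w: c / max_count for w, c in histogram.items()}

    # Score each sentence as the sum of its word weights.
    scores = {
        s: sum(weights.get(w, 0.0) for w in re.findall(r"[a-z]+", s.lower()))
        for s in sentences
    }

    best = sorted(sentences, key=lambda s: scores[s], reverse=True)[:top_n]
    return [s for s in sentences if s in best]


article = "NLP is fun. NLP models process text and NLP tools tokenize text. Cats sleep."
print(summarize(article, top_n=1))
```

Summing weights favors longer sentences; a common refinement is to skip sentences above a length limit or to divide the score by sentence length.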
Unit 4:
Semantics and Sentiment Analysis: Introduction to Semantics and Sentiment Analysis, Overview of Semantics and Word Vectors, Semantics and Word Vectors with spaCy, Sentiment Analysis Overview, Sentiment Analysis with NLTK, Sentiment Analysis Movie Review Project. Twitter Sentiment Analysis: Setting up the Twitter Application, Initializing Tokens, Client Authentication, Fetching real-time tweets, Loading TF-IDF Model and Classifier, Preprocessing the tweets, Predicting sentiments of tweets, Plotting the results.
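Word vectors, as introduced in this unit, are usually compared with cosine similarity. The sketch below shows that computation in plain Python. The three-dimensional vectors are made-up values for illustration only; real spaCy vectors have hundreds of dimensions and are learned from data.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, not taken from any trained model.
vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "apple": [0.1, 0.2, 0.95],
}

print(cosine_similarity(vectors["king"], vectors["queen"]))
print(cosine_similarity(vectors["king"], vectors["apple"]))
```

With these toy values, "king" is closer to "queen" than to "apple", which is the kind of relationship trained word vectors are expected to capture.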
Course Outcome:
After completion of this course, students will
- Be able to understand & explain concepts and tasks in natural language processing.
- Gain a foundational understanding of the methods and evaluation metrics for various natural language processing tasks.
- Gain practical experience in the available NLP toolkits.
- Gain basic skills for conducting NLP research, including reading and analyzing research papers, analyzing results, and improving existing approaches.
- Analyse the syntax, semantics and pragmatics of a statement written in natural language.
Text Books:
1. Christopher D. Manning and Hinrich Schütze, "Foundations of Statistical Natural Language Processing", 6th Edition, The MIT Press, Cambridge, Massachusetts; London, England, 2003.
2. Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, Harshit Surana, "Practical Natural Language Processing", July 2020.
3. Steven Bird, Ewan Klein, Edward Loper, "Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit" (Greyscale Indian Edition), January 2011.
4. Dwight Gunning, S.G. (2019), "Natural Language Processing Fundamentals: Build Intelligent Applications that Can Interpret the Human Language to Deliver Impactful Results", USA: Packt Publishing.
Reference Books:
1. Sue Knight, "NLP at Work: The Essence of Excellence", 3rd Edition (People Skills for Professionals).
2. Uday Kamath, John Liu, James Whitaker, "Deep Learning for NLP and Speech Recognition", published on August 14, 2020.
3. Jacob Eisenstein, "Introduction to Natural Language Processing", published on October 1, 2019.