| Start | End | Slides | Required Readings | Recommended Readings |
| --- | --- | --- | --- | --- |
| Jan 5 | Jan 8 | Introduction | History of NLP | Noam Chomsky on ML; Noam Chomsky on AI; Peter Norvig on Chomsky |
| Jan 8 | Jan 12 | NLP Tasks | NLP Tasks; Phases of NLP | |
| Jan 12 | Jan 15 | Text Categorization using Classical ML | Notes (Sections 1-4); SLP3 Ch 4 | Gender in Job Postings; Useful Things about ML; Performance Measures |
| Jan 15 | Feb 2 | Lexical Semantics with Word2Vec & GloVe | SLP3 Ch 5; Goldberg Ch 8; Embeddings vs. Factorization | Issues with Word Embeddings; Equivalence of Embedding & Factorization |
| Jan 19 | Feb 9 | Assignment 1 | Resources | |
| Feb 2 | Feb 5 | Sentiment Mining with CNNs | Survey (Sections 1-4.5); Goldberg Ch 13 | Practitioner's Guide to CNNs |
| Feb 5 | Feb 9 | Sequence Labeling with RNNs | SLP3 Ch 17.1-17.3; SLP3 Ch 13.1, Ch 13.3-13.6; Understanding LSTMs; Deriving LSTMs | Pooling in RNNs; RNNs and Vanishing Gradients (Section 4.3) |
| Feb 12 | Feb 16 | Attention & Transformer Encoder | SLP3 Ch 8.1-8.4; Chakraborty Ch 6.1, 6.2, 6.4; The Illustrated Transformer | Contextual Embeddings; Attention is All You Need; The Annotated Transformer; The BackStory of Transformer |
| Feb 16 | Mar 9 | N-Gram & Neural Language Models | SLP3 Ch 3.1-3.5, 8.5, 13.2; Chakraborty Ch 4.1, 4.3, 5.2, 6.3 | SLP3 Ch 3.6-3.7 |
| Feb 17 | Mar 23 | Assignment 2 | | |
| Feb 19 | Mar 9 | Dissecting Transformers: Tokenizers (Guest Lecture: Kausik Hira) | Chakraborty Ch 2.4; SLP3 Ch 2.4; SentencePiece | |
| Mar 9 | Mar 16 | Dissecting Transformers: Variants of Attention | Chakraborty Ch 6.5, Sparse Attention; Approximate Attention: LinFormer, Performer; Flash Attention, KV Caching, MQA & GQA; Multi-Head Latent Attention | Gated Attention |
| Mar 14 | Mar 14 | Dissecting Transformers: Positional Embeddings (Guest Lecture: Sudipto Ghosh) | Chakraborty Ch 6.4; Relative Position Embeddings | |
| Mar 14 | Mar 19 | Dissecting Transformers: Other Components | Pre/Post Norm; Learning Rate Scheduling | Attention Residuals |
| Mar 19 | Apr 23 | Pre-Training: BERT to Qwen | SLP3 Ch 7.1-7.3, 7.5; Chakraborty Ch 7; BERT Paper; BART; T5; GPT3 | |
| Mar 23 | Mar 23 | Parameter Efficient Fine-Tuning (Guest Lecture: Vishal Saley) | LoRA Fine Tuning; LoRA Details; Gradient Accumulation & Checkpointing | Intrinsic Dimensionality |
| Mar 26 | Apr 10 | Assignment 3 | | |
| Mar 30 | Apr 2 | Natural Language Generation & Decoding Algorithms | Penalties; Token Selection Strategies; Speculative Sampling; FUDGE: Controlled Text Generation | The Curious Case of Neural Text Degeneration; Medusa |
| | | Prompt Engineering (Self-study) | Prompting Guide | |
| Apr 6 | Apr 6 | Distillation for LLMs (Guest Lecture: Vishal Saley) | On Policy Distillation; MiniLLM | The Magic of LLM Distillation |
| Apr 9 | Apr 12 | Alignment using Instruction Tuning and RLHF | FOLLM (Sections 4-4.3); InstructGPT; RLHF Blog; Illustrating RLHF; PPO | PPO Explained |
| Apr 10 | Apr 24 | Assignment 4 | | |
| Apr 13 | Apr 23 | Reasoning using RLVR | GRPO; FunSearch; AlphaGeometry | Training GRPO at Scale |
| Apr 16 | Apr 16 | Mixture of Experts | Switch Transformer (Sections 1-4, 7); DeepSeekMoE | |
| Apr 20 | Apr 20 | Agents | DSPy | |
| Apr 27 | Apr 27 | Question Answering & Retrieval Augmented Generation | Vector Databases; Retrieval-augmented Generation; FAISS Vector Index | RAG Details |
| Apr 27 | Apr 27 | Wrap Up | | |

Textbook and Readings
- Dan Jurafsky and James Martin, Speech and Language Processing, 3rd Edition (required).
- Tanmoy Chakraborty, Introduction to Large Language Models, Wiley (2025) (required).
- Yoav Goldberg, Neural Network Methods for Natural Language Processing, Morgan and Claypool (2017).
- Sebastian Raschka, Build a Large Language Model from Scratch, Manning (2024).
Grading
Assignments: 30%; Quiz: 15%; Midterm Assignment: 20%;
Final: 35%; Class participation: extra credit.
Course Administration and Policies
- Subscribe to the class discussion group on Piazza (access code: col772).
- All programming assignments are to be done individually.
You may discuss the subject matter with other students in the class,
but all solutions, code, and writeups must be your own. In your writeup, mention the names of any students with whom you discussed the projects.
You are expected to maintain the utmost level of academic integrity in the course.
- Programming assignments may be handed in up to a week late, at a penalty of 10% of the maximum grade per day.
Cheating Vs. Collaborating Guidelines
Adapted from Dan Weld's guidelines.
Collaboration is a very good thing. On the other hand,
cheating is considered a very serious offense.
Please don't do it! Concern about cheating creates an unpleasant
environment for everyone.
If you cheat, you get a zero on the assignment, and additionally
you risk losing your position as a student in the department and the institute.
The department's policy on cheating is to report any cases to
the disciplinary committee.
What follows afterwards is not fun.
So how do you draw the line between collaboration and cheating?
Here's a reasonable set of ground rules.
Failure to understand and follow these rules will constitute cheating,
and will be dealt with as per institute guidelines.
- The Kyunki Saas Bhi Kabhi Bahu Thi Rule:
This rule says that you are free to meet with fellow student(s)
and discuss assignments with them.
Writing on a board or shared piece of paper is acceptable during the meeting;
however, you should not take any written (electronic or otherwise) record away from the meeting.
This applies when the assignment is supposed to be an individual effort or
whenever two teams discuss common problems they are each encountering
(inter-group collaboration).
After the meeting, engage in a half hour of mind-numbing activity
(like watching an episode of Kyunki Saas Bhi Kabhi Bahu Thi),
before starting to work on the assignment.
This will ensure that you are able to reconstruct what you learned from the meeting,
by yourself, using your own brain.
- The Right to Information Rule:
To ensure that all collaboration is on the level,
you must always write the name(s) of your collaborators on your assignment.
This also applies when two groups collaborate.