Start | End | Slides | Required Readings | Recommended Readings
---|---|---|---|---
Jan 2 | Jan 12 | Introduction | J&M Ch 1 | History of NLP
Jan 12 | Jan 12 | Regular Languages and Finite State Automata | SLP3 Ch 2 | Regular Expressions Demo
Jan 16 | Jan 16 | Finite State Transducers | J&M Ch 3 |
Jan 16 | Jan 23 | Classical Text Categorization: Naive Bayes, Logistic Regression | Notes (Sections 1-4); SLP3 Ch 4; SLP3 Ch 5 | Gender in Job Postings; Useful Things about ML; Performance Measures
Jan 23 | Jan 23 | Sentiment Mining and Lexicon Learning | Survey (Sections 1-4.5); Tutorial (Sections 1-5); SLP3 Ch 25 | Semantic Orientation of Adjectives; Unsupervised Classification of Reviews
Jan 24 | Jan 24 | Vector Spaces in Information Retrieval | SLP3 Ch 6.1-6.6 | LSA and PLSA; Detailed Tutorial on LDA
Jan 24 | Jan 24 | An Intro to Deep Learning for NLP | Goldberg 2, 4, 5 |
Jan 29 | Feb 12 | Assignment 1 | Resources |
Jan 30 | Jan 30 | A Beginner's Introduction to Large Language Models | |
Feb 2 | Feb 9 | Representation Discovery: Word2Vec & GloVe | Goldberg 8.1-8.4, 10, 11; SLP3 6.8-6.12; Embeddings vs. Factorization | Contextual Embeddings; Trends and Future Directions on Word Embeddings
Feb 9 | Feb 9 | N-gram Features with CNNs | Goldberg 13 | Practitioner's Guide to CNNs
Feb 13 | Feb 14 | RNNs for Variable Length Sequences | Goldberg 14.1-14.3.1, 14.4-14.5; Goldberg 15, 16.1.1, 16.2.2; SLP3 9.1, 9.3-9.5; Understanding LSTMs; Deriving LSTMs | Pooling in RNNs; RNNs and Vanishing Gradients (Section 4.3)
Feb 14 | Feb 16 | Attention & Transformer | Goldberg 17.1, 17.2, 17.4; SLP3 10.1-10.5; Attention is All You Need; The Illustrated Transformer | The Annotated Transformer; The BackStory of Transformer
Feb 24 | Mar 18 | Assignment 2 | Resources |
Mar 4 | Mar 4 | Neural CRF for Sequence Labeling | Goldberg 19.1-19.3, 19.4.2; Bidirectional LSTM-CRF Models |
Mar 5 | Mar 5 | Introduction to N-Gram Language Models | SLP3 Ch 3.1-3.4; Goldberg 9.1-9.3 | SLP3 Ch 3.5-3.9
Mar 5 | Mar 12 | Neural Language Models | Goldberg 9.4-9.5; SLP3 9.2, 9.6-9.8; SLP3 10.6 |
Mar 12 | Mar 12 | Tricks for Training Neural Models | Deep Learning for NLP Best Practices |
Mar 15 | Mar 19 | Pre-Trained Language Models: BERT to GPT-4 | SLP3 10.9; SLP3 11.1, 11.2; BERT Paper; BART; T5; GPT-3 |
Mar 19 | Mar 22 | Instruction Fine-Tuning & RLHF (Guest Lecture: Pranjal Aggarwal) | InstructGPT; RLHF Blog; Illustrating RLHF |
Mar 20 | Apr 4 | Assignment 2.1 | Starter Code |
Apr 1 | Apr 25 | Assignment 3 | |
Apr 2 | Apr 2 | Efficient Training & Training at Scale for LLMs (Guest Lecture: Dinesh Raghu & Gaurav Pandey) | Adapter Tuning; LoRA Fine-Tuning; DeepSpeed ZeRO | Intrinsic Dimensionality; Mixed Precision Training
Apr 5 | Apr 5 | Efficient Inference for LLMs (Guest Lecture: Vishal Saley) | A Gentle Introduction to 8-bit Matrix Multiplication; Pruning Basics: Revisiting Loss Modelling for Unstructured Pruning; SparseGPT; Wanda; FlashAttention | LLM.int8(); Optimal Brain Surgeon
Apr 9 | Apr 9 | Natural Language Generation | Token Selection Strategies; Speculative Sampling; FUDGE: Controlled Text Generation | The Curious Case of Neural Text Degeneration
Apr 10 | Apr 10 | Retrieval Augmentation in LLMs (Guest Lecture: Yatin Nandwani) | Vector Databases; Retrieval-Augmented Generation; FAISS Vector Index; Fine-tuning RAG Architecture | RAG Details
Apr 12 | Apr 12 | Ethical Issues in NLP (Guest Lecture: Danish Pruthi) | Survey on Bias in NLP | Debiasing Techniques
Apr 16 | Apr 16 | Reasoning with LLMs (Guest Lecture: Chinmay Mittal) | Prompting Techniques; Logic-LM; FunSearch; AlphaGeometry |
Apr 23 | Apr 27 | Statistical Natural Language Parsing | SLP3 Ch 17.1-17.6, Appendix C; Lecture Notes on PCFGs; Lecture Notes on Lexicalized PCFGs |
Textbook and Readings
- Yoav Goldberg, Neural Network Methods for Natural Language Processing, Morgan and Claypool (2017) (required).
- Dan Jurafsky and James Martin, Speech and Language Processing, 3rd Edition (under development).
Grading
Assignments: 45%; Quiz: 10%; Midterm: 20%;
Final: 25%; Class participation: extra credit.
Course Administration and Policies
- Subscribe to the class discussion group on Piazza (access code: col772).
- All programming assignments are to be done individually.
You may discuss the subject matter with other students in the class,
but all solutions, code, and writeups must be your own. In your writeup, mention the names of any students with whom you discussed the assignments.
You are expected to maintain the utmost level of academic integrity in the course.
- Programming assignments may be handed in up to a week late, at a penalty of 10% of the maximum grade per day.
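As a worked illustration of the late policy, here is a minimal sketch of the deduction. The function name is mine, and two details are my assumptions rather than stated policy: the adjusted score is floored at zero, and submissions more than a week late receive no credit.

```python
def late_adjusted_score(raw_score: float, max_score: float, days_late: int) -> float:
    """Apply the late policy: 10% of the maximum grade is deducted per day late.

    Assumptions (not stated in the policy): submissions more than 7 days
    late score 0, and the adjusted score never goes below 0.
    """
    if days_late <= 0:
        return raw_score          # on time: no deduction
    if days_late > 7:
        return 0.0                # assumed: beyond one week, no credit
    penalty = 0.10 * max_score * days_late
    return max(0.0, raw_score - penalty)

# Example: a 90/100 submission handed in 3 days late loses 30 points.
print(late_adjusted_score(90.0, 100.0, 3))  # 60.0
```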
Cheating vs. Collaborating Guidelines
As adapted from
Dan Weld's guidelines.
Collaboration is a very good thing. On the other hand,
cheating is considered a very serious offense.
Please don't do it! Concern about cheating creates an unpleasant
environment for everyone.
If you cheat, you get a zero on the assignment, and you additionally
risk losing your position as a student in the department and the institute.
The department's policy on cheating is to report any cases to
the disciplinary committee.
What follows afterwards is not fun.
So how do you draw the line between collaboration and cheating?
Here's a reasonable set of ground rules.
Failure to understand and follow these rules will constitute cheating,
and will be dealt with as per institute guidelines.
- The Kyunki Saas Bhi Kabhi Bahu Thi Rule:
This rule says that you are free to meet with fellow student(s)
and discuss assignments with them.
Writing on a board or shared piece of paper is acceptable during the meeting;
however, you should not take any written (electronic or otherwise) record away from the meeting.
This applies when the assignment is supposed to be an individual effort or
whenever two teams discuss common problems they are each encountering
(inter-group collaboration).
After the meeting, engage in a half hour of mind-numbing activity
(like watching an episode of Kyunki Saas Bhi Kabhi Bahu Thi),
before starting to work on the assignment.
This will ensure that you are able to reconstruct what you learned from the meeting,
by yourself, using your own brain.
- The Right to Information Rule:
To ensure that all collaboration is on the level,
you must always write the name(s) of your collaborators on your assignment.
This also applies when two groups collaborate.