Start | End | Topic (Slides) | Required Readings | Recommended Readings |
Aug 10 | Aug 17 | Introduction | J&M Ch 1 | Recent Advances in NLP |
Aug 17 | Aug 21 | Regular Languages and Finite State Automata | SLP3 Ch 2 | |
Aug 21 | Aug 24 | Finite State Transducers | J&M Ch 3 | |
Aug 24 | Aug 31 | Classical Text Categorization: Naive Bayes, Logistic Regression | Notes (Sections 1-4); SLP3 Ch 4 | Gender in Job Postings; Useful Things about ML; Performance Measures |
Aug 28 | Sep 11 | Assignment 1 | Resources | |
Aug 31 | Sep 3 | Sentiment Mining and Lexicon Learning | Survey (Sections 1-4.5); Tutorial (Sections 1-5); SLP3 Ch 19 | Semantic Orientation of Adjectives; Unsupervised Classification of Reviews |
Sep 7 | Sep 7 | Vector Spaces in Information Retrieval | SLP3 Ch 6.1-6.6 | LSA and PLSA; Detailed Tutorial on LDA |
Sep 10 | Sep 10 | An Intro to Deep Learning for NLP | Goldberg 2, 4, 5 | |
Sep 14 | Sep 30 | Assignment 2 | Resources | |
Sep 14 | Sep 17 | Representation Discovery: Word2Vec & GloVe | Goldberg 8.1-8.4, 10, 11; SLP3 6.7-6.11; Embeddings vs. Factorization | Contextual Embeddings; Trends and Future Directions on Word Embeddings |
Sep 17 | Sep 17 | N-gram Features with CNNs | Goldberg 13 | Practitioner's Guide to CNNs |
Sep 24 | Sep 28 | RNNs for Variable Length Sequences | Goldberg 14.1-14.3.1, 14.4-14.5; Goldberg 15, 16.1.1, 16.2.2; Understanding LSTMs; Deriving LSTMs; Pooling in RNNs | RNNs and Vanishing Gradients (Section 4.3) |
Sep 28 | Sep 28 | Tricks for Training RNNs | Deep Learning for NLP Best Practices | |
Sep 30 | Oct 11 | Assignment 3 | | |
Oct 1 | Oct 1 | Attention & Transformer | Goldberg 17.1, 17.2, 17.4; Attention is All You Need; The Illustrated Transformer | Reformer |
Oct 5 | Oct 5 | N-Gram Language Models | SLP3 Ch 3; Goldberg 9.1-9.3 | |
Oct 12 | Oct 12 | Neural & Pre-Trained Language Models | Goldberg 9.4-9.5; SLP3 Ch 10; BERT Paper | ELMo Paper; GPT2 Paper |
Oct 12 | Oct 24 | Assignment 3 (Part B) | | |
Oct 22 | Oct 22 | Advanced Pre-training for Language Models | BART; T5; Pre-training tasks in ERNIE 2.0 (Section 4) | XLNet; ALBERT |
Oct 23 | Oct 23 | GPT3 & Beyond: Few-Shot Learning, Prompt Learning | GPT3; Adapter Tuning | GPT3 Explained; Prefix Tuning |
Oct 23 | Oct 26 | Multilingual NLP | Sentence Piece Tokenization; mBART; XLM-R; BLEU | Typology of Languages |
Oct 26 | Nov 12 | Assignment 4 | | |
Oct 26 | Oct 29 | Neural CRF and Learning with Constraints for Sequence Labeling | Goldberg 19.1-19.3, 19.4.2; Bidirectional LSTM-CRF Models; Learning with Constraints | |
Oct 29 | Nov 10 | Statistical Natural Language Parsing | SLP3 Ch 12.1-12.5, 13.1-13.2, 14.1-14.3; Lecture Notes on PCFGs; Lecture Notes on Lexicalized PCFGs | |
Nov 11 | Nov 11 | Fairness & Ethics in NLP | | |
Nov 11 | Nov 11 | Recursive Neural Networks | Goldberg 18 | |
Nov 11 | Nov 11 | Wrap Up | | |
Textbook and Readings
Yoav Goldberg
Neural Network Methods for Natural Language Processing,
Morgan and Claypool (2017) (required).
Dan Jurafsky and James Martin
Speech and Language Processing, 3rd Edition,
(under development).
Grading
Assignments: 50%; Midterm: 20%;
Final: 30%; Class participation, online discussions: extra credit.
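As a rough illustration of how the weights above combine, here is a minimal sketch. It assumes each component is reported on a 0-100 scale, and it leaves out extra credit because the page does not say how much it is worth. The names `WEIGHTS` and `course_score` are hypothetical and not part of any course tooling.

```python
# Weights copied from the Grading section (percent of the course grade).
# Extra credit is omitted: its size is not stated on the page.
WEIGHTS = {"assignments": 50, "midterm": 20, "final": 30}


def course_score(component_percentages):
    """Combine per-component scores (each assumed to be 0-100) into one score."""
    missing = set(WEIGHTS) - set(component_percentages)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    weighted = sum(WEIGHTS[name] * component_percentages[name] for name in WEIGHTS)
    return weighted / 100


example = course_score({"assignments": 80, "midterm": 70, "final": 90})
print(example)  # 81.0
```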
Course Administration and Policies
- Subscribe to the class discussion group on Piazza. (access code: col772)
- All programming assignments are to be done individually.
You may discuss the subject matter with other students in the class,
but all solutions, code, and writeups must be your own. In your writeup, mention the names of any students with whom you discussed the projects.
You are expected to maintain the utmost level of academic integrity in the course.
- Programming assignments may be handed in up to a week late, at a penalty of 10% of the maximum grade per day.
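The late-submission rule above can be read as simple arithmetic. The sketch below is one possible reading, not the official grading procedure. It assumes lateness is counted in whole days, that a submission more than a week late earns zero, and that a score cannot drop below zero. None of these three points is stated on this page, and the function name `late_adjusted_score` is hypothetical.

```python
LATE_WINDOW_DAYS = 7          # "up to a week late"
PENALTY_PERCENT_PER_DAY = 10  # "10% of the maximum grade per day"


def late_adjusted_score(raw_score, max_score, days_late):
    """Apply the per-day late penalty to a raw assignment score."""
    if days_late <= 0:
        return float(raw_score)  # on time: no penalty
    if days_late > LATE_WINDOW_DAYS:
        return 0.0  # assumption: not accepted after one week
    penalty = PENALTY_PERCENT_PER_DAY * days_late * max_score / 100
    return max(0.0, raw_score - penalty)  # assumption: floor at zero


# A submission worth 80 out of 100, handed in 3 days late, loses 30 points.
print(late_adjusted_score(80, 100, 3))  # 50.0
```

Under this reading, a submission that is seven days late loses 70% of the maximum grade, so handing in late is only worthwhile for work that would otherwise score well.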
Cheating Vs. Collaborating Guidelines
Adapted from
Dan Weld's guidelines.
Collaboration is a very good thing. On the other hand,
cheating is considered a very serious offense.
Please don't do it! Concern about cheating creates an unpleasant
environment for everyone.
If you cheat, you get a zero in the assignment, and additionally
you risk losing your position as a student in the department and the institute.
The department's policy on cheating is to report any cases to
the disciplinary committee.
What follows afterwards is not fun.
So how do you draw the line between collaboration and cheating?
Here's a reasonable set of ground rules.
Failure to understand and follow these rules will constitute cheating,
and will be dealt with as per institute guidelines.
- The Kyunki Saas Bhi Kabhi Bahu Thi Rule:
This rule says that you are free to meet with fellow student(s)
and discuss assignments with them.
Writing on a board or shared piece of paper is acceptable during the meeting;
however, you should not take any written (electronic or otherwise) record away from the meeting.
This applies when the assignment is supposed to be an individual effort or
whenever two teams discuss common problems they are each encountering
(inter-group collaboration).
After the meeting, engage in a half hour of mind-numbing activity
(like watching an episode of Kyunki Saas Bhi Kabhi Bahu Thi),
before starting to work on the assignment.
This will ensure that you are able to reconstruct what you learned from the meeting,
by yourself, using your own brain.
- The Right to Information Rule:
To ensure that all collaboration is on the level,
you must always write the name(s) of your collaborators on your assignment.
This also applies when two groups collaborate.