CSL772: Natural Language Processing - Autumn 2014
Monday and Thursday, 5:00-6:20 pm, in Bharti 201


Instructor: Mausam
(mausam at cse dot iitd dot ac dot in)
Office hours: by appointment, SIT Building Room 402
TAs (Office hours, by appointment):
Yashoteja Prabhu, yashoteja.prabhu AT gmail.com
Ankit Anand, ankit.s.anand AT gmail.com
Neetu Jindal, er.neetu28 AT gmail.com

Course Contents

NLP concepts: Tokenization, lemmatization, part-of-speech tagging, noun phrase chunking, named entity recognition, coreference resolution, parsing, information extraction, sentiment analysis, question answering, text classification, document clustering, document summarization, discourse, and machine translation.
Machine learning concepts: Naive Bayes, Hidden Markov Models, Expectation Maximization, Conditional Random Fields, MaxEnt classifiers, and Probabilistic Context-Free Grammars.

Schedule

Each entry lists the lecture dates, the topic (slides), and the associated readings and materials (required and recommended).

Jul 24 - Jul 31: Introduction
    Readings: J&M Ch 1
Jul 31: Regular Languages and Finite State Automata
    Readings: J&M Ch 2
Jul 31 - Aug 11: Morphology with Finite State Transducers
    Readings: J&M Ch 3
Aug 4 - Aug 11: Text Categorization using Naive Bayes
    Readings: Notes (Sections 1-4), SLP3 (up to Section 6.3), Performance Measures, Error Correcting Output Codes
Aug 8 - Aug 19: Assignment 1
    Materials: Raw Data (Ver 2), Class List, Format Checker, F-Score Calculator
Aug 21 - Aug 25: Sentiment Mining
    Readings: Survey (Sections 1-4.5), Tutorial (Sections 1-5), Semantic Orientation of Adjectives, Unsupervised Classification of Reviews
Aug 25 - Sep 4: Log Linear Models
    Readings: Notes (Section 2), SLP3 (Section 6.4), Max Entropy models for WSD
Aug 25 - Sep 16: Assignment 2
Sep 2 - Sep 23: Project (Part 1)
Sep 4 - Sep 8: Language Models
    Readings: J&M Ch 4, SLP3, Empirical Comparison of Smoothing Techniques
Sep 11: POS Tagging with Hidden Markov Models
    Readings: J&M Ch 6.1-6.5, SLP3 (Ch 7, 8.1-8.4)
Sep 15 - Sep 18: Named Entity Recognition with MEMMs
    Readings: J&M Ch 6.7-6.8 and 13.5; MEMMs (Section 8.5); Non-Local Features and Knowledge in NER; Unsupervised Person Name Recognition
Sep 18: Brown Clustering
    Readings: Thesis (Chapter 4), Brown Clustering for NER
Sep 22: Conditional Random Fields for NER and POS Tagging
    Readings: Notes (Section 4), Detailed Notes, Label Bias (Section 2), Twitter NER (Sections 1-3.1)
Sep 22 - Sep 25: Information Extraction
    Readings: J&M Section 22.2, Background Knowledge in IE, Distant Supervision in IE, MultiR
Sep 29: Guest Lecture by L V Subramaniam
Oct 13: Open Information Extraction
    Readings: ReVerb, OLLIE
Oct 15 - Nov 4: Assignment 3
Oct 16: Document Similarity in Information Retrieval
    Readings: IR Textbook (Chapters 2, 6.2-6.3), LSA and PLSA
Nov 10 - Nov 13: Statistical Natural Language Parsing
    Readings: J&M Ch 12 and 14; Lecture Notes on PCFGs; Lecture Notes on Lexicalized PCFGs; Latent Variable models for Parsing
Nov 17: Other NLP Tasks and Discussion
    Readings: Noam Chomsky on ML, Noam Chomsky on AI, Peter Norvig on Chomsky, Coherent Multi-Document Summarization

Textbook and Readings

Dan Jurafsky and James Martin, Speech and Language Processing, 2nd Edition,
Prentice-Hall (2008) (required).

Dan Jurafsky and James Martin, Speech and Language Processing, 3rd Edition
(under development).

Grading

Assignments: 30%; Project: 20%; Minor exams: 10% each; Final: 30%; Class participation and online discussions: extra credit.

Course Administration and Policies

Cheating vs. Collaborating Guidelines

As adapted from Dan Weld's guidelines.

Collaboration is a very good thing. On the other hand, cheating is considered a very serious offense. Please don't do it! Concern about cheating creates an unpleasant environment for everyone. If you cheat, you will receive a zero on the assignment, and you also risk losing your position as a student in the department and the institute. The department's policy on cheating is to report any cases to the disciplinary committee. What follows afterwards is not fun.

So how do you draw the line between collaboration and cheating? Here's a reasonable set of ground rules. Failure to understand and follow these rules will constitute cheating, and will be dealt with as per institute guidelines.