CS 904: Special Topics in Artificial Intelligence
Natural Language Processing
January-April 2002
Instructor: Dr. L. Venkata Subramaniam,
IBM India Research Lab, Delhi.
Description: In this course we will introduce statistical techniques for inferring
structure from text. The aim of the course is to introduce existing techniques in
statistical NLP and to stimulate thinking about how to improve them. We will look in
detail at applications of NLP in Information Retrieval, Information Extraction,
Dialog-based Transaction Systems, and Machine Translation.
Prerequisites:
- Programming skills (any favourite language will do).
- Basic maturity in mathematics.
- We will build on all key concepts in class.
Text:
Supplementary Reading:
Grading:
- Minor I & II: 30
- Major: 20
- Homework: 30
- Project: 20
Class Details:
- Class Location: CSE Seminar Room.
- Class Timings: Tu and Th 5.00 PM - 6.30 PM.
Homework Policy:
Assignments will typically be handed out in the first class of the week and will be
due in the first class of the following week. Solutions will be posted within two weeks.
In parallel, longer-duration projects will also be handed out. Late assignments will not
be graded. Innovative and good ideas presented in solutions to assignments will be
rewarded with bonus marks. Homework is to be done individually. If any help is needed,
please consult only the instructor.
Class Notes:
- Jan. 02, 2002............Course Information [ppt]
- Jan. 02, 2002............Introduction [ppt]
- Jan. 04, 2002............Mathematical Foundations [ppt] slides by Barbara Rosario (UC Berkeley) [reproduced with author's permission].
- Jan. 10, 2002............Linguistic Essentials [ppt]
- Jan. 10, 2002............Statistics and Linguistics [ppt]
- Jan. 10, 2002............Corpus-Based Work [ppt]
- Jan. 15, 2002............Collocations [ppt]
- Jan. 17, 2002............Statistical Inference: n-grams [ppt]
- Jan. 22, 2002............n-grams in Speech Recognition [ppt] and in Machine Translation [ppt]
- Jan. 24, 2002............Wordsense Disambiguation [ppt]
- Jan. 31, 2002............Web Search Engines [ppt]
- Feb. 07, 2002............Lexical Acquisition [pdf] slides by Manning and Schütze [link to their site].
- Feb. 15, 2002............Latent Semantic Analysis [ppt] Guest Lecture by Dharmendra P. Kanejiya (IIT Delhi).
- Feb. 21, 2002............Applications of Latent Semantic Analysis [ppt] Guest Lecture by Dharmendra P. Kanejiya (IIT Delhi).
- Feb. 26, 2002............Hidden Markov Models [ppt] slides by David Blei (UC Berkeley) [reproduced with author's permission].
- Feb. 28, 2002............Parts-of-Speech Tagging [ppt]
- Mar. 05, 2002...........Probabilistic Context Free Grammars [pdf] [ps] slides by Manning and Schütze [link to their site].
- Mar. 21, 2002...........Probabilistic Parsing [ppt]
- Apr. 02, 2002...........Spoken Dialogue Systems [ppt]
- Apr. 04, 2002...........Clustering [ppt]
- Apr. 09, 2002...........Topics in Information Retrieval [ppt]
- Apr. 11, 2002...........Introduction to Statistical Machine Translation [ps] slides by Raghavendra Udupa (IBM, India Research Lab).
Class Discussions:
Readings:
Downloadable Assignments, Minors, Major and Project:
- Assignment 1 due Jan. 10, 2002.
- Assignment 2 due Jan. 17, 2002.
- Assignment 3 due Jan. 24, 2002.
- Assignment 4 due Feb. 12, 2002.
- Assignment 5 due Feb. 19, 2002.
- Assignment 6 due Mar. 05, 2002.
- Assignment 7 due Mar. 12, 2002.
- Minor 1 Feb. 12, 2002.
- Minor 2 Mar. 19, 2002.
- Major Apr. 18, 2002.
- Project due Apr. 16, 2002.
Scores:
Last Updated: April 23, 2002.