CS 224N / Ling 280 - Natural Language Processing: Course Description
Course Description
This course is designed to introduce students to the fundamental concepts and ideas in natural language processing (NLP),
and to get them up to speed with current research in the area. It develops an in-depth understanding of both the algorithms
available for the processing of linguistic information and the underlying computational properties of natural languages. Word-level, syntactic, and semantic processing is considered from both a linguistic and an algorithmic perspective. The focus is on modern quantitative techniques in NLP: using large corpora, statistical models for acquisition, disambiguation, and parsing. Representative systems are also examined and constructed.
Prerequisites
• Adequate experience with programming and formal structures (e.g., CS106B/X and CS103B/X).
• Programming projects will be written in Java 1.5, so knowledge of Java (or a willingness to learn on your own) is
required.
• Knowledge of standard concepts in artificial intelligence and/or computational linguistics (e.g., CS121/221 or Ling
180).
• Basic familiarity with logic, vector spaces, and probability.
Intended Audience
Graduate students and advanced undergraduates specializing in computer science, linguistics, or symbolic systems.
Textbooks
• Daniel Jurafsky and James H. Martin. 2008. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Second Edition. Prentice Hall.
The book won't be available in time for the class. (June 2008 update: it's now available for purchase!) We will use a reader containing parts of the second edition. The reader is available for ordering at University Readers: you order it online and they ship it to you. The cost is $40.58. [Detailed purchasing instructions.] Once you've ordered it, you can access online, for free, the first couple of chapters that we'll use. If you have any difficulties, please e-mail orders@universityreaders.com or call 800.200.3908, and also email the class email list. This book is referred to as J&M in the syllabus. [Book website]
• Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. MIT
Press.
Buy it at the Stanford Bookstore (recommended class text) or Amazon ($64 new). This book is referred to as M&S in the syllabus.
Please see http://nlp.stanford.edu/fsnlp/ for supplementary information about the text, including errata and pointers to online resources.
Other papers with relevant material will occasionally be posted or distributed for appropriate class lectures.
Copies of in-class hand-outs, such as readings and programming assignments, will be posted on the syllabus, and hard copies
will also be available outside Gates 158 (in front of Prof. Manning's office) while supplies last.
There will be three substantial programming assignments, each exploring a core NLP task. They are a chance to see real, close-to-state-of-the-art tools and techniques in action, and they are where students learn much of the material of the class.
Finally, there will be simple weekly online quizzes, which will aim to check that you are thinking about what you hear/read.
Course grades will be based 60% on programming assignments (20% each), 8% on the quizzes, and 32% on the final project.
Syllabus
Lecture 1 (Wed 4/2/08): Introduction
Overview of NLP. Statistical machine translation. Language models and their role in speech processing. Course introduction and administration.
No required reading.
Good background reading: M&S 1.0-1.3, 4.1-4.2. [Collaboration Policy]
Optional reading on Unix text manipulation (a useful skill!): Ken Church's tutorial Unix for Poets.
(If your knowledge of probability theory is limited, also read M&S 2.0-2.1.7. If that's too condensed, read the probability chapter of an intro statistics textbook, e.g. Rice, Mathematical Statistics and Data Analysis, ch. 1.)
Distributed today: Programming Assignment 1.
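Since Lecture 1 introduces language models, here is a minimal sketch of a maximum-likelihood bigram model in Java, the course's project language. This is illustrative background only, not course code; the class name, toy corpus, and boundary markers are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal maximum-likelihood bigram language model (illustrative sketch only). */
public class BigramLM {
    private final Map<String, Integer> unigramCounts = new HashMap<String, Integer>();
    private final Map<String, Integer> bigramCounts = new HashMap<String, Integer>();

    /** Count unigrams and bigrams in one sentence, padded with boundary markers. */
    public void train(String[] sentence) {
        String prev = "<s>";
        bump(unigramCounts, prev);
        for (String word : sentence) {
            bump(unigramCounts, word);
            bump(bigramCounts, prev + " " + word);
            prev = word;
        }
        bump(bigramCounts, prev + " </s>");
    }

    /** MLE estimate P(word | prev) = c(prev, word) / c(prev); 0 for unseen events. */
    public double prob(String prev, String word) {
        Integer joint = bigramCounts.get(prev + " " + word);
        Integer context = unigramCounts.get(prev);
        if (joint == null || context == null) return 0.0;
        return joint.doubleValue() / context.doubleValue();
    }

    private static void bump(Map<String, Integer> counts, String key) {
        Integer c = counts.get(key);
        counts.put(key, c == null ? 1 : c + 1);
    }

    public static void main(String[] args) {
        BigramLM lm = new BigramLM();
        lm.train(new String[] {"the", "cat", "sat"});
        lm.train(new String[] {"the", "dog", "sat"});
        System.out.println(lm.prob("the", "cat"));  // 0.5: "the" seen twice, "the cat" once
    }
}
```

The zero probability for unseen events is exactly the failure mode that motivates the smoothing material in section.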
Section 1 (Fri 4/11/08): Smoothing
Smoothing: absolute discounting, proving you have a proper probability distribution, Good-Turing implementation. Information theory examples and intuitions. Java implementation issues.
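To make the section's first topic concrete, here is a minimal sketch of absolute discounting for a bigram model, interpolated with a unigram backoff distribution. The class name, the discount value d = 0.75, and the API are illustrative assumptions, not the assignment's required design; Good-Turing is covered in section but not shown here.

```java
import java.util.HashMap;
import java.util.Map;

/** Bigram model with absolute discounting (illustrative sketch only). */
public class AbsoluteDiscountLM {
    private final double d = 0.75;  // discount subtracted from every seen bigram count
    private final Map<String, Integer> contextCounts = new HashMap<String, Integer>();
    private final Map<String, Integer> bigramCounts = new HashMap<String, Integer>();
    private final Map<String, Integer> followerTypes = new HashMap<String, Integer>();  // N1+(v): distinct words seen after v
    private final Map<String, Integer> unigramCounts = new HashMap<String, Integer>();
    private int totalTokens = 0;

    /** Record one bigram occurrence (prev, word) from the training corpus. */
    public void observe(String prev, String word) {
        String key = prev + " " + word;
        if (!bigramCounts.containsKey(key)) bump(followerTypes, prev);
        bump(bigramCounts, key);
        bump(contextCounts, prev);
        bump(unigramCounts, word);
        totalTokens++;
    }

    /**
     * P(word | prev) = max(c(prev,word) - d, 0) / c(prev) + lambda(prev) * P_uni(word),
     * with lambda(prev) = d * N1+(prev) / c(prev). Summing over the vocabulary gives
     * exactly 1 whenever P_uni does, which is the "proper distribution" check.
     */
    public double prob(String prev, String word) {
        Integer ctx = contextCounts.get(prev);
        if (ctx == null) return unigramProb(word);  // unseen context: back off entirely
        Integer joint = bigramCounts.get(prev + " " + word);
        double discounted = joint == null ? 0.0 : Math.max(joint - d, 0.0) / ctx;
        double lambda = d * followerTypes.get(prev) / ctx;
        return discounted + lambda * unigramProb(word);
    }

    private double unigramProb(String word) {
        Integer c = unigramCounts.get(word);
        return c == null ? 0.0 : (double) c / totalTokens;
    }

    private static void bump(Map<String, Integer> counts, String key) {
        Integer c = counts.get(key);
        counts.put(key, c == null ? 1 : c + 1);
    }
}
```

The design point worth noting: the mass removed by discounting, d * N1+(v) / c(v), is exactly the interpolation weight lambda(v), which is what makes the whole distribution sum to one.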
Lecture 10 (Mon 5/5/08): Syntax and Parsing for Context-Free Grammars (CFGs)
Parsing, treebanks, attachment ambiguities. Context-free grammars. Top-down and bottom-up parsing, empty constituents, left recursion, and repeated work. Probabilistic CFGs.
Assigned reading: J&M ch. 13, secs. 13.0-13.3.
Background reading: J&M ch. 9 (or M&S ch. 3). This is especially useful if you haven't taken any linguistics courses, but even if you have, there is useful information on treebanks and the part-of-speech tag sets used in NLP.
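Lecture 10 ends with probabilistic CFGs. The standard way to obtain one from a treebank is relative-frequency estimation, P(A -> beta) = count(A -> beta) / count(A). Here is a minimal sketch; the class name, method names, and toy rule sample are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Relative-frequency estimation of PCFG rule probabilities (illustrative sketch). */
public class PcfgEstimator {
    private final Map<String, Integer> ruleCounts = new HashMap<String, Integer>();
    private final Map<String, Integer> lhsCounts = new HashMap<String, Integer>();

    /** Record one rule occurrence read off a treebank tree, e.g. countRule("NP", "DT NN"). */
    public void countRule(String lhs, String rhs) {
        bump(ruleCounts, lhs + " -> " + rhs);
        bump(lhsCounts, lhs);
    }

    /** P(lhs -> rhs) = count(lhs -> rhs) / count(lhs). */
    public double ruleProb(String lhs, String rhs) {
        Integer rule = ruleCounts.get(lhs + " -> " + rhs);
        if (rule == null) return 0.0;
        return rule.doubleValue() / lhsCounts.get(lhs);
    }

    private static void bump(Map<String, Integer> counts, String key) {
        Integer c = counts.get(key);
        counts.put(key, c == null ? 1 : c + 1);
    }

    public static void main(String[] args) {
        PcfgEstimator est = new PcfgEstimator();
        est.countRule("NP", "DT NN");   // seen twice in the toy treebank
        est.countRule("NP", "DT NN");
        est.countRule("NP", "NP PP");   // seen once
        System.out.println(est.ruleProb("NP", "DT NN"));  // 2/3
    }
}
```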
Lecture 11 (Wed 5/7/08): Dynamic Programming for Parsing
Dynamic programming for parsing. The CKY algorithm. Accurate unlexicalized PCFG parsing.
Assigned reading: J&M sec. 13.4.
Additional information: Dan Klein and Christopher D. Manning. 2003. Accurate Unlexicalized Parsing. ACL 2003, pp. 423-430.
Due today: final project proposals.
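As orientation before the J&M reading: a minimal CKY recognizer for a grammar in Chomsky normal form. Cell chart[i][j] holds the nonterminals that can derive words i..j-1. The grammar encoding and all names are illustrative assumptions, not J&M's presentation verbatim.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** CKY recognition for a CFG in Chomsky normal form (illustrative sketch only). */
public class CkyRecognizer {
    /** A CNF rule: either lhs -> left right (binary) or lhs -> word (lexical). */
    private static class Rule {
        final String lhs, left, right;  // right == null marks a lexical rule; left then holds the word
        Rule(String lhs, String left, String right) { this.lhs = lhs; this.left = left; this.right = right; }
    }

    private final List<Rule> rules = new ArrayList<Rule>();

    public void addBinary(String lhs, String left, String right) { rules.add(new Rule(lhs, left, right)); }
    public void addLexical(String lhs, String word) { rules.add(new Rule(lhs, word, null)); }

    /** Returns true iff the grammar derives the sentence from startSymbol. */
    public boolean recognizes(String[] words, String startSymbol) {
        int n = words.length;
        @SuppressWarnings("unchecked")
        Set<String>[][] chart = new HashSet[n + 1][n + 1];  // chart[i][j]: nonterminals deriving words[i..j-1]
        for (int i = 0; i <= n; i++)
            for (int j = 0; j <= n; j++)
                chart[i][j] = new HashSet<String>();
        // Lexical rules fill the length-1 spans.
        for (int i = 0; i < n; i++)
            for (Rule r : rules)
                if (r.right == null && r.left.equals(words[i])) chart[i][i + 1].add(r.lhs);
        // Binary rules combine adjacent sub-spans, shortest spans first: the dynamic program
        // that avoids the repeated work of naive top-down parsing.
        for (int span = 2; span <= n; span++)
            for (int i = 0; i + span <= n; i++)
                for (int k = i + 1; k < i + span; k++)
                    for (Rule r : rules)
                        if (r.right != null && chart[i][k].contains(r.left)
                                && chart[k][i + span].contains(r.right))
                            chart[i][i + span].add(r.lhs);
        return chart[0][n].contains(startSymbol);
    }

    public static void main(String[] args) {
        CkyRecognizer g = new CkyRecognizer();
        g.addBinary("S", "NP", "VP");
        g.addLexical("NP", "dogs");
        g.addLexical("VP", "bark");
        System.out.println(g.recognizes(new String[] {"dogs", "bark"}, "S"));  // true
    }
}
```

Replacing the sets with per-nonterminal best scores (and backpointers) turns this recognizer into the Viterbi PCFG parser the lecture builds up to.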
Further reading on statistical parsing:
• Eugene Charniak. 1997. Statistical Techniques for Natural Language Parsing. AI Magazine.
• Eugene Charniak. 1997. Statistical Parsing with a Context-Free Grammar and Word Statistics. Proceedings of the Fourteenth National Conference on Artificial Intelligence. Menlo Park: AAAI Press/MIT Press.
• Eugene Charniak. 2000. A Maximum-Entropy-Inspired Parser. Proceedings of NAACL-2000.
• Dan Klein and Christopher D. Manning. 2002. A Generative Constituent-Context Model for Improved Grammar Induction. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 128-135.
• Dan Klein and Christopher D. Manning. 2002. Natural Language Grammar Induction Using a Constituent-Context Model. In Thomas G. Dietterich, Suzanna Becker, and Zoubin Ghahramani (eds.), Advances in Neural Information Processing Systems 14 (NIPS 2001). Cambridge, MA: MIT Press, vol. 1, pp. 35-42.
• Dan Klein and Christopher D. Manning. 2003. Factored A* Search for Models over Sequences and Trees. IJCAI 2003.
• Dan Klein and Christopher D. Manning. 2003. A* Parsing: Fast Exact Viterbi Parse Selection. HLT-NAACL 2003.
• Kristina Toutanova, Christopher D. Manning, Stuart M. Shieber, Dan Flickinger, and Stephan Oepen. 2002. Parse Disambiguation for a Rich HPSG Grammar. First Workshop on Treebanks and Linguistic Theories (TLT2002), pp. 253-263. Sozopol, Bulgaria.
• Kristina Toutanova, Christopher D. Manning, Dan Flickinger, and Stephan Oepen. 2005. Stochastic HPSG Parse Disambiguation Using the Redwoods Corpus. Research in Language and Computation.
• B. Taskar, D. Klein, M. Collins, D. Koller, and C. Manning. 2004. Max-Margin Parsing. Empirical Methods in Natural Language Processing (EMNLP 2004), Barcelona, Spain. Received best paper award.
• Eugene Charniak and Mark Johnson. 2005. Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005).
• Ryan McDonald, Koby Crammer, and Fernando Pereira. 2005. Online Large-Margin Training of Dependency Parsers. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005).