
The class takes place on Thursdays from 12:45 to 15:45 in room 1014.
The teaching language is to be decided with the students. The course material is in English.
Students will be allowed to take the exam in French or in English.
Computational linguistics employs mathematical models to represent
morphological, syntactic, and semantic structures in natural languages.
The course introduces several such models, emphasising their underlying
logical structure and algorithmics. These models are often related to
mathematical objects studied in other MPRI courses, for which this
course provides an original set of applications and problems.
The course is not a substitute for a full curriculum in computational
linguistics; rather, it aims to provide students with a rigorous formal
background in the spirit of MPRI. Most of the emphasis is put on the
symbolic treatment of words, sentences, and discourse. Several fields
within computational linguistics are not covered, most prominently
speech processing and pragmatics. Machine learning techniques are
treated only sparsely: we focus on the mathematical objects obtained
through statistical and corpus-based methods (i.e. weighted automata and
grammars) and the associated algorithms, rather than on automated
learning techniques (the subject of course 1.30).
We sketch here the planned contents for 2019–2020.
These contents are structured around three important subdomains of
linguistics (morphology, syntax, and semantics), presenting in each
case some of the related models and the corresponding algorithmic
issues. The exact dates and contents might change.
September 12th, 2019
General introduction. Language has structure. Language and inference. The importance of ambiguity. Language and the world.
Linguistics basics for computational linguistics. Statistical properties of words, constituency and dependency analyses, computing semantic denotations and semantic similarities.
Machine learning basics for computational linguistics. Coding discrete symbols as vectors (word embeddings), optimisation reminders.
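As a toy illustration of the word-embedding idea mentioned above (coding discrete symbols as vectors), here is a minimal sketch of cosine similarity between word vectors. The three-dimensional vectors are invented for the example; real embeddings have hundreds of dimensions and are learned from corpora.

```python
import math

def cosine(u, v):
    # Cosine similarity: dot product normalised by the vectors' lengths.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical 3-dimensional embeddings, invented for illustration.
emb = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.4],
}

print(cosine(emb["cat"], emb["dog"]))  # high: semantically close words
print(cosine(emb["cat"], emb["car"]))  # lower: unrelated words
```

Words with similar distributions end up with nearby vectors, so similarity of meaning can be approximated by similarity of vectors.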
September 19th, 2019
Modelling sequences. Presentation of typical problems involving sequence modelling.
Generative models: language models, hidden Markov models, PCFGs.
Discriminative models: conditional random fields.
Algorithms: Viterbi and approximate methods.
Deep-learning-based methods.
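To illustrate the Viterbi algorithm listed above, here is a minimal sketch of Viterbi decoding for a toy hidden Markov model; the states and all probabilities are invented for the example.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    # V[t][s] = probability of the best state sequence ending in state s
    # after emitting obs[:t+1]; back[t][s] remembers the best predecessor.
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            V[t][s] = prob
            back[t][s] = prev
    # Recover the best final state, then follow the backpointers.
    best = max(states, key=lambda s: V[-1][s])
    path = [best]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, back[t][path[0]])
    return path

# Hypothetical two-state POS-tagging HMM (numbers invented).
states = ["N", "V"]
start = {"N": 0.6, "V": 0.4}
trans = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit = {"N": {"fish": 0.7, "swim": 0.3}, "V": {"fish": 0.4, "swim": 0.6}}
print(viterbi(["fish", "swim"], states, start, trans, emit))  # → ['N', 'V']
```

The dynamic program runs in O(|obs| · |states|²) time, replacing the exponential enumeration of all state sequences.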
September 26th, 2019 Modelling syntax
Phrase structure grammar
Tree adjoining grammar
Dependency syntax
Categorial grammar
October 3rd, 2019 Parsing algorithms for natural language
CKY and Earley; introduction to weighted CKY and Earley
Shift-reduce and Eisner's algorithm for dependency syntax
CKY for tree adjoining grammar
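As a toy illustration of the CKY algorithm listed above, here is a minimal sketch of a CKY recognizer for a context-free grammar in Chomsky normal form; the grammar and lexicon are invented for the example.

```python
from itertools import product

def cky_recognise(words, lexical, binary, start="S"):
    # chart[i][j] = set of nonterminals deriving the span words[i:j].
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    # Length-1 spans: look the word up in the lexicon.
    for i, w in enumerate(words):
        chart[i][i + 1] = {A for A, ws in lexical.items() if w in ws}
    # Longer spans: combine two adjacent sub-spans with a binary rule A -> B C.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for B, C in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= binary.get((B, C), set())
    return start in chart[0][n]

# Hypothetical toy grammar in Chomsky normal form.
lexical = {"Det": {"the"}, "N": {"cat", "mouse"}, "V": {"chased"}}
binary = {("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}, ("NP", "VP"): {"S"}}

print(cky_recognise("the cat chased the mouse".split(), lexical, binary))  # → True
```

The chart-based dynamic program recognises a sentence of length n in O(n³ · |G|) time; the weighted variants covered in the lecture replace the sets with scores and the union with a max or sum.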
October 17th, 2019
October 24th, 2019
November 7th, 2019
November 14th, 2019 Discourse analysis: discourse representation theory, anaphora resolution, type-theoretic dynamic logic
November 21st, 2019




Readings (choose two blocks out of three):
Distributional and vector semantics:
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean (2013), Efficient Estimation of Word Representations in Vector Space, NIPS 2013 (link)
Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts (2013), Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, EMNLP 2013 (link)
Mildly context-sensitive languages:
Aravind Joshi (1985), How much context-sensitivity is required to provide reasonable structural descriptions? (link)
Alexander Clark (2015), An introduction to multiple context-free grammars for linguists (link)
Semantic parsing with distant supervision:
Luke S. Zettlemoyer and Michael Collins (2009), Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars (link)
Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang (2013), Semantic Parsing on Freebase from Question-Answer Pairs, EMNLP 2013 (link)




Lecture notes for the second half (roughly courses 5–8), last updated Nov. 30, 2011.


Slides from 2010 for the second part.

Basics of formal language theory (regular word languages, sequential functions, context-free languages, regular tree languages)
Elementary notions in logics
Some fluency in lambda calculus
1.18 Tree Automata and Applications: regular tree languages, monadic second-order logic on trees, potentially pushdown tree languages.
1.30 Machine Learning: as already mentioned, this course does not cover learning techniques.
1.24 Probabilistic Aspects of Computer Science: Markov chains.
2.16 Modélisation par automates finis (modelling with finite automata): rational relations and rational series.
Jean Berstel. Transductions and Context-Free Languages, Teubner Studienbücher: Informatik, Teubner, 1979. webpage
Jacques Sakarovitch. Elements of Automata Theory, Cambridge University Press, 2009. Translated from Éléments de théorie des automates, Vuibert Informatique, 2003.
Hubert Comon, Max Dauchet, Rémi Gilleron, Christof Löding, Florent Jacquemard, Denis Lugiez, Sophie Tison, and Marc Tommasi. Tree Automata Techniques and Applications, 2007. webpage
Daniel Jurafsky and James H. Martin. Speech and Language Processing, Prentice Hall Series in Artificial Intelligence, Prentice Hall, second edition, 2009.
Mitkov, ed. The Oxford handbook of computational linguistics, Oxford University Press, 2003.
Jackendoff, Ray. Foundations of language: brain, meaning, grammar evolution, Oxford University Press, 2002.
Bob Carpenter. Type-Logical Semantics, MIT Press, 1998.
Johan van Benthem and Alice ter Meulen, eds. Handbook of Logic and Language, Elsevier Science, 1997.
Patrick Blackburn and Johan Bos. Representation and Inference for Natural Language, A First Course in Computational Semantics, CSLI, 2005.
Christian Retoré. The Logic of Categorial Grammars: Lecture Notes. webpage
Shuly Wintner and Nissim Francez. Unification Grammars, Cambridge University Press, 2012.
