Parisian Master of Research in Computer Science
Master Parisien de Recherche en Informatique (MPRI)

Logical and Computational Structures for Linguistic Modeling

Structures Informatiques et Logiques pour la Modélisation Linguistique (24h, 3 ECTS). Taught in rotation by Éric Villemonte de la Clergerie (INRIA Rocquencourt), Philippe de Groote (INRIA Lorraine), and Sylvain Schmitz (ENS Cachan).

Year 2018–2019

Teaching Staff

This year, the first half of the course is taught by Sylvain Schmitz (ENS Cachan) and the second half by Philippe de Groote (INRIA Lorraine).

Schedule

Wednesdays at 12:45 in room 1003, Sophie Germain, starting on Wednesday, September 12th, 2018.

Language

To be decided with the students. The course material is in English. Students will be allowed to take the exam in French or in English.

Description

Computational linguistics employs mathematical models to represent morphological, syntactic, and semantic structures in natural languages. The course introduces several such models, with an emphasis on their underlying logical structure and algorithms. These models are often related to mathematical objects studied in other MPRI courses, for which this course provides an original set of applications and problems.

The course is not a substitute for a full curriculum in computational linguistics; rather, it aims to provide students with a rigorous formal background in the spirit of MPRI. Most of the emphasis is on the symbolic treatment of words, sentences, and discourse. Several fields within computational linguistics are not covered, most notably speech processing and pragmatics. Machine learning techniques are treated only sparsely: we focus on the mathematical objects obtained through statistical and corpus-based methods (i.e. weighted automata and grammars) and on the associated algorithms, rather than on automated learning techniques, which are the subject of course 1.30.

Tentative Outline

We sketch here the planned contents for 2018–2019. They are structured around three important subdomains of linguistics (morphology, syntax, and semantics), presenting for each some of the related models and the corresponding algorithmic issues. The exact dates and contents might change.

  1. September 12th, 2018
    • General Introduction
      the subdomains of linguistics, the issues of linguistic modelling, and the various computational approaches
    • Syntactic Modelling
      constituent and dependency analyses
    • Context-free Parsing
      parsing as intersection
  2. September 26th, 2018
    • Model-Theoretic Syntax
      monadic second-order logic and propositional dynamic logic on trees, automata characterisations
  3. October 3rd, 2018
    • Mildly Context-Sensitive Syntax
      tree-adjoining grammars, well-nested MCSLs
  4. October 10th, 2018
    • Probabilistic Syntax
      probabilistic CFGs, learning, probabilistic parsing
  5. October 17th, 2018
    • Semantic Representations
      modal logics, higher-order logics
  6. October 24th, 2018
    • Syntax/Semantics Interface
      compositionality, higher-order syntax, abstract categorial grammars
  7. November 7th, 2018
    • Montague Semantics
      model-theoretic semantics, intensionality
  8. November 14th, 2018
    • Discourse Analysis
      discourse representation theory, anaphora resolution, type-theoretic dynamic logic
  9. November 21st, 2018
    • exam

Course Material

2018–2019
2017–2018
2016–2017
2015–2016
2014–2015
2013–2014
2012–2013
2011–2012
Older Material

To Know More

Prerequisites

  • Basics in formal language theory (regular word languages, sequential functions, context-free languages, regular tree languages)
  • Elementary notions of logic
  • Some familiarity with the lambda calculus

Related Courses

  • 1.18 Tree Automata and Applications: regular tree languages, monadic second-order logic on trees, and potentially pushdown tree languages
  • 1.30 Machine Learning: as already mentioned, this course does not cover learning techniques
  • 1.24 Probabilistic Aspects of Computer Science: Markov chains
  • 2.16 Modélisation par automates finis (Finite Automata Modelling): rational relations and rational series

References

  • Jean Berstel. Transductions and Context-Free Languages, Teubner Studienbücher: Informatik, Teubner, 1979.
  • Jacques Sakarovitch. Elements of Automata Theory, Cambridge University Press, 2009. Translated from Éléments de théorie des automates, Vuibert Informatique, 2003.
  • Hubert Comon, Max Dauchet, Rémi Gilleron, Christof Löding, Florent Jacquemard, Denis Lugiez, Sophie Tison, and Marc Tommasi. Tree Automata Techniques and Applications, 2007.
  • Daniel Jurafsky and James H. Martin. Speech and Language Processing, Prentice Hall Series in Artificial Intelligence, Prentice Hall, second edition, 2009.
  • Ruslan Mitkov, ed. The Oxford Handbook of Computational Linguistics, Oxford University Press, 2003.
  • Ray Jackendoff. Foundations of Language: Brain, Meaning, Grammar, Evolution, Oxford University Press, 2002.
  • Bob Carpenter. Type-Logical Semantics, MIT Press, 1998.
  • Johan van Benthem and Alice ter Meulen, eds. Handbook of Logic and Language, Elsevier Science, 1997.
  • Patrick Blackburn and Johan Bos. Representation and Inference for Natural Language: A First Course in Computational Semantics, CSLI, 2005.
  • Christian Retoré. The Logic of Categorial Grammars: Lecture Notes.
  • Shuly Wintner and Nissim Francez. Unification Grammars, Cambridge University Press, 2012.
 