Parisian Master of Research in Computer Science
Master Parisien de Recherche en Informatique (MPRI)

Querying Data: Foundations and Practice (24H, 3 ECTS)

Coordinator : Leonid Libkin

Context

Database theory has undergone several important transformations which are rarely reflected in courses, while having already found applications in database practice. In the 1980s and 90s the backbone of database theory was finite model theory, which has now been developed to the point of a collection of off-the-shelf tools that can be applied to analyze complexity and expressiveness of query languages. Latter times brought in semi-structured data and with it techniques based on word and tree automata. The last 10-15 years have brought us three new directions:

1. At a purely theoretical level, a uniform treatment of several different areas such as relational joins and conjunctive queries and incomplete data via a rich theory of graph homomorphisms.

2. At a more practical level, the theory of worst-case optimal join algorithms, that made it from theory papers to mainstream systems in less than a decade, and led to rethinking of some principles on which SQL databases are built.

3. At even more practical level, the emergence of graph databases as a key new industry, both native ones and represented via relational, all requiring a different approach to language design.

Outline

Below is a preliminary breakdown into lectures, with 8 lectures of 3 hours each. The FIRST DAY OF CLASSES IS 20 DECEMBER. The class on the 31st of January will be online as the instructor will be away at a conference.

  1. http://libk.in/MPRI/Lecture1.pdf : Foundations of relational databases: First-order logic, Relational algebra, Codd's theorem, Static analysis (Trakhtenbrot’s theorem).
  2. http://libk.in/MPRI/Lecture2.pdf : Joins and conjunctive queries; graph homomorphisms; evaluation, static analysis, minimization and graph cores.
  3. http://libk.in/MPRI/Lecture3.pdf : Fast conjunctive query evaluation: AGM bound and worst-case optimal joins.
  4. http://libk.in/MPRI/Lecture4.pdf : Expressiveness and complexity of query languages: locality, zero-one law, basics of descriptive complexity.
  5. Lecture 5. SQL programming, as seen through foundational lens.
  6. Lecture 6. Incomplete information: theory and real life.
  7. Lecture 7. Graph querying: theory (RPQ and similar), practice (Cypher/GQL and SQL/PGQ).
  8. Lecture 8. Relational graph querying (Rel).

Documents

The main source is Database Theory: Querying Data by Marcelo Arenas, Pablo Barceló, Leonid Libkin, Wim Martens, and Andreas Pieris.

The book is quasi-finished and freely available at https://github.com/pdm-book/community.

This will be supplemented by

  • Elements of Finite Model Theory by L. Libkin (freely available to students in class [https://homepages.inf.ed.ac.uk/libkin/fmt/fmt.pdf here])
  • Graphs and Homomorphisms by P. Hell and J. Nešetril
  • SQL/Cypher/Rel freely available documentation and guides

Evaluation

Research-oriented take-home exams. Please see this document (http://libk.in/MPRI/list_of_papers_mpri.pdf) for detailed instructions.

Prerequisites

Technically speaking, just good general CS background, although this being a Master-level database class covering new topics, it would be helpful to have some basic understanding of the classics, e.g. first-order logic, relational algebra, some knowledge of SQL.

Related courses

Bibliography

See Documents above

Teaching team

 
Universités partenaires Université Paris-Diderot
Université Paris-Saclay
ENS Cachan École polytechnique Télécom ParisTech
ENS
Établissements associés Université Pierre-et-Marie-Curie CNRS INRIA CEA