Parisian Master of Research in Computer Science
Master Parisien de Recherche en Informatique (MPRI)

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

cours:c-1-32 [2019/02/05 15:32]
gambette [Short description (year 2018-2019)]
cours:c-1-32 [2019/09/19 18:06] (current)
gambette [Evaluation / exams (year 2019-2020)]
Line 7: Line 7:
 The lecturers are researchers in the [[http://igm.univ-mlv.fr/AlgoB/|Algorithmics for Bioinformatics]] group at LIGM (Univ. Paris-Est Marne-la-Vallée). The lecturers are researchers in the [[http://igm.univ-mlv.fr/AlgoB/|Algorithmics for Bioinformatics]] group at LIGM (Univ. Paris-Est Marne-la-Vallée).
  
-For year 2018-2019, the lecturers will be [[http://igm.univ-mlv.fr/~bulteau/|Laurent Bulteau]] and [[http://igm.univ-mlv.fr/~mweller/|Mathias Weller]].+For year 2019-2020, the lecturers will be [[https://www-igm.univ-mlv.fr/~koutcher/|Gregory Kucherov]] and [[http://igm.univ-mlv.fr/~gambette|Philippe Gambette]].
  
 ===== Practical information ===== ===== Practical information =====
Line 17: Line 17:
 ==== Location ==== ==== Location ====
  
-The lectures will take place at ENS Paris-Saclay (Cachan), in room C321, Cournot building (RER B Bagneux, see on [[https://www.google.fr/maps/dir/Centre+de+conduite+du+RER,+218+Avenue+Aristide+Briand,+92220+Bagneux/48.7896059,2.3270173/@48.7905692,2.3260014,18z/data=!4m10!4m9!1m5!1m1!1s0x47e67121112099b7:0xa30087aac7d14766!2m2!1d2.320376!2d48.792954!1m0!3e2!5i1|Google Maps]]).+The lectures will take place at ENS Paris-Saclay (Cachan), in room C505, Cournot building (RER B Bagneux, see on [[https://www.google.fr/maps/dir/Centre+de+conduite+du+RER,+218+Avenue+Aristide+Briand,+92220+Bagneux/48.7896059,2.3270173/@48.7905692,2.3260014,18z/data=!4m10!4m9!1m5!1m1!1s0x47e67121112099b7:0xa30087aac7d14766!2m2!1d2.320376!2d48.792954!1m0!3e2!5i1|Google Maps]]).
  
 ===== Motivations and main objectives ===== ===== Motivations and main objectives =====
  
-The objective of this course is to study the algorithmic approaches used in bioinformatics. The topics covered in 2017-2018 included genomic sequences (text algorithms, data structures) and phylogeny (reconstruction algorithms, combinatorics on trees).+The objective of this course is to study the algorithmic approaches used in bioinformatics. The topics covered in 2019-2020 include genomic sequences (text algorithms, data structures) and phylogeny (reconstruction algorithms, combinatorics on trees).
  
-The topics covered in 2018-2019 include scaffolding, reversal and transposition distances, scaffold filling and phylogenetics.+The topics covered in 2018-2019 included scaffolding, reversal and transposition distances, scaffold filling and phylogenetics.
  
-===== Short description (year 2018-2019) ===== +===== Short description (year 2019-2020) =====
-A total of 26 hours of lectures and exercise sessions, [[http://www.dptinfo.ens-cachan.fr/M1/edts1-1819|from September 17th, 2018 to November 5th, 2018]]: +
-  * Mathias Weller:  +
-    * [[http://igm.univ-mlv.fr/~mweller/bioinfo2019/lec1_basics.pdf|Lecture 1]]: molecular biology basics, sequencing, assembly, scaffolding +
-    * [[http://igm.univ-mlv.fr/~mweller/bioinfo2019/lec2_phylo.pdf|Lecture 2]]: phylogenetics, distance methods, maximum parsimony, maximum likelihood, supertrees, reconciliation, distances between trees, split networks, phylogenetic networks, tree containment +
-  * Laurent Bulteau : +
-    * [[http://igm.univ-mlv.fr/~bulteau/teaching/ens/slides1.pdf|lecture 1]] +
-    * [[http://igm.univ-mlv.fr/~bulteau/teaching/ens/slides2.pdf|lecture 2]] +
-    * [[http://igm.univ-mlv.fr/~bulteau/teaching/ens/td1.pdf|tutorial 1]] +
-    * [[http://igm.univ-mlv.fr/~bulteau/teaching/ens/td2.pdf|tutorial 2]] +
-    * [[http://igm.univ-mlv.fr/~bulteau/teaching/ens/td3.pdf|tutorial 3]] +
-  * [[http://igm.univ-mlv.fr/~mweller/bioinfo2019/project.txt|coding project]], [[http://igm.univ-mlv.fr/~mweller/bioinfo2019/generator.py|generator]]+
  
-===== Short description (archive of year 2017-2018) ===== +A total of 26 hours of lectures and exercise sessions, [[http://www.dptinfo.ens-cachan.fr/M1/edts1-1920|from September 16th to November 4th2019]]: {{ :cours:upload:c1-32-course1.png?direct&100|}}
- +
-A total of 26 hours of lectures and exercise sessions, [[http://www.dptinfo.ens-cachan.fr/M1/edts1-1718|from September 11th, 2017 to November 6th2017]]: {{ :cours:upload:c1-32-course1.png?direct&100|}}+
   * Sequence processing for bioinformatics (Gregory Kucherov)    * Sequence processing for bioinformatics (Gregory Kucherov) 
-    * Lecture 1: +    * Lecture 1: Biological background and examples of bioinformatic problems  (Gregory Kucherov: evolution of methods of biosequence analysis 
-      * 1Molecular biology primer for computer scientists. Types of sequence data. Genome sequencing. +        * {{ :cours:upload:c-1-32-kucherov-lecture1.pdf|slides}} 
-        * [[http://www-igm.univ-mlv.fr/~koutcher/lectures/lecture1-1.pdf|slides]] +    * Lecture 2: Sequence comparison (Gregory Kucherov: 
-      * 2) Sequence alignment. Global alignment, Needleman-Wunsch algorithm. Hirschberg algorithm. Local alignment, Smith-Waterman algorithm.  +        * dynamic programming for sequence alignmentNeedleman-Wunsch, Smith-Waterman & Hirschberg algorithms.  
-        * [[http://www-igm.univ-mlv.fr/~koutcher/lectures/lecture1-2.pdf|slides]] +        * heuristic methodsseed-based approaches, BLAST-like algorithmsSpaced seeds
-    * Lecture 2:  +        Sketching approach to sequence comparisonMinHash 
-      * 1Hidden Markov models and their applications in bioinformatics. Viterbi algorithm. +    * Lecture 3: Data structures for bioinformatics (Gregory Kucherov) 
-        * [[http://www-igm.univ-mlv.fr/~koutcher/lectures/lecture2-1.pdf|slides]] +        suffix tree, suffix array 
-      * 2) String matching and read mapping.  +        * compact data structures, Burrows-Wheeler transform, rank/select functionsFM-index 
-        * [[http://www-igm.univ-mlv.fr/~koutcher/lectures/lecture2-2-1.pdf|slides]] +        * Bloom filters
-      3) Filtering. Seed-and-extend approach to approximate string matching. Seed design. +
-        * [[http://www-igm.univ-mlv.fr/~koutcher/lectures/lecture2-2-2.pdf|slides]] +
-    * Lecture 3: +
-      1) Suffix tree. Suffix-tree-like indexes: DAWG, position heap. Suffix arraysenhanced suffix arrays. +
-        * [[http://www-igm.univ-mlv.fr/~koutcher/lectures/lecture3.pdf|slides]] +
-     * Lecture 4: +
-      * 1) Compact data structuresrank-select functions. BWT index (FM-index). Bi-directional BWT index. +
-        * [[http://www-igm.univ-mlv.fr/~koutcher/lectures/lecture4-1.pdf|slides]] +
-      * 2) Alignment-free sequence comparison. Locality-sensitive hashing. MinHash. Bloom filters+
-        * [[http://www-igm.univ-mlv.fr/~koutcher/lectures/lecture4-2.pdf|slides]]+
   * Phylogeny (Philippe Gambette) {{ :cours:upload:c1-32-course5.png?direct&100|}}   * Phylogeny (Philippe Gambette) {{ :cours:upload:c1-32-course5.png?direct&100|}}
     * Lecture 1:     * Lecture 1:
       * 1) Tree combinatorics: compatible characters, clusters, splits       * 1) Tree combinatorics: compatible characters, clusters, splits
       * 2) Phylogenetic reconstruction: distance methods       * 2) Phylogenetic reconstruction: distance methods
-      * {{:cours:upload:c-1-32-gambette-mpri2017coursphylogenieprint.pdf|Slides 1 to 11 of this file}} 
-      * {{:cours:upload:c1-32-gambette-mpri-exo-nj.pdf|Neighbor-Joining exercise}} 
     * Lecture 2:     * Lecture 2:
       * 1) Phylogenetic reconstruction : from clusters (Gusfield's algorithm)       * 1) Phylogenetic reconstruction : from clusters (Gusfield's algorithm)
       * 2) Phylogenetic reconstruction: maximum parsimony, maximum likelihood ; rooting       * 2) Phylogenetic reconstruction: maximum parsimony, maximum likelihood ; rooting
       * 3) Tree comparison: maximum agreement subtree, tree distances (Robinson-Foulds, quartet and triplet distance, NNI, TBR, SPR).       * 3) Tree comparison: maximum agreement subtree, tree distances (Robinson-Foulds, quartet and triplet distance, NNI, TBR, SPR).
-      * {{:cours:upload:c-1-32-gambette-mpri2017cours2phylogenie.pdf|}} 
     * Lecture 3:     * Lecture 3:
       * 1) Generation models of phylogenetic trees       * 1) Generation models of phylogenetic trees
       * 2) Efficient lowest common ancestor queries       * 2) Efficient lowest common ancestor queries
       * 3) Introduction to phylgenetic networks: hybridization number and relationship with SPR distance       * 3) Introduction to phylgenetic networks: hybridization number and relationship with SPR distance
-      * {{:cours:upload:c-1-32-gambette-mpri2017cours3phylogenie.pdf|}} 
     * Lecture 4: phylogenetic networks (+ written exam)     * Lecture 4: phylogenetic networks (+ written exam)
       * 1) Combinatorial properties       * 1) Combinatorial properties
Line 84: Line 57:
  
  
-===== Evaluation / exams (2018-2019) =====+===== Short description (archive of year 2018-2019) ===== 
 +A total of 26 hours of lectures and exercise sessions, [[http://www.dptinfo.ens-cachan.fr/M1/edts1-1819|from September 17th, 2018 to November 5th, 2018]]: 
 +  * Mathias Weller:  
 +    * [[http://igm.univ-mlv.fr/~mweller/bioinfo2019/lec1_basics.pdf|Lecture 1]]: molecular biology basics, sequencing, assembly, scaffolding 
 +    * [[http://igm.univ-mlv.fr/~mweller/bioinfo2019/lec2_phylo.pdf|Lecture 2]]: phylogenetics, distance methods, maximum parsimony, maximum likelihood, supertrees, reconciliation, distances between trees, split networks, phylogenetic networks, tree containment 
 +  * Laurent Bulteau : 
 +    * [[http://igm.univ-mlv.fr/~bulteau/teaching/ens/slides1.pdf|lecture 1]] 
 +    * [[http://igm.univ-mlv.fr/~bulteau/teaching/ens/slides2.pdf|lecture 2]] 
 +    * [[http://igm.univ-mlv.fr/~bulteau/teaching/ens/td1.pdf|tutorial 1]] 
 +    * [[http://igm.univ-mlv.fr/~bulteau/teaching/ens/td2.pdf|tutorial 2]] 
 +    * [[http://igm.univ-mlv.fr/~bulteau/teaching/ens/td3.pdf|tutorial 3]] 
 +  * [[http://igm.univ-mlv.fr/~mweller/bioinfo2019/project.txt|coding project]], [[http://igm.univ-mlv.fr/~mweller/bioinfo2019/generator.py|generator]]
  
-To be announced+===== Evaluation / exams (year 2019-2020) =====
  
-===== Evaluation / exams (archive for year 2017-2018=====+The evaluation of the students will consist in a written exam (coefficient 2), as well as coding some of the algorithms studied during class (optional, coefficient 1).
  
-The evaluation of the students will consist in a written exam (coefficient 3)as well as coding some of the algorithms studied during class (coefficient 1).+The written exam will be 2 hour longon November 4th. The exercises will be given in English, and you can write your answers in English or in French. You'll be authorized to have your lecture notes with you (but no computer, cellphone, etc.)
  
-The written exam will be 2 hour longon November 6thThe exercises will be given in English, and you can write your answers in English or in French. You'll be authorized to have your lecture notes with you (but no computer, cellphone, etc.)+Students will also be in charge of coding one algorithms solving an algorithmic problem in phylogeneticsin PythonFurther indications will be given here on how to send the source code on a dedicated server to run them and compare the obtained results with the expected results.
  
-Students will also be in charge of coding two algorithms, each one solving a problem given for each part of this course, in the programming language they choose. Further indications will be given here on how to send the source code on a dedicated server to run them and compare the obtained results with the expected results. +The deadline for this programming task will be November 4th, 23:59, Paris time zone. The deadline may be extended only by request before the initial deadline at philippe.gambette@u-pem.fr.
- +
-The deadline for this programming task will be November 6th, 23:59, Paris time zone. The deadline may be extended only by request before the initial deadline at philippe.gambette@u-pem.fr.+
  
 Questions on the coding task can be sent to philippe.gambette@u-pem.fr. Questions on the coding task can be sent to philippe.gambette@u-pem.fr.
Line 103: Line 85:
 ===== Schedule ===== ===== Schedule =====
  
-  * 2017-09-1714:00-18:15Mathias Weller +  * 2019/09/168h30-12h45 Gregory Kucherov 
-  * 2017-09-2414:00-18:15Mathias Weller +  * 2019/09/238h30-12h45 Gregory Kucherov 
-  * 2017-10-0114:00-18:15Mathias Weller +  * 2019/09/308h30-12h45 Philippe Gambette 
-  * 2017-10-0814:00-18:15: Laurent Bulteau +  * 2019/10/078h30-12h45 Philippe Gambette 
-  * 2017-10-1514:00-18:15: Laurent Bulteau +  * 2019/10/148h30-12h45 Philippe Gambette 
-  * 2017-10-2214:00-18:15Laurent Bulteau +  * 2019/10/218h30-12h45 Gregory Kucherov 
-  * 2017-11-0514:00-18:15: Laurent Bulteau (including written exam)+  * 2019/11/048h30-12h45 Philippe Gambette + exam 
  
 ===== Prerequisites ===== ===== Prerequisites =====
Line 125: Line 108:
 ===== Bibliography ===== ===== Bibliography =====
  
-For 2018-2019: to be announced +For 2018-2019:
- +
-For 2017-2018:+
   * Lectures of Gregory Kucherov:   * Lectures of Gregory Kucherov:
     * Donald Adjeroh, Timothy Bell, Amar Mukherjee, [[http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-0-387-78908-8|The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching]], Springer, 2008.     * Donald Adjeroh, Timothy Bell, Amar Mukherjee, [[http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-0-387-78908-8|The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching]], Springer, 2008.
 
Universités partenaires Université Paris-Diderot
Université Paris-Saclay
ENS Cachan École polytechnique Télécom ParisTech
ENS
Établissements associés Université Pierre-et-Marie-Curie CNRS INRIA CEA