Research Interests

Integration of big and heterogeneous biological and biomedical data

Provenance in scientific workflows, FAIR workflows and data

Querying and ranking biological & biomedical data

Workflow/dataflow design, software engineering, user requirements

Semantic Web, metadata, quality

Current Research Projects on Ranking Bio data

Rank aggregation of complete datasets (following our Mastodons projects, 2016-2017-2018).
  • Topic: Ranking biological data using rank aggregation techniques.
  • Collaborators: Laboratoire d'Informatique de Marne-la-Vallée, IFB (Institut Français de Bioinformatique, Gif-sur-Yvette), APHP Paul-Brousse, APHP Georges Pompidou
  • Publications: e-Science [53]
  • Tools involved: ConQur-Bio [39], CorankCo
  • Master or PhD student involved: Pierre Andrieu
  • Previous related projects or past PhDs involved: RankaBio (PEPS Fascido call, 2015), PhD of Bryan Brancotte [42], [43]

Current Research Projects on Scientific Workflows

R2-P2: Reusable and Reproducible Scientific Protocols
CNRS PRIME Research project
  • Topic: Making protocols and scientific workflows FAIR, reusable and reproducible 
  • Partners: Institut du Thorax (Nantes), Institut Pasteur (C3BI, Paris), Université Paris-Dauphine (LAMSADE) 
  • Works on FAIR workflows in collaboration with C. Goble (The University of Manchester), D. Garijo (University of Southern California) 
  • PhD student involved: Marine Djaffarjy (CNRS PRIME)
  • See the full webpage of the project!
Analysing plant phenotyping data with scientific workflows
  • Topic: Designing and executing scientific workflows to analyse large, highly complex plant datasets.
  • Collaborators: Inria VirtualPlants (in particular, Christophe Pradal) and Zenith (in particular, Patrick Valduriez) groups at the Institute of Computational Biology, Montpellier; INRA Montpellier (in particular, Pascal Neuveu).
  • Tools involved: OpenAlea (developed by Christophe Pradal et al.) [45], InfraPhenoGrid (developed by Christophe Pradal et al.) [47]
  • Previous related projects: Junior project IBC grant 
Next Generation
  • Topic: With 50,000 data analyses per month and more than 1,500 citations (Google Scholar), the phylogenetic analysis pipeline is one of the most visible French IT resources at both the national and international levels. It is now used for teaching, which can bring hundreds of users at the same time, and in batch mode, which leads to large numbers of requests being submitted to the same server. In this project, we thus plan to increase the robustness of the pipeline. The originality of the new version lies in a scientific workflow environment (Galaxy) coupled with a web interface allowing visualization of and interaction with phylogenetic objects. More precisely, this project will provide (i) a large set of phylogenetic analysis bricks, each giving access to diverse programs, all encapsulated in Galaxy so that the system can handle large groups of users and/or large datasets; (ii) a set of optimized, robust and expressive workflows extending the basic phylogenetic workflow to varied and rich contexts of phylogenetic analysis; (iii) an easy-to-install environment equipped with a new visualization layer, built on top of Galaxy and dedicated to phylogenetic analyses.
  • Publication in the NAR Journal
  • Groups involved: Institut Pasteur, LIRMM, LRI, IGS
  • Previous related projects: DistillFlow, refactoring scientific workflows [40], [35], [37].