Maëva Caillat

Ph.D. thesis of Maëva Caillat (2025-2028)

Meta-analysis and aggregation of heterogeneous data to consolidate prediction models: application in food microbiology and food safety (Supervision: Jeanne-Marie Membré and Louis Delaunay)


Producers, processors, and distributors in the agri-food sector must guarantee the safety and quality of the food products they place on the market. To do so, they can rely on mathematical predictions based on microbiological knowledge and data from the literature. The GIS Sym'Previus provides food microbiology prediction tools for these stakeholders (https://symprevius.eu/fr/). These tools can also facilitate the development of new recipes, help identify microbiological hazards in new products, and qualify the use of new ingredients. These needs are growing in the context of the ecological transition, which encompasses industrial, food, and energy transitions.
In this context, it is important to diversify and consolidate the information used to simulate microbial behavior in order to ensure food safety and quality. Few tools are available to stakeholders in the agri-food sector, and those that exist require continuous updating. The literature contains a large amount of usable data. However, these data are difficult to exploit directly: they come from different products, rely on different biological and statistical strategies or methodologies, and are of different types because the behaviors studied are diverse (growth, inactivation).

The models used in predictive microbiology must be continuously enriched with data from the scientific literature or the gray literature (e.g., EFSA, FDA).
However, the responses studied vary widely (growth rate, lag time, growth/no-growth). They are also subject to variability due to experimental conditions (strains, species, media, laboratory reproduction of the process or formulation, etc.) and to uncertainty (measurement error, more or less optimized experimental design, etc.). A robust and rigorous statistical methodology is therefore needed to continuously enrich existing predictive microbiology tools while overcoming these potential “biases.”

The work will be divided into three phases:

  1. Develop statistical methods for aggregating different types of data.
    This involves collecting data in a semi-automated manner (data mining), particularly data presented in plots; taking into account new environmental factors that explain the growth or inactivation of microorganisms; estimating and accounting for experimental variability (species, strains, serovars, media, etc.); and applying a confidence index based on data quality.
  2. Develop quantitative approaches (models) to estimate predictive microbiology parameters. To enrich the database produced by the GIS Sym'Previus, the data collected in phase 1 will need to be integrated and aggregated with existing data. This modeling stage (with potential extrapolation) will use probabilistic models to integrate the biological variability of microorganisms, the physicochemical variability of foods, and the uncertainty of predictions. Simulations of the behavior of different serovars, strains, and species can then be carried out on a wider variety of foods.
  3. Explore different approaches based on machine learning or artificial intelligence.
    In addition to these statistical methods, we will explore computational approaches such as machine learning and artificial intelligence to assess whether they can provide additional information or partially replace the modeling/simulation tools currently used in the Sym'Previus software.
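
As an illustration of the kind of aggregation described in phases 1 and 2, the sketch below pools one parameter reported by several studies using a DerSimonian-Laird random-effects model, a standard meta-analysis estimator. It is not presented as the method retained for the thesis, which this description does not specify. The function name `pool_random_effects` and all input values are hypothetical. In this sketch, the within-study variances stand for uncertainty (e.g., measurement error) and the estimated between-study variance `tau2` stands for variability across strains, media, or laboratories.

```python
import math


def pool_random_effects(estimates, variances):
    """Pool study-level estimates with a DerSimonian-Laird random-effects model.

    estimates: one parameter estimate per study (e.g., a log growth rate)
    variances: the within-study variance of each estimate (uncertainty)
    Returns (pooled estimate, its standard error, between-study variance tau2).
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies and matching lengths")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be strictly positive")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))

    # Method-of-moments estimate of the between-study variance, floored at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return pooled, se, tau2


if __name__ == "__main__":
    # Hypothetical values for illustration only, not real study data.
    log_rates = [-0.40, -0.10, -0.65, -0.30]
    within_var = [0.010, 0.020, 0.015, 0.030]
    pooled, se, tau2 = pool_random_effects(log_rates, within_var)
    print(f"pooled = {pooled:.3f}, SE = {se:.3f}, tau2 = {tau2:.3f}")
```

A confidence index based on data quality, as mentioned in phase 1, could for instance enter such a scheme as an extra multiplier on the study weights; that choice is an assumption here and would need to be justified statistically.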