R4R: Reproducibility for R

Ensuring reproducibility is a fundamental challenge in computational research. Reproducing results often requires reconstructing complex software environments involving data files, external tools, system libraries, and language-specific packages. While various tools aim to simplify this process, they often rely on user-provided metadata, overlook system dependencies, or produce unnecessarily large environments.

R Markdown notebooks are a popular format for data analysis. In this project, we focus on the reproducibility of computational notebooks in R. We aim to automate the creation of minimal, user-inspectable, self-contained, distributable execution environments through dynamic program analysis techniques.

We created a tool, r4r, that captures all runtime dependencies of a data analysis pipeline and produces a Docker image capable of reproducing the original execution. Although designed with first-class support for the R programming language, r4r also includes a generic fallback mechanism applicable to other languages.
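To make the idea concrete, here is a minimal sketch of how a list of files observed during one run could be turned into a Dockerfile. It is not r4r's implementation: the function `render_dockerfile`, the `root` build-context directory, the base image, and the entry point are all hypothetical, and the sketch simply copies every traced file to the same path in the image rather than resolving packages.

```python
from pathlib import PurePosixPath


def render_dockerfile(accessed_paths,
                      base_image="debian:stable-slim",
                      entrypoint=("Rscript", "analysis.R")):
    """Return Dockerfile text that copies each traced file to the same path.

    Hypothetical helper for illustration only; `accessed_paths` stands in for
    the absolute paths a tracer would report for one execution.
    """
    # Deduplicate and sort so the generated file is deterministic.
    paths = sorted({str(PurePosixPath(p)) for p in accessed_paths})

    lines = [f"FROM {base_image}"]
    for path in paths:
        # Assumption: the build context holds a copy of the host file system
        # under ./root, so /a/b on the host is found at root/a/b.
        lines.append(f"COPY root{path} {path}")

    # Exec-form CMD so the arguments are passed without a shell.
    lines.append("CMD [" + ", ".join(f'"{arg}"' for arg in entrypoint) + "]")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    traced = [
        "/usr/lib/R/bin/Rscript",
        "/home/u/analysis.R",
        "/home/u/analysis.R",  # duplicates from repeated accesses are dropped
    ]
    print(render_dockerfile(traced), end="")
```

Copying files verbatim keeps the sketch language-agnostic, but it tends to produce larger images than installing the same content from system or R packages would.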

This project is funded by the ERC PoC grant 101081989 and runs from January 2024 to June 2025.

Pierre Donat-Bouillud
Researcher

My research interests include programming languages, fuzzing, and testing.
