Applying MLOps to overcome reproducibility barriers in machine learning research

  • Topics: machine learning, MLOps, reproducibility
  • Skills: Python, machine learning, GitOps, systems, Linux, data, Docker
  • Difficulty: Hard
  • Size: Large (350 hours)
  • Mentors: Fraida Fund and Mohamed Saeed

Project Idea Description

Reproducibility remains a significant problem in machine learning research, both in core ML and in the application of ML to other areas of science. In many cases, due to inadequate experiment tracking, dependency capturing, source code versioning, data versioning, and artifact sharing, even the authors of a paper may find it challenging to reproduce their own study several years later. This makes it difficult to validate and build on previous work, and raises concerns about its trustworthiness.

In contrast, outside of academic research, MLOps tools and frameworks have been identified as a key enabler of reliable, reproducible, and trustworthy machine learning systems in production. A good reference on this topic is:

Firas Bayram and Bestoun S. Ahmed. 2025. Towards Trustworthy Machine Learning in Production: An Overview of the Robustness in MLOps Approach. ACM Comput. Surv. 57, 5, Article 121 (May 2025), 35 pages. https://doi.org/10.1145/3708497

This project seeks to bridge the gap between widely adopted practices in industry and academic research:

  • by making it easier for researchers and scientists to use MLOps tools to support reproducibility. To achieve this, we will develop starter templates and recipes for research in computer vision, NLP, and ML for science, that have reproducibility “baked in” thanks to the integration of MLOps tools and frameworks. Researchers will launch these templates on open access research facilities like Chameleon.
  • and by developing complementary education and training materials to emphasize the importance of reproducibility in ML, and how the tools and frameworks used in the starter templates can support this goal.
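
To make the idea of reproducibility being “baked in” more concrete, here is a minimal sketch of the kind of run record a starter template might write alongside every experiment. It is an illustration only, not part of the project: the function names (file_sha256, installed_packages, build_run_record, run_experiment), the record fields, and the toy "experiment" are all hypothetical, and it uses only the Python standard library. A real template would likely delegate this work to dedicated MLOps tools and would also record the source code version.

```python
import hashlib
import json
import platform
import random
import sys
from importlib import metadata


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks (data versioning)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def installed_packages():
    """List installed packages as name==version pins (dependency capture)."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins)


def build_run_record(seed, params, data_path=None):
    """Collect facts needed to re-run an experiment later (hypothetical fields)."""
    record = {
        "seed": seed,
        "params": params,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": installed_packages(),
    }
    if data_path is not None:
        record["data_sha256"] = file_sha256(data_path)
    return record


def run_experiment(seed):
    """Hypothetical stand-in for a training run: seeded, so it is repeatable."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(3)]


if __name__ == "__main__":
    record = build_run_record(seed=42, params={"learning_rate": 0.01})
    record["result"] = run_experiment(record["seed"])
    # Saving the record next to the results lets the run be audited later.
    print(json.dumps(record, indent=2)[:400])
```

Because the seed, parameters, environment, and data checksum are stored together with the result, someone revisiting the work later can check whether they are running the same experiment on the same inputs.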

Writing a successful proposal for this project

A good proposal for this project should:

  • demonstrate a good understanding of the current barriers to reproducibility in machine learning research (specific examples are welcome),
  • describe a “base” starter template, including the platforms and tools that will be integrated, as well as specific adaptations of this template for computer vision, NLP, and ML for science,
  • explain the “user flow”: how a researcher would use the template to conduct an experiment or series of experiments, what the lifecycle of that experiment would look like, and how it would be made reproducible,
  • include the contributor’s own ideas about how to make the starter templates more usable, and how to make the education and training materials relatable and useful,
  • and show that the contributor has the necessary technical background and soft skills to contribute to this project. In particular, the contributor will need to create education and training materials that are written in a clear, straightforward, and concise manner, without unnecessary jargon. The proposal should show evidence of the contributor’s writing abilities.

GitHub link

There is no pre-existing Git repository for this project. At the beginning of the summer, the contributor will create a new repository in the Teaching on Testbeds organization, and the project materials will “live” there.

Fraida Fund
Research Assistant Professor, NYU Tandon School of Engineering

Fraida Fund is interested in using open access experimental platforms like FABRIC, Chameleon, and CloudLab to support reproducibility.