Reproducibility of Interactive Notebooks in Distributed Environments

Hello! I am Raza, currently a Ph.D. student in Computer Science at DePaul University. This summer, I will be working on the reproducibility of notebooks in distributed environments, mentored by Prof. Tanu Malik. Here is a summary of my project proposal.

Interactive notebooks are web-based systems that encapsulate code, data, and their outputs for sharing and reproducibility. They have gained wide popularity in scientific computing due to their ease of use and portability. However, reproducing notebooks in different target environments remains challenging because notebooks do not carry the computational environment in which they were executed. This becomes even more challenging in distributed cluster environments, where a notebook must be prepared to run on multiple nodes. In this project, we plan to (i) extend FLINC, an open-source user-space tool for distributed environments, so that it can package notebook executions into notebook containers for execution and sharing across distributed environments, and (ii) integrate the extended FLINC with TaskVine, which provides the framework and orchestration to enable distributed notebook execution in high-performance computing environments.
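
To make the gap concrete, here is a minimal Python sketch of the kind of environment information a notebook file does not record on its own: the interpreter version, the platform, and the installed package versions. This is only an illustration using the Python standard library, not how FLINC or TaskVine work, and the function name `capture_environment` is hypothetical.

```python
import json
import platform
import sys
from importlib import metadata


def capture_environment():
    """Return a snapshot of the interpreter, OS, and installed packages.

    Hypothetical helper for illustration only; it is not part of FLINC.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "executable": sys.executable,
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it could be saved next to a notebook.
    print(json.dumps(capture_environment(), indent=2))
```

Even a snapshot like this only describes one machine at the Python level. It says nothing about system libraries, data files, or the other nodes in a cluster, which is why the project aims to package whole notebook executions into containers instead.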

You can read my complete proposal here.

I am excited to work on this project and learn from the experience here!

Raza Ahmad
Ph.D. Student at DePaul University