(Re)Evaluating Artifacts for Understanding Resource Artifacts

Project Idea Description

  • Topics: Virtualization, Containerization, Profiling, Reproducibility
  • Skills: C, Python, and DevOps experience
  • Difficulty: Medium
  • Size: Large; 350 hours
  • Mentors: Tanu Malik

This project aims to characterize computer science artifacts that are either submitted to conferences or deposited in reproducibility hubs such as Chameleon. We aim to classify experiments into different types and understand the reproducibility requirements of this rich data set, possibly leading to a benchmark. We will then study packaging requirements, especially those of distributed experiments, and instrument a package archiver to reproduce a distributed experiment. Finally, we will use the learned experiment characteristics to develop a classifier that identifies alternative resources where an experiment can be easily reproduced.

Project Deliverables

Specific tasks include:

  • A pipeline consisting of a set of scripts to characterize artifacts.
  • Packaged artifacts and an analysis report, with open-sourced data, on the best guidelines for packaging artifacts using Chameleon.
  • A classifier system based on artifact and resource characteristics.
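To make the first deliverable more concrete, one step of an artifact characterization pipeline could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `characterize_artifact`, the `PACKAGING_MARKERS` table, and the choice of signals (file extensions and well-known packaging files) are hypothetical and are not prescribed by the project description.

```python
import os
from collections import Counter

# Hypothetical marker files and labels (an assumption for illustration);
# the project does not specify which signals the real pipeline will use.
PACKAGING_MARKERS = {
    "Dockerfile": "container",
    "docker-compose.yml": "multi-container",
    "Vagrantfile": "virtual machine",
    "requirements.txt": "python dependencies",
    "Makefile": "native build",
}


def characterize_artifact(root):
    """Walk an artifact directory and summarize file types and packaging hints."""
    extensions = Counter()
    hints = set()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Record a packaging hint when a well-known file name is present.
            if name in PACKAGING_MARKERS:
                hints.add(PACKAGING_MARKERS[name])
            # Count files by lowercased extension; "(none)" means no extension.
            ext = os.path.splitext(name)[1].lower()
            extensions[ext or "(none)"] += 1
    return {
        "file_count": sum(extensions.values()),
        "extensions": dict(extensions),
        "packaging_hints": sorted(hints),
    }
```

Calling `characterize_artifact("path/to/artifact")` on an unpacked artifact returns a small dictionary of features. Feature records of this kind, combined with resource characteristics, are one possible input to the classifier deliverable.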

Tanu Malik
Associate Professor for Databases, High Performance and Scientific Computing, Systems Development

Tanu Malik is an Associate Professor at the School of Computing and directs the DICE Lab at DePaul (https://dice.cs.depaul.edu).