The goal of this committee is to encourage and promote reproducible research within the SC community. To that end, we aim to assist SC authors in providing the documentation that describes their artifacts, and to help evaluate those artifacts so that badges can be assigned.
Reproducibility Co-Chair: Rafael Tolosana-Calasanz, University of Zaragoza, Spain
Reproducibility Co-Chair: Jay Lofstead, Sandia National Laboratories
By providing artifacts, you make it easy for other researchers to compare with, adopt, and extend your work. This means more recognition, directly visible through badges, and higher impact. As described in this SC20 survey, thirty-five percent (35%) of respondents have used the appendix information from papers in the SC conference proceedings in their research. A general announcement will be made for all accepted papers that are reproduced and badged.
We use the terms artifact and research object as synonyms. A paper is typically accompanied by several computational artifacts that extend beyond the submitted article itself: software, datasets, environment configuration, mechanized proofs, benchmarks, test suites with scripts, etc. A complete artifact package must contain (1) the computational artifacts and (2) instructions/documentation describing the contents and how to use them. Further details and guidance about artifacts and research objects are provided throughout this page.
The review process will take place in two stages:
In Phase 1, the Artifact Description (AD) will be checked for completeness and accessibility; in Phase 2, Artifact Evaluation (AE) will take place for accepted papers that applied for badges. Artifact Descriptions are mandatory and must be submitted in conjunction with the paper. The artifact description provides information about a paper’s artifacts. All SC papers must either (1) provide an artifact description or (2) give a reason why an artifact description cannot be provided (see below for the Artifact Description criteria).
APR 20, 2023: AD (mandatory) (Two weeks after the paper submission deadline)
Based on the reviews of the AD, an updated version of the AD must be submitted. At this phase, authors may optionally submit the Artifact Evaluation, together with all the computational artifacts required to reproduce the experiments (e.g., any code or data artifacts used). When submitting the AE, authors must apply for one or more of the available reproducibility badges. The AD/AE Committee then begins evaluating the artifact. This step relies on cooperation between paper authors and the committee. We will use the SC Conference Submission System for single-blind messaging between the two parties. Via this communication, the committee may ask for access to special hardware, ask for missing components, report failure/error messages, and generally keep the authors posted on their evaluation status. Phase 2 finishes with the artifact freeze: authors must assign a DOI (What is a DOI? Check this out) to their artifacts to guarantee that no further modifications are possible.
JUN 20, 2023: Revised AD submission (mandatory) and AE (optional) submission deadline
JUN 23, 2023: Artifact badge evaluation starts for accepted papers
AUG 10, 2023: Artifact freeze, authors assign a DOI to their Artifact
AUG 19, 2023: Artifact badge decision
The Artifact Description/Artifact Evaluation (AD/AE) process is single-blind, unlike paper submissions which are double-blind reviewed. Authors do not need to remove identifying information from artifacts or papers.
The Committee may provide feedback to authors in a single-blind arrangement. The AD/AE Committee will not share any information with the Program Committee other than to confirm whether artifacts meet the criteria.
How to Submit
Artifacts are submitted via the Artifact Description submission form (available Winter 2022). The submission includes the application for badging in the second stage. Artifact freeze means the artifact must not be changed after this time; alternatively, a tagged version may be provided.
The Artifact Description is a mandatory step for all submitted papers. All authors must provide descriptions of their artifacts. The Artifact Description (AD) Appendix will be auto-generated from author responses to a standard form embedded in the online submission system. It must include the following aspects:
Including (i) the article’s title, (ii) the authors’ names and affiliations, and (iii) an abstract describing the main contributions of the article and the role of the computational artifact(s) in these contributions. The abstract may include a software architecture or data models and their description to help readers understand the computational artifact(s), as well as a clear description of the extent to which the computational artifact(s) contribute(s) to the reproducibility of the experiments in the article.
Including (i) a complete description of the experiment workflow that the code can execute, (ii) an estimate of the time needed to execute the experiment workflow, (iii) a complete description of the expected results and an evaluation of them, and, most importantly, (iv) how the expected results from the experiment workflow relate to the results found in the article. Best practice indicates that, to clarify the scope of the reproducibility, the expected results from the artifact should be in the same format as those in the article. For instance, when the results in the article are depicted in a graph, the execution of the code should ideally produce a (similar) figure (open-source tools such as gnuplot can be used for this purpose). It is critical that authors devote effort to these aspects of the reproducibility of experiments to minimize the time needed for their understanding and verification.
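One way to make the link between artifact output and article results explicit is a small comparison script shipped with the workflow. A minimal shell sketch (the file names and values are hypothetical, not part of the SC submission system):

```shell
#!/bin/sh
# Hypothetical check: compare the numbers the experiment emits against the
# values quoted in the article, so evaluators see the correspondence at a glance.
set -e
# Values as reported in the article (stand-in data).
printf '1 0.92\n2 0.95\n' > expected.dat
# Output of the (stand-in) experiment run.
printf '1 0.92\n2 0.95\n' > observed.dat
if diff -q expected.dat observed.dat >/dev/null; then
  echo "results match the article"
else
  echo "results diverge; inspect observed.dat" >&2
fi
```

A script like this costs authors minutes but saves evaluators from manually cross-checking figures against raw output.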
If you are unable to provide an artifact description (e.g., due to proprietary reasons), please provide detailed reasons why you are not able to do so. Failure to provide a detailed explanation will lead to further questions from the AD/AE Committee. The AD/AE Committee will provide its feedback to the SC Technical Program Committee, and inadequate explanations will count against the paper in the overall review.
The AD/AE Committee will evaluate completion of the form and artifact accessibility from any links included as part of the form, for instance, links to the computational artifacts.
The Artifact Evaluation is an optional step for all accepted papers. It allows authors to apply for a reproducibility badge. If you wish to acquire a badge for your artifact, you must choose the appropriate badges in the AD/AE form. The Artifact Evaluation (AE) Appendix will extend the contents of the AD, and both will be included as an Appendix. The AE will also be auto-generated from author responses to a standard form embedded in the online submission system. In addition to points (1) and (2) from the AD, the AE must include the following aspects:
Including (i) a description of the hardware resources required, (ii) a description of the operating systems required, (iii) the software libraries needed, (iv) the input dataset needed to execute the code or, when the input data is generated, a description of how it is generated, and (v) optionally, any other dependencies or requirements. Best practice indicates that unnecessary dependencies and requirements should be omitted from the artifact to keep the descriptions easy to follow.
Including (i) a description of the process to install and compile the libraries and the code, and (ii) a description of the process to deploy the code on the resources. The description of these processes should include an estimate of the installation, compilation, and deployment times. When any of these times exceeds what is reasonable, authors should provide some way to reduce the effort required of the potential recipients of the artifacts. For instance, capsules with pre-compiled code can be provided, or a simplified input dataset that reduces the overall experimental execution time. Conversely, best practice indicates that, whenever possible, the actual code of software dependencies (libraries) should not be included in the artifact; instead, scripts should be provided to download them from a repository and perform the installation.
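The fetch-and-build pattern recommended above can be sketched as a short shell script. Everything here (the working directory, the dependency list, the mysolver binary) is hypothetical and only illustrates the structure:

```shell
#!/bin/sh
# Hypothetical installation sketch: fetch dependencies by script instead of
# bundling them, build, then run a quick smoke test for the evaluators.
set -e
WORKDIR="$(mktemp -d)"
cd "$WORKDIR"
# 1. Record/fetch pinned dependency versions (a real artifact would curl
#    or git-clone exact releases here).
printf 'fftw-3.3.10\n' > deps.txt
# 2. Build step (stand-in for ./configure && make).
printf '#!/bin/sh\necho "mysolver ready"\n' > mysolver
chmod +x mysolver
# 3. Smoke test so evaluators can confirm the installation within seconds.
./mysolver
```

Ending the script with a fast smoke test is a cheap way to tell evaluators immediately whether the installation succeeded.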
Including other related aspects that can be important and were not addressed in the previous points.
In addition to the AD/AE, when applying for a reproducibility badge, computational artifacts also need to be submitted for the reproducibility of experiments.
The computational artifacts of a paper include all the elements that support the reproducibility of its experiments, such as software, datasets, environment configuration, mechanized proofs, benchmarks, test suites with scripts, etc. Authors can choose any version-controlled software and data repositories to share their artifacts, such as Zenodo, FigShare, Dryad, Software Heritage, GitHub, or GitLab.
The AD/AE, in addition to documenting the computational artifacts, will also include links to the required repositories. If needed, README files can also be attached to the computational artifacts, either containing the same information as in the AD/AE or complementing it, for instance, providing further and more detailed instructions and documentation. As a general rule, authors should try to do their best to simplify the reproducibility process, to save committee members the burden of reverse-engineering the authors’ intentions. For example, a tool without a quick tutorial is generally very difficult to use. Similarly, a dataset is useless without some explanation on how to browse the data. For software artifacts, the AD/AE and the README should—at a minimum—provide instructions for installing and running the software on relevant inputs. For other types of artifacts, describe your artifact and detail how to “use” it in a meaningful way.
Importantly, make your claims about your artifacts concrete. This is especially important if you think that these claims differ from the expectations set up by your paper. The AD/AE Committee is still going to evaluate your artifacts relative to your paper, but your explanation can help to set expectations up front, especially in cases that might frustrate the evaluators without prior notice. For example, tell the AD/AE Committee about difficulties they might encounter in using the artifact, or its maturity relative to the content of the paper.
Configuring the software dependencies of artifacts can require a significant amount of time. To alleviate this, authors should consider one of the following methods to package the software components of their artifacts (although the AD/AE Committee is open to other reasonable formats as well):
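For instance, a container recipe is one common packaging method. A minimal sketch follows; the base image, package names, and paths are hypothetical, not prescribed by SC:

```dockerfile
# Hypothetical container recipe: pin the base image and package versions so
# the environment remains reproducible after the artifact freeze.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential cmake \
    && rm -rf /var/lib/apt/lists/*
COPY . /artifact
WORKDIR /artifact
# Build once at image-creation time so evaluators only have to run it.
RUN cmake -B build && cmake --build build
CMD ["./run_experiments.sh"]
```

Building everything inside the image means an evaluator needs only a container runtime, not the full toolchain, to exercise the artifact.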
There are several sources of good advice about preparing artifacts for evaluation:
During the Artifact Evaluation Stage, all the computational artifacts associated with the paper, such as software, datasets, or environment configuration required to reproduce the experiments, are assessed. The goal of Artifact Evaluation is to award badges to artifacts of accepted papers. We base all badges on the NISO Reproducibility Badging and Definitions Standard. In 2023, badges will be assigned per the ACM Reproducibility Standard.
Authors must choose to apply for a badge a priori, during the AE phase. Authors can apply for one or more of the three kinds of badges that we offer: Artifacts Available, Artifacts Evaluated-Functional, and Results Reproduced. Please note that they are incremental: applying for Artifacts Evaluated-Functional also includes Artifacts Available, and applying for Results Reproduced includes the other two badges. The type of badge and the criteria for each are explained next. To start the Reproducibility Evaluation Process, authors must provide links to their computational artifacts.
After the evaluation process, artifacts must be frozen to guarantee their persistence and immutability. An artifact must be accessible via a persistent and publicly shareable DOI (What is a DOI? Check this out) on a hosting platform that supports persistent DOIs and versioning (for example, DataPort, Dryad, FigShare, Harvard Dataverse, or Zenodo). Authors should not provide links or zipped files hosted on personal webpages or shared collaboration platforms, such as Nextcloud, Google Drive, or Dropbox.
Zenodo and FigShare provide an integration with GitHub to automatically generate DOIs from Git tags. Therefore, it is possible to host code under version control on GitHub and describe the artifact using Zenodo or FigShare. Please note that Git itself (or any other version control software) does not generate a DOI; it needs to be paired with Zenodo or FigShare.
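The tagging half of that workflow looks like the following shell sketch. The repository, identity, and tag name are stand-ins; pushing the tag to a GitHub repository with the Zenodo integration enabled is what actually mints the DOI:

```shell
#!/bin/sh
# Create a local repository and tag the exact state to be frozen.
# Everything here is a stand-in; git alone never generates a DOI.
set -e
REPO="$(mktemp -d)"
cd "$REPO"
git init -q
git config user.email "author@example.org"   # placeholder identity
git config user.name "SC Author"
git commit -q --allow-empty -m "camera-ready artifact"
# An annotated tag marks the release; pushing it (git push origin v1.0-sc23)
# to a Zenodo-linked GitHub repository triggers the archive and the DOI.
git tag -a v1.0-sc23 -m "SC23 artifact release"
git tag -l
```

Tagging pins the precise commit that the badge evaluation covers, so later development on the default branch cannot silently change the frozen artifact.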
The following are necessary to receive this badge:
Note that, for physical objects relevant to the research, the metadata about the object should be made available.
What do we mean by accessible? Artifacts used in the research (including data and code) are permanently archived in a public repository that assigns a global identifier and guarantees persistence, and are made available via standard open licenses that maximize artifact availability.
The criteria for the Artifacts Evaluated-Functional badge require an AD/AE Committee member to agree that the artifact provides enough detail to exercise the components described in the paper. For example, is it possible to compile the artifact, use a Makefile, or perform a small run? If the artifact runs on a large cluster, can it be compiled on a single machine? Can the analysis be run at a small scale? Does the artifact describe its components well enough to nurture future use?
The reviewer will assess the details of the research artifact based on the following criteria:
We encourage authors to describe (i) the workflow underlying the paper, (ii) some of the black boxes opened as white boxes (e.g., source, configuration files, build environment), (iii) the input data: either the process to generate the input data should be made available or, when the data is not generated, the actual data itself or a link to it should be provided, (iv) the environment (system configuration and initialization, scripts, workload, measurement protocol) used to produce the raw experimental data, and (v) the scripts needed to transform the raw data into the graphs included in the paper.
The evaluators successfully reproduced the key computational results using the author-created research objects, methods, code, and conditions of analysis. Note that we do not aim to recreate exact or identical results, especially for hardware-dependent results. However, we do aim to:
Is participation in the badging process mandatory?
No. Participation in the badging process is voluntary. Please choose the badges you wish to apply for in the AD/AE Appendices Form. Artifact Evaluation will only occur for accepted papers that have applied for an appropriate badge. The badge will be assigned after the artifact evaluation process is over.
What is the set of badge labels that a paper can apply for?
Will SC host the artifacts of reproduced papers?
Authors are responsible for hosting their artifacts, whether reproduced or not. We suggest using one of the following platforms for sharing artifacts with the AD/AE Committee: Zenodo, FigShare, Dryad, or Software Heritage. The SC Reproducibility Initiative does not have a place to permanently host what we reproduce and/or review, so any work done to badge artifacts will not be hosted long term.
SC has been a leader in tangible progress towards scientific rigor, through its pioneering practice of enhanced reproducibility of accepted papers.