Annotated Research Context

ARCs are FAIR Digital Objects (FDOs)

As such, they come with metadata, code for operations, and a persistent identifier. ARCs comply with the FAIR Principles: they are findable, accessible, interoperable, and reusable.

Real-world ARC

An ARC is intended to capture complete (meta)data, including raw and processed data as well as workflow descriptions and essential external files. ARCs cover scenarios ranging from single experimental setups to complex experimental designs. The full ARC specifications are provided here.

ARCs are RO-Crate certified

ARCs are a profile implementation of RO-Crates for basic plant research.

The ARC is packaged with its digital object identifier and its RO Metadata File (ro-crate-metadata.json).

This RO metadata file is generated automatically by a converter that reads the relevant information from the ARC.
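To illustrate what such a metadata file looks like to a program, here is a minimal sketch that loads ro-crate-metadata.json and looks up the root dataset. It relies only on the general RO-Crate layout (a JSON-LD document with an "@graph" list and a metadata descriptor whose "about" points to the root dataset); the function names are hypothetical and this is not the converter mentioned above.

```python
import json
from pathlib import Path


def load_crate_entities(arc_root):
    """Return the entities of ro-crate-metadata.json keyed by their "@id"."""
    metadata_path = Path(arc_root) / "ro-crate-metadata.json"
    with metadata_path.open(encoding="utf-8") as handle:
        document = json.load(handle)
    # RO-Crate metadata is JSON-LD: a flat list of entities under "@graph".
    return {entity["@id"]: entity for entity in document.get("@graph", [])}


def find_root_dataset(entities):
    """Follow the metadata descriptor's "about" link to the root dataset."""
    descriptor = entities.get("ro-crate-metadata.json")
    if descriptor is None:
        return None
    about = descriptor.get("about", {})
    return entities.get(about.get("@id"))
```

Calling find_root_dataset(load_crate_entities("path/to/arc")) returns the root dataset entity as a dictionary, or None if the descriptor is missing.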

Tools for creating ISA model coherent ARCs

Although ARCs can be generated completely manually, we encourage you to use our convenient tools that assist you in the process.
The ARC Commander offers machine-aided creation and completion of the ARC folder and file structure.

SWATE and our templates assist you in creating your assay files.

ARCs offer a single point of entry logic for data management and computation

ISA investigation file

A mandatory central registry for studies, assays, persons, … saved in XLSX format, which follows the ISA model specification (v1.0).

Using the ARC Commander allows you to automatically fill and update the investigation file.

Every worksheet needs to contain one table object storing the metadata. Comments or additional information can be stored alongside the table objects in a worksheet.

ISA assay file

Assay metadata must be annotated in the file isa.assay.xlsx at the root of the assay's subdirectory. This workbook must contain a single assay organized in one or more worksheets. A worksheet named “assay” must store the STUDY ASSAYS section of the ISA model; this section is then not required in isa.investigation.xlsx. Additional worksheets must each contain a table object organized on a per-row basis, with the first row holding the column headers. Table objects must contain at least one source. Source-sample relations must follow a unique path in a directed acyclic graph. Sources must be indicated by the column header Source Name, and samples by the column header Sample Name.
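The table rules can be sketched as a small check. This is a simplified illustration, not an official validator: it assumes the table has already been read into a header list and a list of rows (reading the XLSX file is left out), the function names are hypothetical, and it only tests for the Source Name header, at least one source, and the absence of cycles. It does not verify that paths are unique.

```python
def has_cycle(edges):
    """Detect a cycle in a directed graph given as {node: set(successors)}."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}

    def visit(node):
        state[node] = GREY
        for successor in edges.get(node, ()):
            colour = state.get(successor, WHITE)
            if colour == GREY:
                return True  # back edge: successor is on the current path
            if colour == WHITE and visit(successor):
                return True
        state[node] = BLACK
        return False

    return any(state.get(node, WHITE) == WHITE and visit(node) for node in edges)


def check_assay_table(header, rows):
    """Return a list of problems found in one table object (empty if none)."""
    if "Source Name" not in header:
        return ["missing column header 'Source Name'"]
    source_col = header.index("Source Name")
    sample_col = header.index("Sample Name") if "Sample Name" in header else None

    sources = set()
    edges = {}  # source -> samples derived from it
    for row in rows:
        source = row[source_col]
        if not source:
            continue
        sources.add(source)
        if sample_col is not None and row[sample_col]:
            edges.setdefault(source, set()).add(row[sample_col])

    problems = []
    if not sources:
        problems.append("table contains no source")
    if has_cycle(edges):
        problems.append("source-sample relations contain a cycle")
    return problems
```

A table in which a sample is listed as the source of its own source would be reported as containing a cycle, while a plain source-to-sample table passes with an empty problem list.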

Assays

Each assay is a collection of files stored in a single directory, including a mandatory metadata file in ISA-XLSX format.

Assay data files and protocols must each be placed in a separate subdirectory.

We advise you to use SWATE for completing the metadata files, as it comes with a broad range of prepared templates and assists you in following the annotation principles.

Externals

External data or knowledge files are data that neither originate within the investigation/study scope of the ARC nor can be referenced externally. Yet, these files are required to ensure reproducibility. Examples include early use of ontologies, mapping files, gene information, and so on. All corresponding files need to be stored in the ARC’s top-level externals directory.

Together these external data form a virtual assay. To follow the metadata requirements of assays, a metadata file in ISA-XLSX format is mandatory for each external file, located under the top-level externals directory.

Workflows

Workflows in ARCs represent processing steps used in computational analyses and other data transformations of an ARC's assays to generate run results. They are a collection of files stored in a single directory under the top-level workflows subdirectory.

A per-workflow executable CWL description (v1.2 or higher) is stored in a mandatory workflow.cwl. This file can contain either a CWL tool description or a CWL workflow description.

We highly recommend including a reproducible execution environment description, in the form of a Docker container description, for tool descriptions.

Runs

Runs in an ARC represent all artefacts that derive from computations on assay and external data. Plots, tables, or similar results need to be saved in one or more subdirectories of the top-level runs directory.

A run.cwl (v1.2 or higher) is mandatory for each of these subdirectories to cover workflow description and reproducibility. These files need to be executable without additional payload files or files outside the ARC.

A parameter file run.yml is optional to specify input parameters.
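The per-directory file requirements described for assays, workflows, and runs can be sketched as a small layout check. This is a minimal illustration rather than an official validator: the function names are hypothetical, and the top-level folder name assays is an assumption, while workflows, runs, and the file names come from the descriptions above. It checks only that the mandatory files exist, not that they are valid ISA-XLSX or CWL.

```python
from pathlib import Path

# Mandatory file expected in every subdirectory of each top-level folder.
# "assays" as a folder name is an assumption; the other names are from the text.
REQUIRED_FILES = {
    "assays": "isa.assay.xlsx",
    "workflows": "workflow.cwl",
    "runs": "run.cwl",
}


def check_arc_layout(arc_root):
    """Return a sorted list of mandatory files missing from an ARC directory."""
    root = Path(arc_root)
    missing = []
    for top_level, required in REQUIRED_FILES.items():
        folder = root / top_level
        if not folder.is_dir():
            continue  # nothing to check if the ARC has no such folder
        for entry in folder.iterdir():
            if entry.is_dir() and not (entry / required).is_file():
                missing.append(f"{top_level}/{entry.name}/{required}")
    return sorted(missing)


def runs_with_parameters(arc_root):
    """List run subdirectories that include the optional run.yml file."""
    runs = Path(arc_root) / "runs"
    if not runs.is_dir():
        return []
    return sorted(p.name for p in runs.iterdir() if (p / "run.yml").is_file())
```

An empty list from check_arc_layout means every assay, workflow, and run subdirectory contains its mandatory file; runs_with_parameters simply reports which runs also provide input parameters.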