Packaging research artefacts with RO-Crate

26 Aug 2021

Within DataPLANT, a working group formed to specify how research data and its context should be structured and packaged. The group has been working on the specification of the Annotated Research Context (ARC), which is to be finalized as version 1.0 at the upcoming ARC hackathon at the beginning of September. This aligns well with DataPLANT's stated objectives for research data management (RDM). The ARC builds on and implements existing standards such as ISA for administrative and experimental metadata and CWL for analysis and workflow metadata. ARCs are designed to represent digital objects that fulfill all FAIR principles and are therefore referred to as FAIR Digital Objects (FDO).

For its internal structure, the ARC is modeled to fulfill the requirements of the Research Object Crate (RO-Crate), an open, community-driven, and lightweight approach to packaging research artefacts along with their necessary metadata and reproduction context in a machine-readable manner. RO-Crate is based on Schema.org annotations in JSON-LD. It primarily aims to establish best practices for formally describing metadata in a way that is accessible and practical for a large number of scientific communities.
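To make the "Schema.org annotations in JSON-LD" idea more concrete, the following is a minimal sketch of what the metadata file of a crate could look like, assuming the RO-Crate 1.1 conventions (a file named ro-crate-metadata.json containing a metadata descriptor and a root dataset). The function name build_crate_metadata and the example file paths are hypothetical and chosen purely for illustration; they are not part of the ARC specification. A complete crate would carry further properties on the root dataset, such as a license and a publication date, which are left out here for brevity.

```python
import json


def build_crate_metadata(name, description, file_paths):
    """Return a minimal RO-Crate style metadata document as a plain dict.

    Hypothetical helper for illustration only; not part of any ARC or
    RO-Crate tooling.
    """
    # Metadata descriptor: describes the JSON-LD file itself and points
    # to the root dataset via "about".
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    # Root dataset: the crate as a whole, listing its payload in "hasPart".
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "hasPart": [{"@id": path} for path in file_paths],
    }
    # One data entity per packaged file, identified by its relative path.
    files = [{"@id": path, "@type": "File"} for path in file_paths]
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root, *files],
    }


if __name__ == "__main__":
    # Example paths are made up; a real crate would list its own files.
    metadata = build_crate_metadata(
        name="Example plant phenotyping study",
        description="Illustrative crate with one data file and one workflow.",
        file_paths=["data/measurements.csv", "workflows/analysis.cwl"],
    )
    print(json.dumps(metadata, indent=2))
```

Saving the printed JSON as ro-crate-metadata.json in the top-level directory next to the listed files would turn that directory into a simple crate. All entities sit in one flat @graph list and refer to each other by @id, which keeps the file easy to read for both humans and tools.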

Following a concept similar to the ARC, an RO-Crate is a structured archive of all the items that contributed to a research result, including their provenance, relations, annotations and various identifiers for data, tools and workflows. RO-Crate is meant to be discipline-agnostic and is thus used across multiple areas, including bioinformatics, digital humanities and further domains. Colleagues active in DataPLANT contributed to a paper on RO-Crate, "Packaging research artefacts with RO-Crate", which has just been published on Zenodo as a workshop outcome.