Tools and Services to structure the Data Jungle for plant researchers

The Annotated Research Context (ARC) is introduced as a simple structural scaffold that intrinsically follows the FAIR principles and should greatly facilitate proper research data management. But all beginnings are difficult, especially when facing a blank screen. To help researchers work with the ARC, the DataPLANT consortium began its operation by assembling a suite of tools.

Metadata Tool Chain

With respect to research data management (RDM), it seems natural to place the data at the center and build the DataPLANT tool chain around it. As a result, both the technical realization and the standardized RDM procedures are process-oriented: each tool performs, or supports the researcher in, a distinct task within the RDM cycle. This enables the desired mixed mode of application, in which humans and machines can operate processes simultaneously or asynchronously. It also avoids technological barriers and embraces open software and open science.

Templates for convenient metadata annotation

To reduce the burden of annotation, DataPLANT provides a growing collection of templates, designed and curated by data stewards, that cover the submission routine for selected end-point repositories. The template design process works “backwards”: it starts from the requirements of the end-point repositories and thereby ensures compliance with their metadata standards. Data stewards supervise the implementation of ontologies and the use of controlled vocabulary as required by the target repository, and at the same time contribute to the development of the DataPLANT broker ontology.

SWATE - Swate Workflow Annotation Tool for Excel

Swate is an add-in that lets users annotate their data according to ISA standards. It aims to provide an easy-to-use tool in the familiar environment of a widely used spreadsheet editor. To directly incorporate collaborative mechanisms, the tool is implemented both as a web-based online application and as a desktop application. Fully integrated into Microsoft Excel, it lets users add and delete columns with specialized headers that describe their data in a clear representation. By design, Swate facilitates the ontology-driven development of data annotation schemes by the domain experts who generate the research data; to this end, it features a search function for ontological terms. It can also insert ISA-conformant protocols and processes that support the DataPLANT template mechanism.
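The idea of a term search can be illustrated with a minimal sketch. It is not Swate's implementation: the `Term` class, the `search_terms` function, the ranking rule, and the `EX:` accessions are all hypothetical and chosen only to show how a query might be matched against a controlled vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    accession: str  # hypothetical ontology identifier
    name: str       # human-readable term name


def search_terms(terms, query, limit=5):
    """Return terms whose name contains the query, best matches first."""
    q = query.strip().lower()
    if not q:
        return []
    hits = [t for t in terms if q in t.name.lower()]
    # Assumed ranking: names starting with the query first,
    # then shorter names, then alphabetical order.
    hits.sort(key=lambda t: (not t.name.lower().startswith(q),
                             len(t.name),
                             t.name.lower()))
    return hits[:limit]


# Hypothetical in-memory vocabulary, for illustration only.
VOCAB = [
    Term("EX:0000001", "leaf"),
    Term("EX:0000002", "leaf temperature"),
    Term("EX:0000003", "rosette leaf"),
    Term("EX:0000004", "root"),
]

if __name__ == "__main__":
    for term in search_terms(VOCAB, "leaf"):
        print(term.accession, term.name)
```

A real tool would query a term database rather than a Python list, but the principle is the same: the user types free text and picks from ranked, identifier-backed suggestions.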

ArcCommander

As a starting point, the ArcCommander simplifies the generation of the ARC folder and file structure. Following the ISA model, this structure includes a central registry in the form of an automatically updated investigation file. Although the requirements of the ARC are minimal, assisting the researcher with these repetitive tasks and providing structured guidance reduces friction and workload for the user, which is the main aim of the ArcCommander. Essentially, the tool provides automation and assistance for processes that follow the ARC specification. It can be executed to initialize an ARC, which creates the basic folder structure and sets up the working environment. It can also be used to create and modify sub-branches of the ARC, such as assays and workflows.
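The following sketch illustrates the kind of scaffolding described here. It is not the ArcCommander itself and does not reproduce its interface. The folder names in `ARC_DIRS` are an assumption made for illustration (the ARC specification is the authoritative source for the layout), and the JSON file `investigation.json` is a hypothetical stand-in for the ISA investigation file, used only to show a registry that is updated automatically.

```python
import json
from pathlib import Path

# Assumed top-level folders; consult the ARC specification for the real layout.
ARC_DIRS = ("studies", "assays", "workflows", "runs")
# Hypothetical stand-in for the ISA investigation file (the central registry).
REGISTRY = "investigation.json"


def init_arc(root):
    """Create the basic folder structure and an empty registry."""
    root = Path(root)
    for name in ARC_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    registry = root / REGISTRY
    if not registry.exists():
        registry.write_text(json.dumps({"assays": [], "workflows": []}, indent=2))
    return root


def add_entry(root, kind, identifier):
    """Add an assay or workflow folder and record it in the registry."""
    if kind not in ("assays", "workflows"):
        raise ValueError(f"unsupported kind: {kind!r}")
    root = Path(root)
    target = root / kind / identifier
    target.mkdir(parents=True, exist_ok=True)
    registry = root / REGISTRY
    data = json.loads(registry.read_text())
    if identifier not in data[kind]:  # keep the registry free of duplicates
        data[kind].append(identifier)
        registry.write_text(json.dumps(data, indent=2))
    return target


if __name__ == "__main__":
    arc = init_arc("my_arc")
    add_entry(arc, "assays", "proteomics")
    print((arc / REGISTRY).read_text())
```

Both functions are idempotent, so running them repeatedly leaves an existing ARC untouched, which is the property that makes such repetitive tasks safe to automate.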

Swobup and SwateDB: a metadata broker team to bridge the ontology gap

Experience has taught us that missing or unsuitable ontology terms and relations reduce user motivation and participation. DataPLANT therefore provides a dedicated NFDI4plants ontology that acts as a broker and bridge between the individual researcher and the main ontology providers. This ontology enables the collection of missing vocabulary for immediate use and is also stored in SwateDB. The automatic process handler Swobup (Swate OBO Updater) simplifies the ingestion process of adding or removing one or more terms from an ontology. It synchronizes ontology terms in OBO format and publicly hosted metadata templates, with changes submitted either through a pull request mechanism or by a group of authorized users. File versioning and adaptations to the database are outsourced to a shared repository, so any change can be reverted using the repository’s built-in features.
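To make the synchronization step more concrete, the sketch below compares two versions of an OBO file and reports which terms would have to be added to, removed from, or updated in a term database. It is an illustration under stated assumptions, not Swobup's actual code: the function names `parse_obo_terms` and `diff_terms` and the `EX:` identifiers are hypothetical, and the parser handles only simple `[Term]` stanzas separated by blank lines.

```python
import re


def parse_obo_terms(text):
    """Return {term_id: stanza_text} for every [Term] stanza in an OBO document."""
    terms = {}
    # Stanzas are assumed to be separated by at least one blank line.
    for stanza in re.split(r"\n\s*\n", text):
        lines = [ln.strip() for ln in stanza.splitlines() if ln.strip()]
        if not lines or lines[0] != "[Term]":
            continue  # skip the header and non-term stanzas
        for ln in lines[1:]:
            if ln.startswith("id:"):
                terms[ln[len("id:"):].strip()] = "\n".join(lines)
                break
    return terms


def diff_terms(old_text, new_text):
    """Compare two OBO versions; return (added, removed, changed) term ids."""
    old, new = parse_obo_terms(old_text), parse_obo_terms(new_text)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return added, removed, changed


# Hypothetical example data, for illustration only.
OLD = """format-version: 1.2

[Term]
id: EX:0000001
name: leaf

[Term]
id: EX:0000002
name: root
"""

NEW = """format-version: 1.2

[Term]
id: EX:0000001
name: leaf blade

[Term]
id: EX:0000003
name: stem
"""

if __name__ == "__main__":
    added, removed, changed = diff_terms(OLD, NEW)
    print("add:", added, "remove:", removed, "update:", changed)
```

Because the comparison works on two file versions, it fits naturally with a version-controlled repository: each accepted change yields a small, reviewable set of database operations, and reverting the file reverts the resulting term set.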

