The backend storage infrastructure is evolving

16 Nov 2020

The storage system bwSFS (Storage-for-Science) forms the geo-redundant, distributed technical backbone for basic storage services, research data management and data sharing. Alongside the de.NBI infrastructure services, it contributes to the storage infrastructure for the DataPLANT community. The central storage components of bwSFS are located at the Tübingen and Freiburg computer centers. The system is intended for a broad user base: besides DataPLANT, it also serves the local communities of the participating universities and the Science Data Center BioDATEN. Managing this user base sensibly and integrating bwSFS seamlessly into the envisioned DataPLANT services requires federated management of project, user and group data. Even in the implementation phase of the software and services, which involves the subject sciences, it is becoming apparent that the existing methods for identity management are not sufficient. Compared to HPC services, storage services require much deeper integration with existing infrastructures and more flexible user management, which should also support ORCID, for example.

To support research data management, InvenioRDM, which already includes a convenient user interface and an OAI-PMH interface, has been chosen within bwSFS. All central institutions and projects involved in the RDM process were included in this decision at an early stage. In Freiburg, a GitLab instance will be used in the context of TA 4 for versioning, collaboration and sharing of data from ongoing projects. Established services of the university libraries will be used for DOI allocation in Invenio, and ORCID will be used for persistent identification of researchers. In this way, resources for RDM will be pooled to improve support for the disciplines in implementing their specific RDM requirements and to provide better advice for researchers. To enforce the FAIR and Open Access principles, data management plans (DMPs) will be used in TA 1 to support standardization, with guidelines for metadata management, archiving and licensing models.