DataPLANT extends the Call-for-ARC to the wider community

The Annotated Research Context (ARC) combines experimental data with its associated annotation, metadata and descriptions of computational workflows. It aims to structure workflows on the user's local desktop, allows for sharing and versioning, and should ultimately form the basis of a data publication. The ARC covers the complete research cycle and includes experimental data described by the ISA model as well as reference knowledge and computational steps such as software, code and scripts, which serve as information for the core data repositories on transcriptomics, proteomics, metabolomics and phenomics. In DataPLANT, ARCs will be the generic interface for compute and storage as well as for publication and sharing.

DataPLANT: First General Assembly

On 21st September, the DataPLANT participants gathered online for the First General Assembly of the consortium. More than 40 people took part in the discussions during the two-hour session. At the beginning, the DataPLANT mission was refined and explained, along with how it fits into the general objectives of the NFDI. The NFDI intends to build a national, sustainable data infrastructure for disciplinary sciences. All successful consortia will initially receive funding for five years. After a successful interim evaluation, the activities can be extended for another five years. During this time, DataPLANT intends to reach a significant share of researchers and groups in the field of fundamental plant research and to establish a sustainable funding scheme. The NFDI consortia should help, advise and support “their disciplines” in standardization and research data management (RDM). To achieve that objective, DataPLANT puts the so-called Annotated Research Contexts (ARCs) at its core. An ARC consists of an ordered combination of experimental data with annotation and descriptions of computational workflows. ARCs will help to increase the level of annotation at the source and to track provenance using community standards, maximizing future data discoverability and reuse. This will bring new responsibilities for researchers. DataPLANT will homogenize formats, terminologies, guidelines and policies to simplify the RDM landscape, which will lead to a democratization of research data.

To support the handling of ARCs, storage, compute and authentication services will be integrated to facilitate usability and access. The services will focus on plant-specific processing and analysis tools that cover the complete research cycle. ARCs will become the generic interface for compute and storage as well as for publication and sharing. The community will profit in particular from specialized tools and workflows, more available storage, full ARC compatibility, automated metadata generation and compatibility with public repositories. These developments drive the digital change in science, with the goal of establishing an ARC publication as a full equivalent to a classical (paper) publication.

ARCs are the basis of the collaborative research platform and can be shared seamlessly between researchers through a mixture of local, decentralized storage and collaborative, centralized versioning. The ARCs become the easy-to-use single access point for the Swate tool, which uses an annotation grammar of source names, characteristics, factors, parameters, sample names and data.
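To make the annotation grammar named above more concrete, the following is a minimal sketch of how one annotated sample could be represented. It is not Swate's actual data model or API: the class `AnnotationRow`, the helper `missing_fields` and all example values are hypothetical and only mirror the grammar terms listed in the text.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotationRow:
    """One annotated sample (hypothetical structure, not Swate's own)."""
    source_name: str
    sample_name: str
    characteristics: dict = field(default_factory=dict)  # properties of the source
    factors: dict = field(default_factory=dict)          # experimental variables
    parameters: dict = field(default_factory=dict)       # protocol settings
    data: list = field(default_factory=list)             # names of resulting data files


def missing_fields(row):
    """Return the names of grammar elements that are still empty in this row."""
    return [name for name, value in vars(row).items() if not value]


# Illustrative values only; none of them come from a real ARC.
row = AnnotationRow(
    source_name="plant_01",
    sample_name="leaf_extract_01",
    characteristics={"organism": "Arabidopsis thaliana"},
    factors={"light intensity": "high"},
    parameters={"extraction buffer": "buffer A"},
)
print(missing_fields(row))
```

A check of this kind illustrates how annotation gaps could be spotted at the source, before an ARC is shared or published.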

The DataPLANT consortium is the representative of fundamental plant research within the NFDI. To facilitate close interaction with the community, three boards were created, and a General Assembly (GA) will gather from time to time. Between GAs, the boards will guide and advise DataPLANT. For transparency, all meeting documents will be published (slides, notes, decisions, …). The office will support the future board procedures and will be the first point of contact for the data stewards, the community and researchers interested in the consortium. Four task areas (TAs) will support the community in different ways: TA1 will hire five standards experts who will work on quality, standardization and interoperability. TA2 plans to deploy six developers and technicians to work on software, services and infrastructure for the community. TA3 hosts seven data stewards who will interact directly with the data champions and the wider community on transfer, application and education in the field. These TAs will be complemented by four people in coordination, management, support and services in TA4. These plans are estimates, as DataPLANT received a nominal budget cut of 13.5%, which translates in real terms (as the salaries are fixed to 2019 DFG figures) to a reduction of 20-25%. Furthermore, the agreement between the applicant institution (University of Freiburg) and the DFG is about to be signed, while the consortium agreement is still in circulation.

To get DataPLANT started, a first round of personnel is being hired. The office will contact the data champions in the coming weeks to refine the vision and prioritize requirements. They are the primary target of the “Call for ARCs” in the first round. This is meant to kick off the data stewards service, which is intended to communicate and implement FAIR best practices, plan data management strategies for research projects and proposals, and provide legal assistance for the licensing and sharing of data.

The General Assembly concluded with a talk by Prof. York Sure-Vetter, director of the NFDI. He gave a short overview of the state of the creation of the registered association and its inauguration in October. He motivated the common goals of improving research data management, advancing science and changing academic culture to embrace open data publications.

DataPLANT governance: First round of board meetings held

At the beginning of September, the board meetings were held, starting with the senior management board, followed by both the scientific and technical boards. The first round of meetings was chaired by the speaker; future meetings will be coordinated by a board chair supported by the office. It was decided to make the relevant information transparent; thus the slides and minutes are directly available via the linked PDFs. All information presented and decisions made will be made public in the future as well, at least to all participants of DataPLANT. Until DataPLANT has officially started, the new website has become available and the cooperation and participation tools have been decided upon, the documents will reside on the preliminary website.

The senior management board reports directly to the General Assembly and represents the community between GA meetings. It oversees all TA activities (with a focus on TA4) as well as the finances, and takes care of strategy development. The technical board brings together technical and infrastructural experts. It moderates the various infrastructure requirements and data service offerings and decides on resource distribution and allocation, the future development of new offerings and the deprovisioning of deprecated services. It will receive input mainly from TA2 and TA3. The scientific board consists mostly of plant scientists and of computer scientists who develop services; it advises on standardization needs and develops foresight processes. It focuses on TA1 and TA3. The boards take input from the General Assembly, the senior management board and outside requests (e.g. from other consortia or the NFDI superstructure). All boards will be supported by the DataPLANT office; the corresponding position will be advertised soon through the usual channels.

Both the technical and the scientific board forward requests that could not be decided within the respective body to the high-level decision-making committee, the senior management board. The senior management board consists of all co-speakers of DataPLANT and reports to the General Assembly. While the senior management board strives to make unanimous decisions, it will employ a simple majority vote; in the case of a tie, the coordinator decides. The office handles the everyday business on behalf of the management board and the disbursement of funds of the sub-NFDI. It is the first point of contact for the general NFDI processes, the facilitation of the inter-NFDI exchange and the DataPLANT community (both participants and new users). It acts on behalf of the management board between meetings and buffers incoming requests. The office takes care of the financial affairs of DataPLANT and reports annually to the General Assembly and to the grant provider. Furthermore, it handles inter-NFDI affairs and supports the coordination of cross-cutting topics. The DataPLANT governance and coordination is designed to be open to adaptation; the NFDI governance, which will be fixed through the association registration process in October, was identified as a joint topic relevant to all consortia. The suggested structures will be reviewed regularly by the General Assembly and the senior management board. Input from both the DataPLANT community and the NFDI in general will be considered.
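The voting rule of the senior management board described above (simple majority, with the coordinator deciding a tie) can be sketched as a small function. This is only an illustration: the function name `board_decision` and its parameters are hypothetical, and the sketch assumes that abstentions are simply not counted, which the text does not specify.

```python
def board_decision(votes_for, votes_against, coordinator_in_favour):
    """Return True if a motion passes under a simple majority rule.

    Assumption: abstentions are ignored, so only votes for and against count.
    On a tie, the coordinator's position decides, as described in the text.
    """
    if votes_for > votes_against:
        return True
    if votes_against > votes_for:
        return False
    return coordinator_in_favour  # tie: the coordinator decides


# Hypothetical vote counts for illustration.
print(board_decision(4, 3, coordinator_in_favour=False))  # majority in favour
print(board_decision(3, 3, coordinator_in_favour=True))   # tie, coordinator decides
```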