Reply to the reviewer panel's protocol sent to the DFG

28 Feb 2020

Today the DataPLANT consortium sent its three-page reply to the reviewers' remarks and used the opportunity to clarify a couple of open questions. We received quite positive feedback on our concept of Data Stewards as a central support element for the fundamental plant research community. This core element makes the training and recruitment of suitable people a central task of the DataPLANT consortium. Successful training of the Data Stewards requires basic skills in the standardized handling of data, in data modeling and analysis, as well as a solid domain understanding of quantitative plant biology, especially the methods used in this research field. In addition, practical skills in handling different software packages, analysis and workflow tools, as well as in teaching and training, are required. To ensure the training of the Data Stewards from the start of DataPLANT, we propose a strategy that combines immediate and long-term measures. Corresponding training and further education opportunities are already included in the curricula of the applicant institutions. It is also planned to make existing courses of study (e.g. in the field of data science) more flexible so that the basic training of Data Stewards can be provided within the framework of these courses. The development of new, tailor-made courses of study is also conceivable in the long term. For both measures, the applicants were able to secure the support of the partner universities in preliminary discussions. For example, the course of study Quantitative Plant Biology at CEPLAS is a valuable additional source for the recruitment of personnel.

The review panel rightly suggests that the services to be developed in DataPLANT should be flexible enough to ensure sustainability. A central goal in the design of the DataPLANT Hub is to provide flexible, demand-oriented and sustainable support for research data management throughout the entire research cycle. An essential aspect of the hub is the abstraction of workflows using the Common Workflow Language (CWL), which is widely used across scientific fields, so that researchers can execute workflows on any platform and dependencies on a single platform are avoided. The intended annotation of workflows with metadata, as well as the quality measures for workflows that are to be developed, act as additional abstraction mechanisms. Currently, GALAXY is a mature and widely used workflow platform on which the DataPLANT Hub and other measures of the consortium can be based at an early stage. Nevertheless, a sole dependency on GALAXY is actively avoided through abstraction and an open design of the DataPLANT Hub.

DataPLANT aims to mirror the fundamental plant research community across all status groups in research and provider institutions. In order to ensure diversity, DataPLANT will implement the following multi-track strategy: 1) Continuous recruitment of new participants and active recruitment of suitable individuals, with a special focus on diversity, from the growing plant community and beyond. 2) An exemplary role in the design of the initial governance: the governance will be broadly staffed with members from the group of participants with respect to diversity, especially regarding career level, reputation and gender balance. We are also able to involve scientifically very experienced and proven individuals, and DataPLANT will benefit from their experience with clusters of excellence. 3) Development of promotion and qualification opportunities (especially in the context of the training of Data Stewards) that allow participants at early career stages to become involved in DataPLANT early on and to take on responsibility. 4) Active use of the explicit diversity measures of the participating institutions.

Like the review panel, we consider the operating and financing model of DataPLANT to be a major challenge for long-term sustainability. While the services are initially offered free of charge to participants through project financing, a permanent offering makes it necessary to weigh demand against the costs that arise for refinancing resources. Supporting individual researchers and research associations in applying for research data management (RDM) funding is a key aspect of DataPLANT. Our vision is that the mechanisms developed in DataPLANT will establish themselves sustainably in the community. DataPLANT brings its ideas and efforts to the overarching NFDI structure (e.g., via the cross-cutting topics) and is itself open to being integrated into the emerging frameworks of the NFDI and its approaches. We are in continuous exchange with other consortia on these topics. First approaches to solutions already result from the integration of the providers into collaborative national and international infrastructures.