Recommendations for the Implementation of Guidelines and Policies on Research Software Management at the Helmholtz Centers
Position paper by the Task Group Research Software of the Helmholtz Association Open Science Working Group and adopted by the Open Science Working Group on November 21, 2019.
- Incentives and Metrics
- Software Development and Documentation Practice
- Strategies for the Making Available, Publication, and Transfer of Research Software
- Quality Assurance
- Licensing and Other Legal Issues
- Training and Continuing Professional Development
- Guidelines and Policies
As the digitalization of research and teaching progresses, the number of scientific software solutions used to process research data at research institutions is increasing. These software solutions are now indispensable in the process of knowledge production and for the transparency of the results.
In this context, the term “research software” refers in particular to program code (source code together with accompanying documentation, parameters, and workflows) that is developed and/or used in the context of a science-related activity. In what follows, source code will not be viewed in isolation. Rather, the focus will be on the entire life cycle of software projects, as the use of commercial software and existing open source software can also be of importance for scientific work.
The verifiability and reproducibility of scientific results called for under the heading “open science” can be ensured in many fields only if, in addition to the research data, the program code is also made openly accessible according to defined criteria. Program code is a key and stand-alone product of scientific work, and must be documented, published, and recognized in a similar way to other scientific methods. Together with open access to publications (open access) and open access to research data (open research data), access to and reuse of research software (open research software) are thus an essential element of open science.1
Research software at research institutions is therefore increasingly gaining importance and attention. At the same time, there is a lack of standards, guidelines and policies, best practices, and support mechanisms relating to the development, publication, and maintenance of research software. Researchers need support in managing research software across the entire software life cycle. One important aspect in this regard is the transparency of developments, modifications, and adaptations. Against this background, research software management should be organized in an open and transparent way.
In March 2017, the Helmholtz Association Open Science Working Group adopted the position paper “Access to and Reuse of Research Software.”2 In what follows, these positions will be addressed, supplemented, and given concrete form through practical recommendations for action for the Helmholtz Centers.3
This paper aims to create defined processes and to demonstrate options for action. However, it is also noted that the baseline situation varies depending on the Center and the research area. Each Center must therefore assess the importance of a recommendation, whether it can be implemented, and how it can be implemented. Some of the fields of action mentioned below lend themselves to cooperation between the Centers.
Incentives and Metrics
The intellectual activity performed when developing program code, and the contribution that this activity makes to a research result, should be recognized and appreciated. The sustainable and open management of research software, the efficient deployment of resources, and the verifiability of research results can be guaranteed only if incentives exist for the practices that realize these goals.
In order to measure and recognize the work performed, it is recommended that the Helmholtz Centers create incentives and introduce adequate and transparent metrics for research software.
To recognize the scientific achievement and to anchor the appreciation of research software in the science system, the release of program code in the form of a publication should be promoted, as is also standard practice for scientific texts and research data. In dialogue with scholarly publishers, efforts should be made to require that the software used be referenced in articles. In this way, it can be ensured that the software developed or used becomes transparent for third parties and can be taken into account in scientometric analyses. To document the research output of a Center and its employees more comprehensively than has previously been the case, the Centers should, in addition to textual and data publications, also record software publications as a stand-alone publication type in publication databases, so that this type of publication can be taken into account in evaluations in the same way as other publication types.4
The development and use of program code, and its influence on the research work of third parties, should be actively traced via adequate mechanisms, taking into account the publication culture of the respective research area. Examples of metrics that allow conclusions to be drawn about the number and type of uses are citations of software in publications, download statistics, accesses to web-based software offerings, the quantity of feedback from users, the use of software repositories via forks and pull requests, and the integration of the program code into the software of third parties. Other suitable metrics could also include the number of mentions in project proposals, successfully acquired research funds, and revenue from commercialization activities. Aspects such as unique features, market leadership, and awards can also be included.
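As an illustration only, the usage indicators named above could be recorded per software package and totaled for a Center. The following Python sketch is a minimal example under stated assumptions: the class name, the field names, and the choice of a plain unweighted sum are hypothetical and are not prescribed by this paper. In practice, any weighting would have to reflect the publication culture of the respective research area.

```python
from dataclasses import dataclass


@dataclass
class UsageMetrics:
    """Usage indicators for one software package (hypothetical field names)."""
    citations: int = 0       # citations of the software in publications
    downloads: int = 0       # download statistics
    forks: int = 0           # forks of the software repository
    pull_requests: int = 0   # pull requests submitted by third parties
    dependents: int = 0      # third-party software that integrates the code


def summarize(records: dict[str, UsageMetrics]) -> dict[str, int]:
    """Sum each indicator across all recorded packages (unweighted, by assumption)."""
    totals = {name: 0 for name in UsageMetrics.__dataclass_fields__}
    for metrics in records.values():
        for name in totals:
            totals[name] += getattr(metrics, name)
    return totals
```

The indicators are deliberately kept separate rather than merged into a single score, so that each one can still be interpreted on its own.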
In addition to evaluations, the Helmholtz Centers can introduce measures to enhance the visibility and recognition of research software development, for example, in the form of prizes or awards. Moreover, information and advice services, training and continuing professional development (CPD) offerings, as well as corresponding career paths, provide incentives for employees to develop and maintain research software because of the importance attributed to it.5
Software Development and Documentation Practice
Sustainable software development goes hand in hand with good documentation practice. Both measures should be facilitated via appropriate infrastructures, imparted through suitable CPD offerings, and required by policies.
Analogous to lab journals for experimental work, general fundamental requirements for this can be defined in the rules of good scientific practice.6
Introductory events for new employees can anchor these standards directly in the Centers. At these events, it should be expressly stated that, in order to make scientific work transparent, both modifications to program code (through version management) and the performance of individual analyses (lab journal) must be clearly documented. Moreover, the communication of these minimum standards can be used to draw attention to appropriate CPD measures7 and further recommendations. These further recommendations for development and documentation practice should provide concrete references to methods, software, and platforms that can be used for:
- software planning (use cases, class diagrams, UML, models, requirements documents, design documents, etc.)
- software development (versioning, issue tracking, code review, style guides, etc.)
- software testing (unit tests, integration tests, debugging, etc.)
- software distribution (clear versioning, simple installation, etc.)
- software documentation (code documentation, API documentation, user documentation, etc.)
Recommendations of this kind are not specific to Helmholtz and can therefore also be compiled and supplemented publicly and collaboratively as online recommendations.8
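To give one concrete instance of the testing practices listed above, the following Python sketch shows what a unit test can look like when written with the standard-library `unittest` module. The function `mean` and the test cases are hypothetical examples chosen for illustration; they are not taken from any Helmholtz software or guideline.

```python
import unittest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


class MeanTest(unittest.TestCase):
    """Unit tests that document and check the expected behavior of mean()."""

    def test_typical_values(self):
        self.assertAlmostEqual(mean([1.0, 2.0, 3.0]), 2.0)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            mean([])
```

Tests of this kind can be run with `python -m unittest` and can also be executed automatically on a continuous integration platform whenever the program code changes.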
A public software repository managed by a player at the level of the Helmholtz Association would also be conceivable. In this context, a needs assessment on software and platform solutions could also be conducted.9
Strategies for the Making Available, Publication, and Transfer of Research Software
Alongside open access to textual publications and research data, open access to research software is a central building block of open science. Open access plays a vital role in the sustainability of research processes and promotes the enhanced perception of software development as a fundamental element of research. Only through the interplay of open access research software with textual publications and related data can the transparency of scientific results be ensured.
It is therefore recommended that the Helmholtz Centers establish a defined process that supports making research software available, providing open access to it, and transferring it. Here, a distinction must be made not only between the publication and provision of software in the scientific context and a possible transfer of the software for commercial use, but also between small applications for individual analyses, extensive software libraries, and complex software systems.10
Where not precluded by commercial exploitation options or legal concerns, the aim of the Helmholtz Centers is to publish research software in whole or in part for free use in a trustworthy infrastructure. When developing the code, all exploitation options (open source, commercial use ...) should where possible be taken into account from the very beginning, so that all options are kept open for the future. Under certain conditions, appropriate embargo periods (e.g., binding regulations for publications or final theses or dissertations) and the preservation of competitive advantages are to be taken into consideration.
Collaborative access should be facilitated within the Center in order to promote synergies11 and interdisciplinary exchange. When deciding on the terms of access for third parties, account must be taken of legal parameters and the aims associated with providing open access to and transferring research software.12 In this context, it is important that the software be available in the long term and that each version be citable.13
The provision of Center-specific and cross-Center platforms for making software available in open access should be considered.14 Employees should be enabled to publish software using Digital Object Identifiers (DOIs) and other persistent identifiers (PIDs).15
The various players at the Centers – management, the legal department, technology transfer, the library, data center/chief information officer (CIO), and public relations – must be involved in the process. The process should be communicated in an understandable, transparent, open, and proactive way in order to provide guidance to employees who develop software and to inform them about possibilities of making research software available. Thus, when planning research projects, the making available, publication, and possible transfer of research software can be taken into account at an early stage, for example, within the framework of software management plans.16 This practice also enables better consideration and planning of the necessary resources.
If software is made available for reuse, legal parameters must be clarified.17 It must be defined how official releases and version statuses are to be handled and how they are to be provided. Moreover, statements must be made about the scope of cost-free and fee-based services18 and the operation of infrastructures, and the resources needed to achieve this must be determined.
Infrastructures
Suitable infrastructures can support employees in developing software, making it available, and preserving it in the long term. It is therefore recommended that the Helmholtz Centers provide platforms for collaborative software development and for making software permanently available. In this way, it can be ensured that the Centers retain autonomy over software developments and do not become dependent on commercial service providers. Care must be taken to ensure that the selected tools support agile software development, which is practiced especially in science.
Depending on needs and resources, these infrastructures can be administered by individual working groups, key facilities at the Helmholtz Centers, or cooperatively for the Helmholtz Association as a whole.
If such platforms are open access and usable free of charge, consideration should be given to partner or sponsor status in order to sustainably support open infrastructures and to illustrate the Helmholtz Association’s commitment to open science.
Moreover, research software that is complex and essential for a scientific community should be understood as infrastructure that is developed, maintained, and operated over many years. There are high demands on reliability, quality, and documentation, as well as on training and community building. Accordingly, the Helmholtz Association and its Centers must preserve research software on a long-term basis.
Recommended measures and framework conditions for the creation and introduction of suitable infrastructures and processes are:
- Supplementation of the publication policy and the publication database for recording and releasing software publications with reference to suitable storage infrastructures19
- Development of discipline-specific competence centers (core facilities) to support scientists in designing, implementing and optimizing software
- Provision of software development platforms that can be used for the training, the testing, and the start-up phase of new software projects20
- Provision of a preferably diverse and powerful hardware testing platform for the continuous integration of the developed versions21
- Archiving of the published releases in repositories and assignment of a Digital Object Identifier (DOI) or another persistent identifier (PID)
For some of these measures, it will be appropriate to develop a cross-Center solution for the Helmholtz Association in the medium term.
When providing infrastructures of their own, Centers should make sure that they have a sustainable human resources policy that ensures that they retain the know-how of employees for operating the infrastructures and supporting researchers in the long term.
Quality Assurance
The advancement of knowledge in science depends increasingly on research software. Thus, software has a significant influence on the quality of the research results achieved and on their reproducibility and verifiability. It is therefore recommended that the Helmholtz Centers define and introduce adequate quality standards for developing research software and making it available.
Best practices of software engineering play a decisive role in compliance with software standards. When developing software, the application of tried-and-tested methods according to defined rules serves as a guide for employees in order to create software with a high standard of quality that enables transparency, reuse, and further development. Minimum standards for the development of research software should therefore be developed, used, and reviewed.
By using tools and infrastructures that stipulate processes and support staff in developing software, work processes in software development can be standardized and thus improved. The Helmholtz Centers should provide a corresponding range of tools and infrastructures and ensure that they are used.22
With the help of checklists, employees who develop software can independently check whether the program code developed complies with the quality requirements set. Depending on the type of research software, different compliance levels can be distinguished. Quality assurance can thus be conducted by means of defined quality criteria for different levels of quality.23
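A checklist with graded compliance levels can be sketched in a few lines of Python. The level names and the individual criteria below are hypothetical placeholders chosen for illustration; this paper does not define them, and each Center would have to set its own.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical compliance levels; each level includes all lower ones."""
    BASIC = 1
    REUSABLE = 2
    INFRASTRUCTURE = 3


# Hypothetical checklist: each criterion is required from the given level upward.
CHECKLIST = {
    "code is under version control": Level.BASIC,
    "a license file is present": Level.BASIC,
    "user documentation exists": Level.REUSABLE,
    "automated tests exist": Level.REUSABLE,
    "releases are archived with a persistent identifier": Level.INFRASTRUCTURE,
}


def unmet_criteria(fulfilled: set[str], target: Level) -> list[str]:
    """Return the criteria required at the target level that are not yet fulfilled."""
    return [
        criterion
        for criterion, required_from in CHECKLIST.items()
        if required_from <= target and criterion not in fulfilled
    ]
```

An employee could call `unmet_criteria` with the set of criteria already fulfilled and the level aimed for; an empty result means that the software satisfies the checklist at that level.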
In addition, the Helmholtz Centers should create suitable infrastructures to establish processes for mutual support and reciprocal review as part of scientific work in interdisciplinary teams. The implementation of suitable review procedures should be promoted in order to review research software both from a scientific and from a technological perspective. In this way, it can be ensured that the research results achieved by means of the software are scientifically correct, and that third parties can use the software and understand and reproduce the results. Reviews should continuously accompany the development of software.
Review procedures when publishing research software are of particular importance.24 Procedures that lead to a cross-Center “software seal of approval” for published research software would be desirable.
The publication of research software can also contribute to quality assurance and sustainability.25 Standards for referencing and analyses of the relevance of a specific software should be observed and the corresponding metrics used.26 This also requires a corresponding quality of metadata. In addition to these technological and organizational measures, particular attention should be paid to the training and continuing professional development of employees.27
Licensing and Other Legal Issues
Because most software is protected by copyright and may also have commercially exploitable potential, it is important that, when making software available, a conscious decision be made regarding the type of use and thus the type of licensing. A distinction can be made between licensing with the aim of generating revenue through licensing fees by using traditional licenses, on the one hand, and making software more widely available by using open source licenses, on the other. By using an open source license, software is made available for reuse by third parties without a license fee (although even under such licenses, revenue may be generated, for example through support services or additional functions). The use of licenses that are open, established, and, for example, recognized by the Open Source Initiative (OSI) is recommended. In particular, the advantages and disadvantages of copyleft versus non-copyleft licenses must be weighed up.
It is recommended that the Helmholtz Centers establish processes that ensure that copyrights, exploitation possibilities, and the choice of a suitable license type are clarified for each piece of software.
If it is intended to make the software available to the public, this should take place by means of standardized and established open source licenses, so that third parties are enabled to reuse the software in a legally secure way.
With recommendations, best practices, and clearly defined processes, the information infrastructures and the administration at the Helmholtz Centers should support researchers in licensing software. Here, anchoring the topic in the publication policies of the respective Centers can give it binding force.28 The Centers should create an information and advice service that specializes in software in science and, as required, enable the consultation of experts.29
Training and Continuing Professional Development
The quality of research software depends crucially on the knowledge and skills of the employees who develop it. It is therefore recommended that the Helmholtz Centers address training and continuing professional development (CPD) procedures in dialogue with higher education institution partners, and consciously establish training and career paths in the area of research software development.30
In addition to anchoring programming skills as early as possible in the training of specialist scientists, possibilities for training IT specialists for application development, and also dual/part-time study programs in computer science in cooperation with universities of applied sciences, should be evaluated. Moreover, Bachelor and Master theses and doctoral dissertations in which the development of research software plays a role can be produced in cooperation between the respective specialist scientific fields and computer science.
In the interests of sustainable human resources development, the strengthening of internal and external training and CPD measures should be supplemented with targeted ways of taking on excellent candidates after graduation.
In particular, the Helmholtz Graduate Schools, and also the Helmholtz Data Science Academy, can play a key role, for example, by offering courses31 in software development.32 In addition to internal courses at the Centers, which can be sustainably established by training multipliers, the flexible recognition of external offerings for doctoral candidates should be enabled.
In the case of cooperation between employees from specialist scientific fields and from computer science, their respective communication skills and their understanding of each other’s mindsets should be strengthened through suitable measures.
In practice, mutual cooperation between specialist scientific fields and computer science also leads to flexible shifts between the proportions of work done in software development and in the respective specialist scientific fields – a process that the Helmholtz Centers should support through targeted CPD offerings. To that end, software development courses at the above-mentioned Helmholtz Graduate Schools could be opened up to postdocs and other employees. Conversely, ways of enabling computer scientists and other technical staff to attend specialist science courses should also be considered.
In addition, as employees who develop software are often spread across different work sites, the Helmholtz Centers should support the networking of these employees within and across the Centers and ensure that CPD offerings are coordinated. Here, new learning formats and methods can be promoted.33 Networks at the level of the Helmholtz research areas or the Helmholtz Association are also conceivable, for example, within the framework of the Helmholtz Academy.
Moreover, sustainability in software development should be promoted by the Centers through clear and long-term career prospects for developers. Here, the funding of permanent positions in different working groups is just as conceivable as dedicated or central teams of developers that are deployed flexibly, across Centers, and as required. These teams can work as a kind of “service platform” across working groups or even across Centers.34 In any case, the career paths of the individual developers should also be considered beyond the work at the Center (e.g., the opportunity to take software projects with them to new jobs,35 and the recognition of high-quality and sustainably usable research software as a scientific contribution).36
Guidelines and Policies
Guidelines and policies are the basis for the coordinated and organized management of research software. By taking into account basic procedures for managing software, providing guidance, and designating contact persons, they support and relieve persons who deal with research software. Corresponding guidelines and policies should cover the entire life cycle of research software and include, in particular, statements on the following areas: Incentives and Metrics; Software Development and Documentation Practice; Quality Assurance; Licensing and other Legal Aspects; Strategies for the Making Available, Publication, and Transfer of Research Software; Infrastructures and Archiving; Training and Continuing Professional Development.
The anchoring of guidelines and policies provides a reliable basis for the further development of joint processes and infrastructures, as well as for the organized management of research software in the digital age. The implementation should be supported by concrete templates and clear contact persons and should be proactively spread to all relevant areas of the Centers.
In addition to the formulation of guidelines and policies, it is recommended that the Centers develop concrete models for software management plans. These plans can support employees in planning and carrying out tasks in software development by citing options for managing research software and by creating a template for responsible software management. Particularly in the case of large collaborations, software management plans also help to create a reliable consensus on the management of the software.
It is therefore recommended that the Helmholtz Centers establish panels of experts on the topic of research software with all relevant players from research, information infrastructure (libraries, data centers), knowledge and technology transfer, and legal departments. When doing so, all available competencies for the formulation and application of and compliance with guidelines and policies should be pooled. In particular, existing best practices and policies – for example, rules of good scientific practice, publication policies, and guidelines on technology transfer – must be taken into account, and possible areas of tension openly discussed. Only a coordinated and inter-referenced policy landscape enables the diverse challenges of the digital transformation in science to be met and the continuous adaptation of policies to be guaranteed. To successfully implement the guidelines and policies, in addition to providing infrastructures and templates, clear contact persons should be designated to provide support on questions and issues regarding the guidelines and policies and to pass on know-how.
1 See the self-conception of the Helmholtz Association Open Science Working Group at: https://os.helmholtz.de/open-science-in-der-helmholtz-gemeinschaft/akteure-und-ihre-rollen/arbeitskreis-open-science/selbstverstaendnis-des-arbeitskreises-open-science-der-helmholtz-gemeinschaft/
2 See: https://os.helmholtz.de/open-science-in-der-helmholtz-gemeinschaft/akteure-und-ihre-rollen/arbeitskreis-open-science/zugang-zu-und-nachnutzung-von-wissenschaftlicher-software/
In addition, a workshop report of the Research Software Task Group of the Open Science Working Group provides a comprehensive insight into the topic: Scheliga, K. S.; Pampel, H.; Bernstein, E.; Bruch, C.; zu Castell, W.; Diesmann, M.; Fritzsch, B.; Fuhrmann, J.; Haas, H.; Hammitzsch, M.; Lähnemann, D.; McHardy, A.; Konrad, U.; Scharnberg, G.; Schreiber, A.; Steglich, D. (2017): Helmholtz Open Science Workshop „Zugang zu und Nachnutzung von wissenschaftlicher Software“ #hgfos16. Report. Potsdam: Deutsches GeoForschungsZentrum GFZ. https://doi.org/10.2312/lis.17.01
3 These recommendations follow the “Recommendations for Policies of the Helmholtz Centers on Research Data Management” (https://doi.org/10.48440/os.helmholtz.036) formulated by the Open Science Working Group.
4 See also the section “Strategies for the Making Available, Publication, and Transfer of Research Software.”
5 See also the section “Training and Continuing Professional Development.”
6 See also the section “Guidelines and Policies.”
7 See also the section “Training and Continuing Professional Development.”
8 It would also be conceivable, for example, to handle this topic within the framework of the priority initiative “Digital Information” of the Alliance of Science Organizations in Germany.
9 See also the section “Infrastructures.”
10 See also the section “Licensing and Other Legal Issues.”
11 For example, by means of an efficient deployment of resources and the avoidance of redundant
12 See also the section “Licensing and Other Legal Issues.”
13 See also the section “Incentives and Metrics.”
14 See also the section “Infrastructures.”
15 Cross-referencing software on development platforms, in repositories, and journals using PIDs facilitates transparency. When doing so, the entire spectrum of software artifacts with program code (from source text to binaries), documentation, instructions, and metadata for indexing and searching can be taken into account. However, specific demands on quality assurance must be observed in each case. See also the section “Quality Assurance.”
16 See, for example, https://www.software.ac.uk/software-management-plans
17 See also the section “Licensing and Other Legal Issues.”
18 For example, user support, community building, and the support of developers.
19 See also the section “Licensing and Other Legal Issues.”
20 The platform should make development and project management tools available, and it should be possible to link it to external platforms (e.g., GitHub, SourceForge, or GNU Savannah). Possible development platforms include, for example, GitLab and Redmine.
21 Here, it is important that software projects are enabled to ensure the quality of the software by means of automated tests (unit tests and also tests to verify results). Development and test platforms can be made available as customized software containers for each project and preserved over a long period of time.
22 See also the section “Infrastructures.”
23 In this context, the concept of quality in relation to research software has many facets, for example, functionality, user friendliness, documentation, reliability, maintainability, security, portability, compatibility, and performance.
24 Here, the already mentioned checklists could be used. Cross-Center checklists that satisfy the common demands of the Centers would be desirable. See also the section “Strategies for the Making Available, Publication, and Transfer of Research Software.”
25 See also the section “Strategies for the Making Available, Publication, and Transfer of Research Software.”
26 See also the section “Incentives and Metrics.”
27 See also the section “Training and Continuing Professional Development.”
28 See also the section “Guidelines and Policies.”
29 See, for example, the consulting services at the German Aerospace Center (DLR).
30 For example, taking into account current developments on the topic of research software engineers (RSEs).
31 For example, best practices in software development as part of good scientific practice, programming languages, etc.
32 For example, in cooperation with higher education institution partners or other science-related providers and using open educational resources (OER).
33 For example, massive open online courses (MOOCs) and hackathons.
34 For example, the management of large software projects or complicated analysis pipelines.
35 See also the section “Licensing and Other Legal Issues.”
36 See also the section “Incentives and Metrics.”