Potsdam, 28.-29. November 2016
S1: Improved data practices
It is generally accepted that our data practices, i.e. the ways we deal with data, are neither efficient nor cost-effective. Various initiatives, e.g. the Research Data Alliance, the G8 and FAIR groups, the W3C, and others, aim at developing general principles, recommendations and technologies to improve data handling, and thus to prepare common data practices for the coming challenges of increasing amounts and complexity of data. In particular, automatic procedures for dealing with flexibly defined collections will lead the way to future data handling. According to recent surveys, scientists spend more than 70% of their time retrieving data, finding appropriate ways to store them, finding proper data formats, etc. It is therefore time to search for substantially different solutions. Data Management Plans provide a first step to stimulate reflection among all involved parties.
For this session, we are asking for contributions which provide alternative routes towards an improved way of dealing with data. These routes should be of general relevance for a larger group of potential users, and thus should be capable either of being generalized towards best practices or of being rooted in existing ones.
Organiser: Peter Wittenburg
S2: Solutions between the poles of (data privacy) law and science
Combining data across disciplines progressively dissolves the border between data from the life and social sciences and technical/environmental data. As a consequence, legal aspects, in particular those regarding data privacy, data ownership and utilization, are gaining importance in an increasing number of research projects. Indeed, whenever personal or individual-related data are collected, processed, or archived, the principles of data privacy have to be followed. To be precise, compliance with legal requirements has to be guaranteed by both appropriate organizational and technical measures. Therefore, questions of anonymization and data security play a central role, as do the rights of affected individuals and restrictions on transferring data to third parties.
The session addresses people who either perform research with individual-related data or aim to combine such data sets, as well as data infrastructure providers. We are therefore calling for contributions illustrating legal principles of using research data in various disciplines and demonstrating practical consequences for the research process. Presentations of actual solutions or demonstrations of legal practices in data infrastructures are highly welcome.
Organiser: Claudia Oellers
S3: Training and Education
Data practices in academic research as well as in industry urgently need to be improved. Achieving this aim will not be possible without a new generation of data experts who need to be trained and educated. Recommendations for generally applicable principles are ready to be incorporated into schemes for training and education. This trend will be reinforced once several ongoing initiatives (W3C, OAI, RDA, FAIR, etc.) have systematically formulated further recommendations and specifications, which carry the potential to improve current practices and to be generally adopted. A particular question raised in this context is how to efficiently reach a wider community, i.e. how to educate a multitude of trainers capable of efficiently propagating knowledge. Although notions and definitions are not yet internationally settled, two topically distinct poles can already be described: on the one end, there is a need for data scientists with deeper knowledge of analytical methods (e.g. machine learning, stochastic methods). On the other end, data managers need to bring in expertise on PID systems, metadata schemes, data formats, format transformations, data repositories, etc. Currently, in most research laboratories data experts are expected to be competent in both areas.
We are therefore asking for contributions providing concrete suggestions on how to efficiently improve training and education of data experts.
Organisers: Heike Neuroth, Peter Wittenburg
S4: Data analysis
Centrally placed within the data life cycle, data analysis bridges the gap between data acquisition, exploitation and storage on the one end and archiving and accessibility on the other end. The general hope of extracting knowledge from a steadily increasing amount of data is based on methodologies for separating relevant from irrelevant information within the flood of data, for retrieving new aspects which were not yet visible at the time the data acquisition was planned, and for fusing various data sets in order to generate new perspectives. This raises technical issues, e.g. coping with the sheer amount of data, as well as new challenges in dealing with data, e.g. data privacy. The transformation of research towards data-intensive science must keep such issues from becoming fundamental restrictions. Within this session, we aim to discuss aspects of data analysis in large, distributed data sets of varying levels of trust and privacy.
We are therefore looking for contributions presenting new methodological approaches to generating knowledge out of large amounts of data, performing analysis of distributed data sets, or coping with the challenges of modern data privacy through the use of privacy-aware data mining techniques. Presentations of case studies are welcome, as are contributions introducing methodological approaches.
Organiser: Wolfgang zu Castell