Conquaire: Continuous quality control for research data to ensure reproducibility


1 Project Information

Project title:
Conquaire (Continuous quality control for research data to ensure reproducibility)
Project start:
Project end:
Project Leader:
Prof. Dr. Philipp Cimiano
Data manager:
Cord Wiljes
Project members:
Vidya Ayer (AG Semantic Computing, CITEC), Christian Pietsch, Dr. Johanna Vompras (Bielefeld University Library)
External partners:
Biological Cybernetics – Prof. Dr. Volker Dürr
Central Lab Facilities – Prof. Dr. Sven Wachsmuth
Applied Computational Linguistics – Prof. Dr. David Schlangen
Neurobiology – Prof. Dr. Martin Egelhaaf
Neurocognition and Action – Biomechanics – Prof. Dr. Thomas Schack
Neurocognitive Psychology – Prof. Dr. Werner Schneider
Atmospheric and Physical Chemistry – Prof. Dr. Thomas Koop
Economic Theory and Computational Economics – Prof. Dr. Sander van der Hoog
Emergentist Semantics – Prof. Dr. Katharina Rohlfing
Project Goal:
Improve the reproducibility of data analysis in scientific research

2 Training

As an integral component of Conquaire, a series of workshops will be offered to train team members and project partners in all relevant aspects of data management (e.g. best practices, documentation, versioned storage).

3 Existing Data

The project operates on research data produced in the nine partner projects. Management of this data is described in individual DMPs for each project.

The Conquaire technical infrastructure will be based on Open Source software, e.g. Git, and software frameworks.

4 Data created during the project

Conquaire will create the following data and software:

  • recordings (mp3 files) and transcriptions (text documents) of interviews with the partners
  • software code
  • documentation (stored in a GitLab wiki and word processing documents)

Open formats will be used wherever possible.

About 200 GB of data is expected in total (not including the partner projects’ data).

Data created by the partner projects will be described in separate DMPs for these projects.

5 Data Organization

All data and software created by Conquaire will be managed in a central Git repository that provides version-controlled storage and gives all team members access.

6 Documentation

Data and software will be described with metadata, primarily following the DataCite metadata schema.
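
As an illustration only, a metadata record could be checked for completeness before publication. The sketch below is not part of the Conquaire infrastructure: the helper name, the dictionary keys (chosen to mirror the six mandatory DataCite properties) and all record values are assumptions made for this example.

```python
# Hypothetical helper: key names mirror the six mandatory DataCite
# properties; they are an assumption of this sketch, not a project format.
MANDATORY_PROPERTIES = (
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
)


def missing_properties(record):
    """Return the mandatory properties that are absent or empty in record."""
    return [name for name in MANDATORY_PROPERTIES if not record.get(name)]


# Placeholder values only; none of this is real project data.
example_record = {
    "identifier": "10.xxxx/placeholder",
    "creators": ["Placeholder, Author"],
    "titles": ["Placeholder title"],
    "publisher": "Placeholder publisher",
    "publicationYear": 2000,
    "resourceType": "Software",
}

incomplete_record = {"titles": ["Placeholder title"]}

print(missing_properties(example_record))
print(missing_properties(incomplete_record))
```

A check of this kind could run whenever a record is committed to the repository, so that incomplete descriptions are noticed early.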

7 Data Sharing and Rights

Data publication is required by the funding agency DFG. Data will be published on Bielefeld University’s institutional data repository PUB, which offers publication, archiving and citation services. The raw data is planned for release under an open license as soon as possible, i.e. while the project is still running. At the end of the project, all data and software will be published and made available under open licenses.

In addition, the software will be published on CITEC's institutional software repository CITK – Cognitive Interaction Toolkit.

There are no legal barriers that prevent data publication. Where information about the partner projects is implicitly affected, consent from the partners will first be obtained. Interview data will be shared with interested parties upon request, subject to the consent of the interviewed researchers.

8 Data Archiving

In accordance with DFG and Bielefeld University guidelines, the data will be stored for 10 years. Data will be archived on archival tape by Bielefeld University’s computing centre, following CITEC’s archiving policies. This covers primary and secondary data in all measured and calculated formats, as well as the software needed to acquire and compute the data. It applies to the data and software of Conquaire as well as to the data created in the partner projects.