A growing number of publishers and funding agencies require scientists to make their data available upon publication. Four foundational principles – Findability, Accessibility, Interoperability, and Reusability (FAIR) – support data producers and users by increasing the value gained from contemporary, formal scholarly digital publishing. Data literacy and data management are becoming basic skills for scientists.

The goal of the workshop is to debate how research and education, both heavily funded with public money, can reach their potential faster by being open (Open Access), transparent and largely conducted in the public domain (Open Science). Practical examples will be given, such as the “reproducibility crisis”, retracted papers that steer public opinion, and bugs in proprietary data analysis software that compromise results, together with an overview of current practical solutions to improve scientific practices (e.g. Project DEAL, FAIR data management, GOSH Roadmap). Open practices make scientific results and publications more reliable and reproducible.

Participants will collaborate on their own projects while acquiring the non-digital and digital knowledge necessary to meet new standards in data management and analysis reporting. The workshop is particularly suited to groups that aim to foster collaboration between their members: participants will explain their workflow and their data to their colleagues as part of the work.

We will conclude by discussing how the adoption of Open Science practices not only benefits the scientific community and society as a whole, but also facilitates and optimises individual workflows.

Research Data Management

Open Data is becoming a standard requested by funders, publishers and universities, and Research Data Management (RDM) has been recognised as a core competency for researchers. As a former scientist with extensive experience in RDM and open data, I have been developing workshops that teach RDM with a practical focus. I would be happy to discuss the possibility of offering workshops through your graduate school.

I am proposing a constructivist approach: short theoretical introductions, examples worked through by the whole group, and practical work in small groups on the management of personal research data. Students will work as data specialists for their own project and as outsiders for the other projects. In addition to learning about RDM (data formats, tidy spreadsheets, metadata, data organisation, backup and storage, data sharing, data citation), they will have to make their data and projects understandable to the other members of their group, which helps them experience the importance of data documentation and metadata. The workshop is aimed at researchers working with long-tail data, independently of their research focus.

As good RDM saves time in the long run, it is most effective for researchers to attend such a workshop early in their career. I hope we can help the next generation of researchers to produce better, shareable datasets.

Course content

Data management in a Reproducible Research Workflow (RRW)

From experimental design to publication
The art of the spreadsheet: csv rather than xlsx, tidy data, interoperability, machine and human readability
Metadata: experiment-wide and sample-wide, content, timing
Data inventory, folder organisation, file names, backup
Open & FAIR data: repositories, licences, FAIR principles

Reproducibility and data analysis

Version control & helper tools (Git, RStudio, GitHub)
How to combine data from different sources
Data modification and analysis documentation with Excel, R and RStudio
Make your analysis human readable – code commenting: conventions and examples, dplyr package

Methodology

Our courses are geared towards adult learning and use participatory approaches. The trainer encourages participants to add their experience and knowledge to the course content. Topics covered are backed by real examples and relate to the participants’ field of research.

Before the course, participants can submit specific questions and their own presentation examples by email. The course content will be adjusted to the specific needs and requirements of the participants.

Participants receive reading material to be discussed during the course, as well as a course summary of their achievements.

Course duration: 2 consecutive days (9am – 5pm)
Number of participants: 8-12
Trainer: Julien Colomb

Contact us for a custom proposal

The course can be aligned to your requirements regarding duration, form and content.

Reading Suggestions

Wilkinson, M. D. et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data. nature.com/scientificdata

datafairport.org

Access 2 Perspectives
