Greater Expectations? Writing and supporting Data Management Plans

“A blueprint for what you’re going to do”

This series of videos was arranged before I joined the Research Data Service team, otherwise I’d no doubt have had plenty to say myself on a range of data-related topics! But the release today of this video – “How making a Data Management Plan can help you” – provides an opportunity to offer a few thoughts and reflections on the purpose and benefits of data management planning (DMP), along with the support that we offer here at Edinburgh.


“Win that funding”

We have started to hear anecdotal tales of projects being denied funding due – in part at least – to inadequate or inappropriate data management plans. While these stories remain relatively rare, the direction of travel is clear: we are moving towards greater expectations, more scrutiny, and ultimately the risk of sanctions for failing to manage and share data in line with funder policies and community standards. As Niamh Moore puts it, various stakeholders are paying “much more attention to data management”. From the researcher’s point of view this ‘new normal’ is a significant change, requiring a transition that we should not underestimate. The Research Data Service exists to help researchers adjust to these new expectations: normalising research data management (RDM), embedding it as a core scholarly norm and competency, developing skills and awareness, and building broader comfort zones.

“Put the time in…”

My colleague Robin Rice mentions the various types of data management planning support available to Edinburgh’s research community, citing the online self-directed MANTRA training module, our tailored version of the DCC’s DMPonline tool, and bespoke support from experienced staff. Each of these requires an investment of time. MANTRA requires the researcher to take time to work through it, and it took the team a considerable amount of time to produce, in order to give the researcher a concise yet wide-ranging grounding in the major constituent strands of RDM. DMPonline took hundreds, and probably thousands, of hours of developer time, plus input from a broad range of stakeholders, to reach its current levels of stability, maturity and esteem. This investment has resulted in a tool that makes the process of creating a data management plan much more straightforward for researchers. PhD student Lis is quick to note the direct support that she was able to draw upon from the Research Data Service staff at the University, citing quick response times, fluent communication, and ongoing support as the plan evolves and responds to change. Each of these is an example of spending time to save time: not quite Dusty Springfield’s “taking time to make time”, but not a million miles away.

There is a cost to all of this, of course, and we should be under no illusions: we are fortunate at the University of Edinburgh to be in a position to provide and make use of this level of tailored service. We are working towards a goal of RDM-related costs being stably funded to the greatest degree possible, through a combination of project funding and sustained core budget.

“You may not have thought of everything”

Plans are not set in stone. They can, and indeed should, be kept updated in order to reflect reality, and the Horizon 2020 guidelines state that DMPs should be updated “as the implementation of the project progresses and when significant changes occur”, e.g. new data; changes in consortium policies (e.g. new innovation potential, decision to file for a patent); changes in consortium composition and external factors (such as new consortium members joining or old members leaving).

Essentially, data management planning provides a framework for thinking things through (Niamh uses the term “a series of prompts”, and Lis “a structure”). As Robin says, you won’t necessarily think of everything beforehand – a plan is a living document which will change over time – but the important thing is to document and explain the decisions that are taken so that others (and your future self is among these others!) can understand your work. A good approach that I’ve seen first-hand while reviewing DMPs for the European Commission is to leave place markers to identify deferred decisions, so that these details are not forgotten about. (This is also a good reason for using a template – an empty heading means an issue that has not yet been addressed, whereas it’s deceptively easy to read free-text DMPs and get the sense that everything is in good shape, only to find on more rigorous inspection that important information is missing, or that some responses are ambiguous.)
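For anyone who keeps their plan as a plain text file, the place-marker idea can be checked mechanically. What follows is only a rough sketch under my own assumptions: that headings are lines starting with "## ", and that deferred decisions are flagged with the marker "TBD". Neither convention comes from DMPonline or from any funder template, and the function name is made up for illustration.

```python
def unaddressed_sections(plan_text, marker="TBD"):
    """Return headings whose section is empty or still carries the marker.

    Assumes (hypothetically) that headings are lines starting with "## "
    and that deferred decisions are flagged with `marker`.
    """
    sections = {}
    current = None
    for line in plan_text.splitlines():
        if line.startswith("## "):
            # Start collecting body lines for a new heading.
            current = line[3:].strip()
            sections[current] = []
        elif current is not None and line.strip():
            # Non-blank line belonging to the current section.
            sections[current].append(line.strip())
    # A section needs attention if it has no body or mentions the marker.
    return [heading for heading, body in sections.items()
            if not body or any(marker in text for text in body)]


example_plan = """## Data collection
Interview audio and transcripts.
## Storage

## Sharing
TBD: choice of repository
"""
print(unaddressed_sections(example_plan))
```

Running this against a plan before each review meeting gives a short list of the headings that still need a decision, which is exactly the job that an empty template heading does on paper.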

“Cutting and pasting”

It has often been said that plans are less important than the process of planning, and I’ve been historically resistant to sharing plans for “benchmarking”, which is often just another word for copying. However, Robin is right to point out that there are some circumstances where copying and pasting boilerplate text makes sense, for example when referring to standard processes or services, where it makes no sense – and indeed can in some cases be unnecessarily risky – to duplicate effort or reinvent the wheel. That said, I would still generally urge researchers to resist the temptation to do too much benchmarking. By all means use standards and cite norms, but also think things through for yourself (and in conjunction with your colleagues, project partners, support staff and other stakeholders) – and take time to communicate with your contemporaries and the future via your data management plan… or record?

“The structure and everything”

Because data management plans are increasingly seen as part of the broader scholarly record, it’s worth concluding with some thoughts on how all of this hangs together. Just as Open Science depends on a variety of Open Things, including publications, data and code, the documentation that enables us to understand it also has multiple strands. Robin talks about the relationship between data management and consent, and as a reviewer it is certainly reassuring to see sample consent agreement forms when assessing data management plans, but other plans and records are also relevant, such as Data Protection Impact Assessments, Software Management Plans and other outputs management processes and products. Ultimately the ideal (and perhaps idealistic) picture is of an interlinked, robust, holistic and transparent record documenting and evidencing all aspects of the research process, explaining rights and supporting re-use, all in the overall service of long-lasting, demonstrably rigorous, highest-quality scholarship.

Martin Donnelly
Research Data Support Manager
Library and University Collections
University of Edinburgh


Research Data Service use cases – videos and more

Earlier this year, the Research Data Service team set out to interview some of our users to learn about how they manage their data, the challenges they face, and what they’d like to see from our service. We engaged a PhD student, Clarissa, who successfully carried out this survey and compiled use cases from the responses. We also engaged the University of Edinburgh Communications team to film and edit some of the user interviews in order to produce educational and promotional videos. We are now delighted to launch the first of these videos here.


In this case study video, Dr Bert Remijsen speaks about his successful experience archiving and sharing his Linguistics research data through Edinburgh DataShare, and seeing people from all corners of the world making use of the data in “unforeseeable” ways.

Over the coming weeks we will release the written case studies for internal users, and we will also make the other videos available on Media Hopper and YouTube. These will address topics including data management planning, archiving and sharing data, adapting practices around personal data for GDPR compliance, and training in Research Data Management. Staff and users will talk about the guidance and solutions provided by the Research Data Service for openly sharing data – and conversely restricting access to sensitive data – as well as supporting researchers in producing meaningful and useful Data Management Plans.

The team is also continuing to analyse the valuable input from our participants, and we are working towards implementing some of the helpful ideas they have kindly contributed.


An internship in the Research Data Service: Towards tailored Research Data Support

For four weeks in July and August 2018, I did an internship with the Research Data Support (RDS) team of the University of Edinburgh’s Information Services (IS). My regular job is as a trainee librarian at Bern University Library in Switzerland. There, as in other parts of Europe, research data is an issue which is constantly gaining momentum, and libraries are among those at the forefront of the changing scene. IS has a very good reputation for its work in this field, and so, as a librarian-to-be, the internship in the RDS was an outstanding opportunity for me to gain first-hand insights and experience.

The project I was working on was about tailoring guidance for researchers writing their Data Management Plans (DMPs) with the tool DMPonline. As a basis for this, I had to gather information about the practices and needs of academic and support staff around research data management (RDM) and DMPs. I was to work with staff from all three colleges. (In fact, I found that my project had a good deal in common with Clarissa’s, who was just finishing her project when I joined the team.)

My first step was to get in touch with the school support staff, which was essential to get an overall impression of how RDM worked in each school, and to arrange my contacts with researchers. From this, along with information gathered from each school’s website, I created an interview questionnaire as well as an online survey. These served to capture researchers’ and support staff’s experience with RDM. For me, conducting interviews was a new and valuable experience. I gained confidence, and I was inspired by the staff’s willingness to share their experience with RDM. I think that interviewing is a very useful skill to develop, because finding out what school staff think and what they need is important in almost every sector of library work.

From the interviews and surveys, I also learnt a lot about researchers’ different practices and challenges in the context of research data management. I analysed the responses and documented my findings in reports for IS and school support staff. Unfortunately, my internship was too short for me to complete the tailored guidance part of the project, but I hope that my work will serve as a basis for the team’s endeavours to further adapt their DMP support.

Summing everything up, my internship was an inspiring experience which was at times intense but also hugely enriching. This was due in large part to the fantastic team, who were welcoming and supported me most effectively whenever needed (this is true, too, of my contacts in the schools). I would have loved to learn even more about their various experiences but, all the same, I am really grateful for the opportunity I have been given to participate in their work and to learn so much about RDM.

Gero Schreier
Research Data Service Project Assistant
Librarian in training, University Library, University of Bern (Switzerland)


Fostering open science in social science

On 10th of June, the Data Library team ran two workshops in association with the EU Horizon 2020 project, FOSTER (Facilitate Open Science Training for European Research), and the Scottish Graduate School of Social Science.

The aim of the morning workshop, “Good practice in data management & data sharing with social research,” was to provide new entrants into the Scottish Graduate School of Social Science with a grounding in research data management using our online interactive training resource MANTRA, which covers good practice in data management and issues associated with data sharing.

The morning started with a brief presentation by Robin Rice on ‘open science’ and its meaning for the social sciences. Pauline Ward then demonstrated the importance of data management plans to ensure work is safeguarded and that data sharing is made possible. I introduced MANTRA briefly, and then Laine Ruus assigned different MANTRA units to participants and asked them to briefly go through the units, extract one or two key messages, and report back to the rest of the group. After the coffee break we had another presentation on ethics, informed consent and the barriers to sharing, and we finished the morning session with a ‘Do’s and Don’ts’ exercise, where we asked participants to write on post-it notes the things they remembered and were taking with them from the workshop: green for things they should DO, and pink for those they should NOT. Here are some of the points the learners posted:

DO
– consider your usernames & passwords
– read the Data Protection Act
– check funder/institution regulations/policies
– obtain informed consent
– design a clear consent form
– give participants info about the research
– inform participants of how we will manage data
– confidentiality
– label your data with enough info to retrieve it in future
– develop a data management plan
– follow the certain policies when you re-use dataset[s] created by others
– have a clear data storage plan
– think about how & how long you will store your data
– store data in at least 3 places, in at least 2 separate locations
– backup!
– consider how/where you back up your data
– delete or archive old versions
– data preservation
– keep your data safe and secure with the help of facilities of fund bodies or university
– think about sharing
– consider sharing at all stages. Think about who will use my data next
– share data (responsibly)

DON’T
– unclear informed consent
– a sense of forcing participants to be part of research
– do not store sensitive information unless necessary
– don’t staple consent forms to de-identified data records/store them together
– take information security for granted
– assume all software will be able to handle your data
– don’t assume you will remember stuff. Document your data
– assume people understand
– disclose participants’ identity
– leave computer on
– share confidential data
– leave your laptop on the bus!
– leave your laptop on the train!
– leave your files on a train!
– don’t forget it is not just my data, it is public data
– forget to future proof

Robin Rice presenting at FOSTERing Open Science workshop

Our message was that open science will thrive when researchers:

  • organise and version their data files effectively,
  • provide comprehensive and sufficient documentation for others to understand and replicate results, and thus cite the source properly,
  • know how to store and transport their data safely and securely (ensuring backup and encryption),
  • understand legal and ethical requirements for managing data about human subjects,
  • recognise the importance of good research data management practice in their own context.
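
To make the backup point a little more concrete, here is a minimal sketch of one way to confirm that backup copies are still identical to the original file, by comparing SHA-256 checksums. It was not part of the workshop materials; the function names and file paths are hypothetical, and it uses only Python's standard hashlib module.

```python
import hashlib


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copies):
    """Map each copy's path to True if its checksum equals the original's.

    `original` and `copies` are hypothetical file paths supplied by the user.
    """
    expected = sha256_of(original)
    return {str(copy): sha256_of(copy) == expected for copy in copies}
```

Called as, say, `copies_match("survey.csv", ["/backup1/survey.csv", "/backup2/survey.csv"])`, this returns a dictionary showing which copies can still be trusted. It checks integrity only; it says nothing about encryption or about whether the copies sit in genuinely separate locations.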

The afternoon workshop on “Overcoming obstacles to sharing data about human subjects” built on one of the main themes introduced in the morning, with a large overlap of attendees. The ethical and regulatory issues in this area can appear daunting. However, data created from research with human subjects are valuable, and therefore are worth sharing for all the same reasons as other research data (impact, transparency, validation etc). So it was heartening to find ourselves working with a group of mostly new PhD students, keen to find ways to anonymise, aggregate, or otherwise transform their data appropriately to allow sharing.

Robin Rice introduced the Data Protection Act, as it relates to research with human subjects, and ethical considerations. Naturally, we directed our participants to MANTRA, which has detailed information on the ethical and practical issues, with specific modules on “Data protection, rights & access” and “Sharing, preservation & licensing”. Of course not all data are suitable for sharing, and there are risks to be considered.

In many cases, data can be anonymised effectively enough to allow sharing. Richard Welpton from the UK Data Archive shared practical information on anonymisation approaches and tools for ‘statistical disclosure control’, recommending sdcMicroGUI (a graphical interface for carrying out anonymisation techniques; it is an R package, but should require no knowledge of the R language).
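
As a small illustration of the kind of check that statistical disclosure control involves, the sketch below counts how many records share each combination of indirectly identifying values (often called quasi-identifiers) and flags those in groups smaller than a chosen threshold k. This is a generic k-anonymity style count written for this post, not a description of how sdcMicroGUI works; the field names and the function name are invented for the example.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k):
    """Return records whose quasi-identifier combination occurs fewer than k times.

    `records` is a list of dicts; `quasi_identifiers` names the (hypothetical)
    fields that could identify someone when combined, such as age band and region.
    """
    def key(record):
        return tuple(record[field] for field in quasi_identifiers)

    # Count how many records share each combination of values.
    counts = Counter(key(record) for record in records)
    return [record for record in records if counts[key(record)] < k]


# Entirely made-up example data.
sample = [
    {"age_band": "20-29", "region": "Lothian", "answer": "a"},
    {"age_band": "20-29", "region": "Lothian", "answer": "b"},
    {"age_band": "60-69", "region": "Fife", "answer": "c"},
]
for record in risky_records(sample, ["age_band", "region"], k=2):
    print(record)
```

Records flagged in this way are candidates for the transformations mentioned above, such as aggregating categories further or suppressing values, before the data are shared.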

Finally, Dr Niamh Moore from the University of Edinburgh shared her experiences of sharing qualitative data. She spoke about the need to respect the wishes of subjects, her research gathering oral history, and the enthusiasm of many of her human subjects to be named in her research outputs, in a sense to own their own story, their own words.


Rocio von Jungenfeld & Pauline Ward
EDINA and Data Library
