About Robin Rice

Robin on Twitter: Sparrowbarley

Research Data Workshops: DataVault Summary

Having soft-launched the DataVault facility in early 2019, the Research Data Support team, with the support of the project board, held five workshops in different colleges and locations to find out what the user community thought about it. This post summarises what we learned from participants, who were made up roughly equally of researchers (mainly staff) and support professionals (mainly computing officers based in the Schools and Colleges).

Each workshop began with presentations and a demonstration by Research Data Service staff, explaining the rationale of the DataVault, what it should and should not be used for, how it works, how the University will handle long-term management of data assets deposited in the DataVault, and practicalities such as how to recover costs through grant proposals or get assistance to deposit.

After a networking lunch we held discussion groups, covering topics such as prioritisation of features and functionality, roles such as the university as data asset owner, and the nature of the costs (price).

The team was relieved to learn that the majority (albeit from a somewhat self-selecting sample) agreed that the service fulfilled a real need; some data does need to be kept securely for a named period to comply with research funders’ rules, and participants welcomed a centralised platform to do this. The levels of usability and functionality we have managed to reach so far were met with somewhat less approval: clearly the development team has more work to do, and we are glad to have won further funding from the Digital Research Services programme in 2019-2020 in order to do it.

Attitudes toward university ownership of data assets were also mixed; some participants were sceptical and wondered if researchers would participate in such a scheme, but others found it a realistic option for dealing with staff turnover and the inevitability of data outlasting data owners. Attitudes toward cost were largely accepting (the DataVault provides a cheaper alternative to our baseline DataStore disk storage), but concerns about the safekeeping of legacy and unfunded research data were raised at each workshop.

A sample of points raised follows:

  • Utility? “Everyone I know has everything on OneDrive.”
  • Regarding prioritisation of features – security first; file integrity first; depositing data from sources other than DataStore; facilitating larger deposit sizes; ease of use.
  • Quickness of deposit and retrieval? Quick deposit was deemed more important than quick retrieval.
  • University as data asset owner?
    • Under GDPR the data are already university assets (because the Uni is the data controller).
    • People who manage the data should be close to the research; IT people can manage users but shouldn’t be making decisions about data. Danger that because it’s related to IT it gets dumped on IT officers. The formal review process helps to ensure decisions will be made properly. Include flexibility into the review hierarchy to allow for variation in school infrastructure.
    • When I heard that I was – not shocked – but concerned. If I move to another university how do I get access? This might be a problem. Researchers might prefer to retain three copies themselves.
  • Is the cost recovery mechanism valid?
    • Vault costs are legitimate costs.
    • Ideally should come from grant overheads, until then need to charge.
    • Possible to charge for small/medium/large project at start rather than per TB?
  • Is the 100 GB threshold sufficient for unfunded research? How else could unfunded or legacy data be covered (who pays)?
    • Alumni sponsor a dataset scheme?
    • There will be people with a ‘whole bunch of data somewhere’ that would be more appropriately stored in DataVault.

The team is grateful to all of the workshop participants for their time and thoughts; the report will be considered further by the project board and the Research Data Service Steering Group members. The full set of workshop notes is colour-coded to show comments from different venues and is available to read on the RDM wiki, for anyone with a University log-in (EASE).


Robin Rice
Data Librarian and Head, Research Data Support
Library & University Collections

FAIR dues to the Research Data Alliance

It has been a while since we’ve blogged about the Research Data Alliance (RDA), and as an organisation it has come into its own since its beginnings in 2013. One can count on discovering the international state of the art in a range of data-related topics covered by its interest groups and working groups which meet at its plenary events, held every six months. That is why I attended the 13th RDA Plenary held in Philadelphia earlier this month and I was not disappointed.

I arrived Monday morning in time for the second day of a pre-conference sponsored by CODATA on FAIR and Responsible Research Data Management at Drexel University. FAIR is a popular concept amongst research funders for illustrating data management done right: by the time you complete your research project (or shortly after) your data should be Findable, Accessible, Interoperable and Reusable.

Fair enough, but we data repository providers also want to know how to build the ecosystems that will make it super-easy for researchers to make their data FAIR, so we need to talk to each other to compare notes and decide exactly what each letter means in practice.

Borrowed from OpenAire 

Amongst the highlights were some tools and resources for researchers or data providers mentioned by various speakers.

  • The Australian Research Data Commons (ARDC) has created a FAIR self-assessment tool.
  • For those who like stories, the Danish National Archives have created a FAIRytale to help understand the FAIR principles.
  • ARDC with Library Carpentry conducted a sprint that led to a disciplinary smorgasbord called Top Ten Data and Software Things.
  • DataCite offers a Repository Finder tool through its union with re3data.org to find the most appropriate repository in which to deposit your data.
  • Resources for “implementation networks” from the EU-funded project GO FAIR, including training materials under the rubric of GO TRAIN.
  • The geoscience-focused Enabling FAIR Data Project is signing up publishers and repositories to commitment statements, and has a user-friendly FAQ explaining why researchers should care and what they can do.
  • A brand new EU-funded project, FAIRsFAIR (Fostering FAIR Data Practice in Europe), is taking things to the next level, building new networks to certify learners and trainers, researchers and repositories in FAIRdom.

That last project’s ambitions are described in this blog post by Joy Davidson at DCC. Another good blog post I found about the FAIR pre-conference event is by Rebecca Springer at Ithaka S+R. If I get a chance I’ll add another brief post for the main conference.

Robin Rice
Data Librarian & Head of Research Data Support
Library & University Collections

We’re hiring!

Information Services has a new vacancy for a Data Safe Haven Operations Assistant to work directly with the Data Safe Haven Manager in the Research Data Support Team in providing operational support for the Data Safe Haven and its users across the University. This is an excellent opportunity for an enthusiastic researcher or professional to apply their academic and support skills to a growing service area, and to help build and raise awareness of our new Data Safe Haven.

You will have research experience and knowledge of current data protection regulations and other relevant legislation in the context of research. You will have an understanding of university structures and norms. You will know how to work methodically and transparently, following and documenting standard operating procedures. You will document and present the service for different audiences to ensure high levels of uptake and engagement with the service.

The Data Safe Haven Operations Assistant is a key role in the development and delivery of the new Data Safe Haven component of the Research Data Service, delivered by Library Research Support together with other sections of Information Services. The role allows the post-holder to contribute to defining the way the Data Safe Haven service will operate within the University, including achieving standards-based certifications.

This is a fixed-term full-time position for two years. Funded by the Digital Research Services programme, you will be part of a collaborative, engaging, and innovative working environment within Information Services. There are many advantages to working at the University. Benefits include flexible working, an excellent pension, career prospects and generous holiday provision.

Closing date: 15th February, 2019

Full details are available at: https://www.vacancies.ed.ac.uk/pls/corehrrecruit/erq_jobspec_version_4.jobspec?p_id=046776

On behalf of Cuna Ekmekcioglu
Data Safe Haven Manager

DataVault is now live

After extended development, the Research Data Service’s DataVault system is now operational, adding value to research data for principal investigators and their funders alike by offering a long-term retention solution for important datasets.

DataVault is a companion service to DataShare, the institutional digital repository for researchers to openly license and share datasets and related outputs via the Web. DataVault comprises an online interface connected to the university’s data centre infrastructure and cloud storage.

Each research project can store data in a single vault made up of any number of deposits. DataVault is currently able to accept individual deposits (groups of files) of up to 2 TB each; this will increase over time as project development continues.

DataVault sprint meeting before launch

Immutable

DataVault is designed for long-term retention of research data, to meet funder requirements and ensure future access to high value datasets. It meets digital preservation requirements by storing three copies in different locations (two on tape, one in the cloud) with integrity checking built-in, so that the data owner can retrieve their data with confidence until the end of the retention period (typically ten years).
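As an illustration only, the kind of integrity checking described above can be sketched in a few lines of Python. This is not DataVault's actual code: the choice of SHA-256, the function names `sha256_of` and `verify_copies`, and the idea of passing in a list of file paths are all assumptions made for the example. The sketch records one checksum at deposit time and later confirms that each stored copy still matches it.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(expected: str, copies: list[Path]) -> dict[str, bool]:
    """Check every copy against the checksum recorded at deposit time.

    A copy fails the check if it is missing or if its digest differs
    from the expected value (hypothetical helper, for illustration).
    """
    return {
        str(copy): copy.exists() and sha256_of(copy) == expected
        for copy in copies
    }
```

In this sketch, a deposit would be considered safe to retrieve from any copy whose entry is `True`; a `False` entry would flag a copy to be repaired from one of the others.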

Secure

The DataVault interface helps to guide users in how to deposit personal and sensitive data, using anonymisation or pseudonymisation techniques whenever possible, as prescribed by the University’s Data Protection Officer (DPO). Because all data are encrypted before deposit, they are protected from unauthorised disclosure. Only the data owner or their nominated delegate is allowed to retrieve data during the retention period. Any decisions about allowing access to others are made by the data owner, and that access is arranged outside the DataVault system, once the data have been retrieved onto a private area on DataStore and decrypted.

Discoverable

Although DataVault offers a form of closed archive, the design encourages good research data management practice by requiring a metadata record for each vault in Pure. These records are discoverable on the Web, and linked to the respective data creators, projects and publications.

In exchange for creating this high level public metadata record, the Principal Investigator benefits from the assignment of a unique digital object identifier (DOI) which can be used to cite the data in publications.

The open nature of the metadata means that any reader may make a request to access the dataset. The data owner decides who may have access and under what conditions. Advice can be provided by the Research Data Support team and the DPO.

University data assets

DataVault’s workflow takes into account the possibility/likelihood that the original data owner will have left the university when the period of retention comes to an end. Each vault will be reviewed by representatives of the university in schools, colleges or the Library, acting as the data owner, to make decisions on disposal or further retention and curation. If kept, the vault contents become university data assets.

Plan ahead for data archiving

The Research Data Support team encourages researchers to plan ahead for data archiving, right from the earliest conception stages of the project, so that appropriate costs are included in bids and the appropriate steps can be carried out to prepare data for either open or closed long-term archiving.

The team can be contacted through the IS Helpline and offers assistance with writing data management plans and making archival decisions. See our service website and contact information at https://www.ed.ac.uk/is/research-data-service or go straight to the DataVault page to learn more about it, get instructions for use, or look up charges. An introductory demo video is available at https://media.ed.ac.uk/media/Getting+started+with+the+DataVault/1_h4r4glf7.

Robin Rice
Data Librarian and Head, Research Data Support
Library & University Collections
