DataVault user roles let you share access to archived data

The Edinburgh DataVault is a secure long-term retention solution for research data.

Thanks to the hard work of our software developers in the Digital Library and EDINA, the Edinburgh DataVault now supports five different user roles. This means busy PIs can delegate the work of depositing and retrieving data to members of their team or other collaborators within the University. It also allows PIs to nominate support staff to deposit and retrieve data on their behalf, or grant access to new members of their team.

Diagram representing a PI and two postdocs using the roles of Owner and Nominated Data Manager to share access to data in the DataVault

There are five user roles:

  • Data Owner
    Usually the Principal Investigator. Can add other users to, or remove them from, their vault(s).
  • Nominated Data Manager (of a given vault)
    Can view and edit metadata fields, deposit data and retrieve any deposit in the vault. May add Depositors to, or remove them from, the vault.
  • Depositor (of a given vault)
    Can view the vault contents, deposit data and retrieve any deposit in the vault.
  • School Support Officer
    Acting on behalf of the Head of School, may view all vaults and associated deposits belonging to the School.
  • School Data Manager
    Assigned only with the express permission of the Head of School, may view, deposit into and retrieve data from any vault belonging to the School.
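
As a rough illustration only, the role summary above could be modelled as a small permission table. This is a sketch, not the DataVault's actual implementation: the names `Permission`, `ROLE_PERMISSIONS` and `can` are hypothetical, and the Data Owner entry assumes the owner also holds every permission of a Nominated Data Manager, which the summary does not state explicitly.

```python
from enum import Flag, auto


class Permission(Flag):
    """Hypothetical permission flags derived from the role summary."""
    VIEW = auto()
    EDIT_METADATA = auto()
    DEPOSIT = auto()
    RETRIEVE = auto()
    MANAGE_DEPOSITORS = auto()
    MANAGE_USERS = auto()


# Permissions held by a Nominated Data Manager, as listed in the summary.
_NDM = (
    Permission.VIEW
    | Permission.EDIT_METADATA
    | Permission.DEPOSIT
    | Permission.RETRIEVE
    | Permission.MANAGE_DEPOSITORS
)

ROLE_PERMISSIONS = {
    # ASSUMPTION: the summary only says the owner can add/remove users;
    # here the owner is also given everything a Nominated Data Manager has.
    "Data Owner": _NDM | Permission.MANAGE_USERS,
    "Nominated Data Manager": _NDM,
    "Depositor": Permission.VIEW | Permission.DEPOSIT | Permission.RETRIEVE,
    # School-level roles apply to every vault belonging to the School.
    "School Support Officer": Permission.VIEW,
    "School Data Manager": (
        Permission.VIEW | Permission.DEPOSIT | Permission.RETRIEVE
    ),
}


def can(role: str, permission: Permission) -> bool:
    """Return True if the given role holds the given permission."""
    return permission in ROLE_PERMISSIONS[role]
```

For example, `can("Depositor", Permission.RETRIEVE)` is true in this sketch, while `can("School Support Officer", Permission.DEPOSIT)` is false, matching the view-only description of that role.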

Full details of the permissions associated with each role:
Roles and permissions

Support staff who need to view reporting data for their School, or admin access to their School’s vaults, should attend our training – Edinburgh DataVault: supporting users archiving their research data.

Further information on why and how to use the DataVault is available on the Research Data Service website:
DataVault long-term retention

If you have any questions about using DataVault please don’t hesitate to contact the Research Data Support team at data-support@ed.ac.uk.

Pauline Ward, Research Data Support Assistant
Library and University Collections
@PaulineData


Research Data Workshops: DataVault Summary

Having soft-launched the DataVault facility in early 2019, the Research Data Support team, with the support of the project board, held five workshops in different colleges and locations to find out what the user community thought about it. This post summarises what we learned from participants, who were made up roughly equally of researchers (mainly staff) and support professionals (mainly computing officers based in the Schools and Colleges).

Each workshop began with presentations and a demonstration by Research Data Service staff, explaining the rationale of the DataVault, what it should and should not be used for, how it works, how the University will handle long-term management of data assets deposited in the DataVault, and practicalities such as how to recover costs through grant proposals or get assistance to deposit.

After a networking lunch we held discussion groups, covering topics such as prioritisation of features and functionality, roles such as the university as data asset owner, and the nature of the costs (price).

The team was relieved to learn that the majority (albeit from a somewhat self-selecting sample) agreed that the service fulfilled a real need; some data does need to be kept securely for a named period to comply with research funders’ rules, and participants welcomed a centralised platform to do this. The levels of usability and functionality we have managed to reach so far were met with somewhat less approval: clearly the development team has more work to do, and we are glad to have won further funding from the Digital Research Services programme in 2019-2020 in order to do it.

Attitudes toward university ownership of data assets were also a mixed bag; some were sceptical and wondered if researchers would participate in such a scheme, but others found it a realistic option for dealing with staff turnover and the inevitability of data outlasting data owners. Attitudes toward cost were largely accepting (the DataVault provides a cheaper alternative to our baseline DataStore disk storage), but concerns about the safekeeping of legacy and unfunded research data were raised at each workshop.

A sample of points raised follows:

  • Utility? “Everyone I know has everything on OneDrive.”
  • Regarding prioritisation of features – security first; file integrity first; depositing data from sources other than DataStore; facilitating larger deposit sizes; ease of use.
  • Quickness of deposit and retrieval? Quick deposit was deemed more important than quick retrieval.
  • University as data asset owner?
    • Under GDPR the data are already university assets (because the Uni is the data controller).
    • People who manage the data should be close to the research; IT people can manage users but shouldn’t be making decisions about data. Danger that because it’s related to IT it gets dumped on IT officers. The formal review process helps to ensure decisions will be made properly. Include flexibility into the review hierarchy to allow for variation in school infrastructure.
    • When I heard that I was – not shocked – but concerned. If I move to another university how do I get access? This might be a problem. Researchers might prefer to retain three copies themselves.
  • Is the cost recovery mechanism valid?
    • Vault costs are legitimate costs.
    • Ideally should come from grant overheads, until then need to charge.
    • Possible to charge for small/medium/large project at start rather than per TB?
  • Is the 100 GB threshold sufficient for unfunded research? How else could unfunded or legacy data be covered (who pays)?
    • Alumni sponsor a dataset scheme?
    • There will be people with a ‘whole bunch of data somewhere’ that would be more appropriately stored in DataVault.

The team is grateful to all of the workshop participants for their time and thoughts; the report will be considered further by the project board and the Research Data Service Steering Group members. The full set of workshop notes is colour-coded to show comments from different venues and is available to read on the RDM wiki, for anyone with a University log-in (EASE).


Robin Rice
Data Librarian and Head, Research Data Support
Library & University Collections


Research Data Workshop Series 2019

Over the spring of 2019 the Research Data Service (RDS) is holding a series of workshops with the aim of gathering feedback and requirements from our researchers on a number of important Research Data topics.

Each workshop will consist of a small number of short presentations from researchers and research support staff who have experience of the topic. These will then be followed by guided discussions so that the RDS can gather your input on the tools we currently provide, the gaps in our services, and how you go about addressing the challenges and issues raised in the talks.
The workshops for 2019 are:

Electronic Notebooks 1
14th March at King’s Buildings (Fully Booked)

DataVault
1200-1400, 10th April at 6301 JCMB, King’s Buildings, Map
Booking Link – https://www.events.ed.ac.uk/index.cfm?event=book&scheduleID=34308
The DataVault was developed to offer UoE staff a long-term retention solution for research data collected by research projects that are at the completion stage. Each ‘Vault’ can contain multiple files associated with a research project that will be securely stored for an identified period, such as ten years. It is designed to fill in gaps left by existing research data services such as DataStore (active data storage platform) and DataShare (open access online data repository). The service enables you to comply with funder and University requirements to preserve research data for the long-term, and to confidently store your data for retrieval at a future date. This workshop is intended to gather the views of researchers and support staff in schools to explore the utility of the new service and discuss potential practicalities around its roll-out and long-term sustainability.

Sensitive Data Challenges and Solutions
1200-1430, 16th April in Seminar Room 2, Chancellors Building, Bioquarter, Map
Booking Link – https://www.events.ed.ac.uk/index.cfm?event=book&scheduleID=34321
Researchers face a number of technical, ethical and legal challenges in creating, analysing and managing research data, including pressure to increase transparency and conduct research openly. But for those who have collected or are re-using sensitive or confidential data, these challenges can be particularly taxing. Tools and services can help to alleviate some of the problems of using sensitive data in research. But cloud-based tools are not necessarily trustworthy, and services are not necessarily geared for highly sensitive data. Those that are may not be very user-friendly or efficient for researchers, and often restrict the types of analysis that can be done. Researchers attending this workshop will have the opportunity to hear from experienced researchers on related topics.

Electronic Notebooks 2
1200-1430, 9th May at Training & Skills Room, ECCI, Central Area, Map
Booking Link – https://www.events.ed.ac.uk/index.cfm?event=book&scheduleID=34287
Electronic Notebooks, both computational and lab-based, are gaining ground as productivity tools for researchers and their collaborators. Electronic notebooks can help facilitate reproducibility, longevity and controlled sharing of information. There are many different notebook options available, either commercially or free. Each application has different features and will have different advantages depending on a researcher’s or lab’s requirements. Jupyter Notebook, RSpace, and Benchling are some of the platforms that are used at the University and all will be represented by researchers who use them on a daily basis.

Data, Software, Reproducibility and Open Research
Due to unforeseen circumstances this event has been postponed. We will update with the new event details as soon as they are confirmed.
In this workshop we will examine real-life use cases wherein datasets combine with software and/or notebooks to provide a richer, more reusable and long-lived record of Edinburgh’s research. We will also discuss user needs and wants, capturing requirements for future development of the University’s central research support infrastructure in line with (e.g.) the LERU Roadmap for Open Science, which the Library Research Support team has sought to map its existing and planned provision against, and domain-oriented Open Research strategies within the Colleges.

Kerry Miller
Research Data Support Officer
Library & University Collections


DataVault is now live

After extended development, the Research Data Service’s DataVault system is now operational, adding value to research data for principal investigators and their funders alike by offering a long-term retention solution for important datasets.

DataVault is a companion service to DataShare, the institutional digital repository for researchers to openly license and share datasets and related outputs via the Web. DataVault comprises an online interface connected to the university’s data centre infrastructure and cloud storage.

Each research project can store data in a single vault made up of any number of deposits. DataVault is currently able to accept individual deposits (groups of files) of up to 2 TB each; this will increase over time as project development continues.

DataVault sprint meeting before launch

Immutable

DataVault is designed for long-term retention of research data, to meet funder requirements and ensure future access to high value datasets. It meets digital preservation requirements by storing three copies in different locations (two on tape, one in the cloud) with integrity checking built-in, so that the data owner can retrieve their data with confidence until the end of the retention period (typically ten years).
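
To illustrate what integrity checking across several copies can look like in general, here is a minimal sketch. It is not the DataVault's own code: the post does not say which checksum algorithm is used, so SHA-256 is an assumption, and the function names `sha256_of` and `verify_copies` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(expected_digest, copy_paths):
    """Compare each stored copy against the digest recorded at deposit.

    Returns a dict mapping each path (as a string) to True if that copy
    still matches the recorded digest, or False if it does not.
    """
    return {
        str(path): sha256_of(path) == expected_digest
        for path in copy_paths
    }
```

In this sketch a digest is recorded once at deposit time and every copy is later compared with it, so a copy that has been corrupted shows up as `False` and could be restored from one of the copies that still matches.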

Secure

The DataVault interface helps to guide users in how to deposit personal and sensitive data, using anonymisation or pseudonymisation techniques whenever possible, as prescribed by the University’s Data Protection Officer (DPO). Because all data are encrypted before deposit, they are protected from unauthorised disclosure. Only the data owner or their nominated delegate is allowed to retrieve data during the retention period. Any decisions about allowing access to others are made by the data owner and are carried out outside the DataVault system, once the data have been retrieved onto a private area on DataStore and decrypted.

Discoverable

Although DataVault offers a form of closed archive, the design encourages good research data management practice by requiring a metadata record for each vault in Pure. These records are discoverable on the Web, and linked to the respective data creators, projects and publications.

In exchange for creating this high level public metadata record, the Principal Investigator benefits from the assignment of a unique digital object identifier (DOI) which can be used to cite the data in publications.

The open nature of the metadata means that any reader may make a request to access the dataset. The data owner decides who may have access and under what conditions. Advice can be provided by the Research Data Support team and the DPO.

University data assets

DataVault’s workflow takes into account the possibility/likelihood that the original data owner will have left the university when the period of retention comes to an end. Each vault will be reviewed by representatives of the university in schools, colleges or the Library, acting as the data owner, to make decisions on disposal or further retention and curation. If kept, the vault contents become university data assets.

Plan ahead for data archiving

The Research Data Support team encourages researchers to plan ahead for data archiving, right from the earliest conception stages of the project, so that appropriate costs are included in bids and the appropriate steps can be carried out to prepare data for either open or closed long-term archiving.

The team can be contacted through the IS Helpline and offers assistance with writing data management plans and making archival decisions. See our service website and contact information at https://www.ed.ac.uk/is/research-data-service or go straight to the DataVault page to learn more about it, get instructions for use, or look up charges. An introductory demo video is available at https://media.ed.ac.uk/media/Getting+started+with+the+DataVault/1_h4r4glf7.

Robin Rice
Data Librarian and Head, Research Data Support
Library & University Collections
