About Robin Rice

Robin on Twitter: Sparrowbarley

MANTRA @ Melbourne

The aim of the Melbourne_MANTRA project was to review, adapt and pilot an online training program in research data management (RDM) for graduate researchers at the University of Melbourne. Based on the UK-developed and acclaimed MANTRA program, the project reviewed current UK content and assessed its suitability for the Australian and Melbourne research context. The project team adapted the original MANTRA modules and incorporated new content as required, in order to develop the refreshed Melbourne_MANTRA local version. Local expert reviewers ensured the localised content met institutional and funder requirements. Graduate researchers were recruited to complete the training program and contribute to the detailed evaluation of the content and associated resources.

The project delivered eight revised training modules, which were evaluated as part of the pilot via eight online surveys (one for each module) plus a final, summative evaluation survey. Overall, the Melbourne_MANTRA pilot training program was well received by participants. The content of the training modules generally gathered high scores, with low scores markedly sparse across all eight modules. The participants recognised that the content of the training program should be tailored to the institutional context, as opposed to providing general information and theory around the training topics. In its current form, the content of the modules only partly satisfies the requirements of our evaluators, who made valuable recommendations for further improving the training program.

In 2016, the University of Melbourne will revisit MANTRA with a view to implementing evaluation feedback in the program; updating the modules with new content, audiovisual materials and exercises; augmenting targeted delivery via the University’s LMS; and working towards incorporating Melbourne_MANTRA in induction and/or reference materials for new and current postgraduates and early career researchers.

The current version is available at: http://library.unimelb.edu.au/digitalscholarship/training_and_outreach/mantra2

Dr Leo Konstantelos
Manager, Digital Scholarship
Research | Research & Collections
Academic Services
University of Melbourne
Melbourne, Australia

Jisc Data Vault update

Posted on behalf of Claire Knowles

Research data are being generated at an ever-increasing rate. This brings challenges in how to store, analyse, and care for the data. Part of this problem is the long term stewardship of researchers’ private data and associated files that need a safe and secure home for the medium to long term.

The Data Vault project, funded by the Jisc #DataSpring programme, seeks to define and develop a Data Vault software platform that will allow data creators to describe and store their data safely in one of the growing number of options for archival storage. This may include cloud solutions, shared storage systems, or local infrastructure.

Future users of the Data Vault are invited to Edinburgh on 5th November, to help shape the development work through discussions on: use cases, example data, retention policies, and metadata with the project team.

Book your place at: https://www.eventbrite.co.uk/e/data-vault-community-event-edinburgh-tickets-18900011443

The aims of the second phase of the project are to deliver a first complete version of the platform by the end of November, including:

  • Authentication and authorisation
  • Integration with more storage options
  • Management / monitoring interface
  • Example interface to CRIS (PURE)
  • Development of retention and review policy
  • Scalability testing

Working towards these goals, the project team have had monthly face-to-face meetings, with regular Skype calls in between. The development work is progressing steadily, as you can see via the GitHub repository: https://github.com/DataVault, where there have now been over 300 commits. Progress is also tracked on the open Project Plan, where anyone can add comments.

So remember, remember the 5th November and book your ticket.

Claire Knowles, Library & University Collections, on behalf of the JISC Data Vault Project Team

Research Data Alliance – report from the 6th Plenary

The Research Data Alliance or RDA is growing about as fast as the data all around us. It got off the ground in 2012 with the support of major research funders in Europe, the US and Australia and has since grown to over 3,000 members. The latest plenary in Paris set a new registration record of ~700 ‘data folk’ including data scientists, data managers, librarians and policy-makers. The theme was Enterprise Engagement with a focus on Research Data for Climate Change.

Not an ordinary conference

What sets RDA apart from other data-related organisations is not just the size of its gatherings, but its emphasis on making change. Parallel sessions are not filled with individual presentations of research papers, but with collaborative activities that lead to outputs that can be used in the real world. Working groups are approved by governance structures and coalesce around actual problems that cannot be solved by individual organisations but require new top-level approaches. They are required to produce their deliverables and close shop after an 18-month period. Interest groups are allowed to exist longer, but are encouraged to spin off working groups to address challenges as they are identified through group discussion.

Hard-working groups

Since 2012, these working groups have produced some impressive deliverables and pilots that, if implemented across the Web and across organisations and countries, could speed up research and improve reproducibility. They are governed by an elected group of experts, worldwide. Some current active projects are:

  • Data Foundation and Terminology WG: defining harmonised terminology for diverse communities used to their own data ‘language’
  • Data Type Registries WG: building software to implement a DTR that can automatically match up unknown dataset ‘types’ with relevant services or applications (such as a viewer)
  • PID Information Types WG: Creating a single common API for delivering checksums from multiple persistent identifier service providers (DataCite and others)
  • Practical policy WG: building on a previous WG that collected various machine-actionable policies practised by different data centres and repositories, this group will register the policies to encourage repository managers to move towards a harmonised set.
  • Scalable Dynamic Data Citation WG: to solve the difficulty of properly citing dynamic data sources, the recommended solution allows users to re-execute a query with the original time stamp and retrieve the original data or to obtain the current version of the data.
  • Data Description Registry Interoperability WG: to solve the problem of datasets scattered across repositories and data registries, the group build the Research Data Switchboard, linking datasets across platforms.
  • Metadata Standards Directory WG: By guiding researchers towards the metadata standards and tools relevant to their discipline, the directory drives up adoption of those standards, improving the chances of future researchers finding and using the data.
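To make the dynamic data citation idea above more concrete, here is a minimal sketch of re-executing a query "as of" a cited time stamp. It is not the working group's implementation: the table layout, the `valid_from` / `valid_to` columns and the function names are all assumptions made for illustration, and integers stand in for real time stamps.

```python
import sqlite3


def make_store():
    """Create an in-memory table that keeps every version of a row (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings ("
        " station TEXT, value REAL,"
        " valid_from INTEGER,"   # time at which this version became current
        " valid_to INTEGER)"     # NULL while this version is still current
    )
    return conn


def upsert(conn, station, value, now):
    """Record a new value without overwriting history."""
    # Close the currently valid version of this station, if there is one.
    conn.execute(
        "UPDATE readings SET valid_to = ? "
        "WHERE station = ? AND valid_to IS NULL",
        (now, station),
    )
    # Insert the new version as the current one.
    conn.execute(
        "INSERT INTO readings VALUES (?, ?, ?, NULL)",
        (station, value, now),
    )


def run_query(conn, as_of):
    """Return the data exactly as it stood at time `as_of`."""
    return conn.execute(
        "SELECT station, value FROM readings "
        "WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to > ?) "
        "ORDER BY station",
        (as_of, as_of),
    ).fetchall()
```

A citation would then only need to store the query and its time stamp: calling `run_query` with the cited time stamp retrieves the original data, while calling it with the present time retrieves the current version.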

Members of the RDM team have been involved in library and repository-related interest groups and Birds of a Feather groups, where surveys of current practice have circulated.

Not all men at RDA! Dame Wendy Hall from the Web Science Institute leads a Women’s Networking Breakfast – photo courtesy of @RDA_Europe

RDA and climate change

Climate science was prominent in the 6th RDA plenary. This was due not only to the imminent Paris-based United Nations COP talks, but also to issues of critical importance for the world today. For some years, driven by the climate model inter-comparison work underpinning Intergovernmental Panel on Climate Change (IPCC) reports and the massive datasets from Earth observation, climate science has been located at an intersection of high performance computing, big data management, and services to support and stimulate research, commerce, and governmental initiatives.

Assessment of the risks posed by climate change, and of strategies for adaptation and mitigation, sharpens the need to solve not only the technical problems of bringing together diverse data (social, soil, climate, land-use, commercial, …) but also to address the policy challenges, given the diverse organisations needing to cooperate. This is a domain that builds on services giving access to data and on computation close to the data, enabled by e-infrastructure (such as EGI), and one that requires ever stronger approaches to brokering these resources and services, to permit their orchestration and integration.

Among initiatives presented in the climate-related sessions were:

  • GEOSS – The GEOSS Common Infrastructure allows the user of Earth observations to access, search and use the data, information, tools and services available through the Global Earth Observation System of Systems
  • Global Agricultural Monitoring (GEOGLAM) initiative in response to the growing calls for improved agricultural information.
  • An RDA group focused on wheat – the volatility in prices, in part driven by climate unpredictability, has become a major concern.
  • The IPSL Mesocentre
  • IS-ENES, developing services especially for climate modelling
  • Copernicus, seeking to “support policymakers, business, and citizens with improved environmental information. Copernicus integrates satellite and in-situ data with modeling to provide user-focused information services”
  • CLIPC will provide access to climate datasets, and software and information to assess indicators for climate impact.

Dr. Mike Mineter, School of GeoSciences and Robin Rice, EDINA and Data Library

Edinburgh DataShare – new features for users and depositors

I was asked recently on Twitter if our data library was still happily using DSpace for data – the topic of a 2009 presentation I gave at a DSpace User Group meeting. In responding (answer: yes!) I recalled that I’d intended to blog about some of the rich new features we’ve either adopted from the open source community or developed ourselves to deliver our data users and depositors a better service and fulfill deliverables in the University’s Research Data Management Roadmap.

Edinburgh DataShare was built as an output of the DISC-UK DataShare project, which explored pathways for academics to share their research data over the Internet at the Universities of Edinburgh, Oxford and Southampton (2007-2009). The repository is based on DSpace software, the most popular open source repository system in use globally. Managed by the Data Library team within Information Services, it is now a key component in the UoE’s Research Data Programme, endorsed by its academic-led steering group.

An open access, institutional data repository, Edinburgh DataShare currently holds 246 datasets across collections in 17 out of 22 communities (schools) of the University and is listed in the Re3data Registry of Research Data Repositories and indexed by Thomson-Reuters’ Data Citation Index.

Last autumn, the university joined DataCite, an international standards body that assigns persistent identifiers in the form of Digital Object Identifiers (DOIs) to datasets. DOIs are now assigned to every item in the repository, and are included in the citation that appears on each landing page. This helps to ensure that even after the DataShare system no longer exists, as long as the data have a home, the DOI will be able to direct the user to the new location. Just as importantly, it helps data creators gain credit for their published data through proper data citation in textual publications, including their own journal articles that explain the results of their data analyses.
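The way a DOI keeps working after data move can be sketched in a few lines. This is only a toy model of the indirection involved, not DataCite's API: the `ToyResolver` class, the DOI `10.1234/example` and the example.org URLs are all hypothetical.

```python
class ToyResolver:
    """Toy stand-in for a DOI resolver: maps a DOI to its current landing page."""

    def __init__(self):
        self._targets = {}

    def register(self, doi, url):
        # Registering the same DOI again simply updates where it points.
        self._targets[doi] = url

    def resolve(self, doi):
        return self._targets[doi]


def doi_url(doi):
    """Build the resolvable link that would appear in a citation."""
    return "https://doi.org/" + doi
```

Because a citation records only the DOI, re-registering it against a new location is enough to keep every existing citation pointing at the data.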

The autumn release also streamlined our batch ingest process to assist depositors with large or numerous data files by getting around the web upload front-end. Currently we are able to accept files up to 10 GB in size but we are being challenged to allow ever greater file sizes.

Making the most of metadata

Discover panel screenshot

Example from Geosciences community

Every landing page (home, community, collection) now has a ‘Discover’ panel giving top hits for each metadata field (such as subject classification, keyword, funder, data type, spatial coverage). The panel acts as a filter when drilling down to different levels, allowing the most common values to be ‘discovered’ within each section.

The usage statistics at each level are now publicly viewable as well, so depositors and others can see how often an item is viewed or downloaded. This is useful for many reasons. Users can see what is most useful in the repository; depositors can see if their datasets are being used; stakeholders can compare the success of different communities. By being completely open and transparent, this is a step towards ‘alt-metrics’ or alternative ways of measuring scholarly or scientific impact. The repository is now also part of IRUS-UK (Institutional Repository Usage Statistics UK), which uses the COUNTER standard to make repository usage statistics nationally comparable.

What’s coming?

Stay tuned for future improvements around a new look and feel, preview and display by data type, streaming support, BitTorrent downloading, and Linked Open Data.

Robin Rice
EDINA and Data Library
