Analytics platform trial

Information Services is evaluating a new collaborative platform for data-science and analytics as part of its expanding portfolio of services for researchers. We are looking for researchers with suitable problems who expect to achieve results in the one-year trial. We will be able to work closely with a small number of projects to help them get the most out of the platform, and training will be available. In addition, we encourage further researchers to use the platform with less formal support.

The Aridhia AnalytiXagility Platform

AnalytiXagility is a purpose-built, user-friendly, collaborative platform for data science and analytics. It allows your team to easily create, discuss, modify and share analyses in a single, secure system accessed conveniently through a web browser.
The platform handles routine data management tasks such as confidentiality, availability, integrity and audit, reducing time to insight and discovery. In particular, it is ideally suited for:

  • Exploring, comparing and linking structured datasets including data quality profiling
  • Supporting data management, accountability and provenance
  • Processing large datasets that do not fit in memory

Bring your team

Project members collaborate through a private workspace configured with compute, storage and analytical tools. Embedded social media tools allow teams to post and share questions, updates, comments and insights, building an active record of the research undertaken.

Bring your data

Users import their datasets using the secure and reliable file transfer mechanism, SFTP. Working files (documents, images, analysis scripts) can be uploaded directly through the web interface, and tagged for easy management and retrieval by the team.

Bring your analysis

AnalytiXagility provides an analysis platform, based on R, which can be accessed through a web browser. Combining R with an SQL database and an associated access library allows researchers to analyse their data in a faster and more scalable way than with R alone.
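The general idea of pairing an analysis language with an SQL database can be sketched briefly. The example below is only an illustration and is not the platform's own access library, which is R-based: it uses Python's standard sqlite3 module as a stand-in, and the table name, column names and values are hypothetical. The point is that the grouping and averaging happen inside the database, so only the small summary is brought into memory.

```python
import sqlite3

# Stand-in database; the table, columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (site TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("A", 1.0), ("A", 3.0), ("B", 10.0)],
)


def mean_by_site(connection):
    """Ask the database for per-site means instead of loading every row."""
    query = (
        "SELECT site, AVG(value) FROM measurements "
        "GROUP BY site ORDER BY site"
    )
    # Only one (site, mean) pair per group is returned to the caller.
    return dict(connection.execute(query).fetchall())


summary = mean_by_site(conn)
print(summary)
```

With a real dataset the table would live on disk or on a database server, and the same query would return a summary of the same small size however many rows were stored.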

Generate your output

The platform supports generation of PDF reports for communication and publication using LaTeX templates, such as those provided by many leading journals, in which users can embed active analytical scripts to auto-generate images and tabular data within the report at runtime.

More information

If you are interested in participating in the trial, please email IS.Helpline@ed.ac.uk with the subject “XAP Trial”.

Further information can be found at:

Steve Thorn
Research Services
IT Infrastructure

Dancing with Data

I went to an interesting talk yesterday by Prof Chris Speed called “Dancing with Data”, on how our interactions and relationships with each other, with the objects in our lives and with companies and charities are changing as a result of the data that is now being generated by those objects (particularly smartphones, but increasingly by other objects too). New phenomena such as 3D printing, Airbnb, Foursquare and iZettle are giving us choices we never had before, but also leading to things being done with our data which we might not have expected or known about. The relationships between individuals and our data are being re-defined as we speak. Prof Speed challenged us to think about the position of designers in this new world where push-to-pull markets are being replaced by new models. He also told us about his research collaborations with Oxfam, looking at how technology might enhance the value of the second-hand objects they sell by allowing customers to hear their stories from their previous owners.

All very thought-provoking, but what about the implications for academic research, aside from those working in the fields of Design, Economics or Sociology who must now develop new models to reflect this changing landscape? Well, the question arises, if all this data is being generated and collected by companies, are the academics (and indeed the charity sector) falling behind the curve? Here at the University of Edinburgh, my colleagues in Informatics are doing Data Science research, looking into the infrastructure and the algorithms used to analyse the kind of commercial Big Data flowing out of the smartphones in our pockets, while Prof Speed and his colleagues are looking at how design itself is being affected. But perhaps academics in all disciplines need to be tuning their antennae to this wavelength and thinking seriously about how their research can adapt to and be enhanced by the new ways we are all dancing with data.

For more about the University of Edinburgh’s Design Informatics research and forthcoming seminars see www.designinformatics.org. Prof Chris Speed tweets @ChrisSpeed.

Pauline Ward is a Data Library Assistant working at the University of Edinburgh and EDINA.

Dealing with Data Conference & RDM Service Launch – summary

Information Services (IS) held a half-day conference in the Main Library on the subject of ‘Dealing with Data’ to coincide with the launch of the University of Edinburgh’s Research Data Management support services on 26 August.

University researchers presented to over 120 delegates from across the disciplinary and support spectrum on many aspects of working with data, particularly research with novel methods of creating, using, storing, or sharing data. Subjects included Big Data for disease control, managing West Nilotic language sound files, sharing brain images, geospatial metadata services, and visualising qualitative data via carpets!

Dealing with Data Conference

The RDM Programme team are currently collecting feedback and will report on this and the conference in more detail via this blog.

‘Dealing with Data Conference’ delegates then gathered in the Main Library foyer to hear brief talks by Professor Jeff Haywood, Professor Peter Clarke and Dr John Scally. These were followed by the formal launch of the RDM Services by the University’s Principal, Sir Timothy O’Shea, who underlined the successful collaboration between research and support service communities in establishing research support services worthy of a leading UK research-intensive university.

University of Edinburgh RDM Service launch by Sir Timothy O'Shea

A ‘storify’ story of tweets collected during the launch and the conference is available, with pictures and perspectives from various attendees.

The launch of the IS-led RDM Services is the culmination of work detailed in the RDM Roadmap which began in earnest in August 2012 following approval of the RDM Policy by the University Court in May 2011.

Details of available and planned RDM Services for University of Edinburgh researchers were reported on in the blogpost: RDM Roadmap: Completion of Phase 1

Conference presentations can be downloaded from Edinburgh Research Archive (ERA) at: https://www.era.lib.ed.ac.uk/handle/1842/9389

Stuart Macdonald
RDM Service Coordinator
stuart.macdonald@ed.ac.uk

Data and ethics

As an academic support person, I was surprised to find myself invited onto a roundtable about ‘The Ethics of Data-Intensive Research’. Although as a data librarian I’m certainly qualified to talk about data, I was less sure of myself on the ethics front – after all, I’m not the one who has to get my research past an Ethics Review Board or a research funder.

The event was held last Friday at the University of Edinburgh as part of the project Archives Now: Scotland’s National Collections and the Digital Humanities, a knowledge exchange project funded by the Royal Society of Edinburgh. This event attracted attendees from across Scotland and had as its focus “Working With Data”.

I figured I couldn’t go wrong with a joke about fellow ‘data people’ with an image from flickr that we use in our online training course, MANTRA.

‘Binary’ by Xerones on Flickr (CC-BY-NC)

Appropriately, about half the people in the room chuckled.

So after introducing myself and my relevant hats, I revisited the quotations I had supplied on request for the organiser, Lisa Otty, who had put together a discussion paper for the roundtable.

“Publishing articles without making the data available is scientific malpractice.”

This quote is attributed to Geoffrey Boulton, Chair of the Royal Society working group which published Science as an Open Enterprise in 2012. I have heard him say it, if only to say it isn’t his quote. The report itself contains a couple of similar statements, though none is as pithy. But the point is: how relevant is this assertion for scholarship that is outside of the sciences, such as the Humanities? Is data sharing an ethical necessity when the result of research is an expressive work that does not require reproducibility to be valid?

I gave Research Data MANTRA’s definition of research data, in order to reflect on how well it applies to the Humanities:

Research data are collected, observed, or created, for the purposes of analysis to produce and validate original research results.

When we invented this definition, it seemed quite apt for separating ‘stuff’ that is generated in the course of research from stuff that is the object of research; an operational definition, if you will. For example, a set of email messages may just be a set of correspondences; or it may be the basis of a research project if studied. It all depends on the context.

But recently we have become uneasy with this definition when engaging with certain communities, such as the Edinburgh College of Art. They have a lot of digital ‘stuff’ – inputs and outputs of research, but they don’t like to call it data, which has a clinical feel to it, and doesn’t seem to recognise creative endeavour. Is the same true for the Humanities, I wondered? Alas, the audience declined to pursue it in the Q&A, so I still wonder.

“The coolest thing to do with your data will be thought of by someone else.” – Rufus Pollock, Cambridge University and Open Knowledge Foundation, 2008

My second quote attempted to illustrate the unease felt by academics about the pressure to share their data, and why the altruistic argument about open data doesn’t tend to win people over, in my experience. I asked people to consider how it made them feel, but perhaps I should have tried it with a show of hands to find out their answers.

Information Wants to Be Free

Quote by John Perry Barlow, image by Robin Rice

I swiftly moved on to talk about open data licensing, the choices we’ve made for Edinburgh DataShare, and whether offering different ‘flavours’ of open licence is important when many people still don’t understand what open licences are about. Again I used an image from MANTRA (above) to point out that the main consideration for depositors should be whether or not to make their data openly available on the internet – regardless of licence.

By putting their outputs ‘in the wild’ academics are necessarily giving up control over how they are used; some users will be ‘unethical’; they will not understand or comply with the terms of use. And we as repository administrators are not in a position to police mis-use for our depositors. Nevertheless, since academic users tend to understand and comply with scholarly norms about citing and giving attribution, those new to data sharing should not be unduly alarmed about the statement illustrated above. (And DataShare provides a ‘suggested citation’ for every data item that helps the user comply with the attribution requirements.)

Since no overview of data and ethics would be complete without consideration given to confidentiality obligations of researchers towards their human subjects, I included a very short video clip from MANTRA, of Professor John MacInnes speaking about caring for data that contain personally identifying information or personal attributes.

For me, the most challenging aspect of the roundtable, and indeed the day, was the contribution by Dr Anouk Lang about working with data from social media. As an ethical researcher one cannot assume that consent is unnecessary when working with data streams (such as Twitter) that are open to public viewing. For one thing, people may not expect views of their posts outside of their own circles – they treat it as a personal communication medium. For another, they may assume that what they say is ephemeral and will soon be forgotten and unavailable. A show of hands indicated only some of the audience had heard of the Twitter Developers and API, or Storify, which can capture tweets and other objects in a more permanent web page, illustrating her point.

While this whole area may be more common for social researchers – witness the Economic and Social Research Council’s funding of a Big Data Network over several years which includes social media data – Anouk’s work on digital culture proves Humanities researchers cannot escape “the plethora of ethics, privacy and risk issues surrounding the use (and reuse) of social media data.” (Communication on ESRC Big Data Network Phase 3.)

Robin Rice
Data Librarian
