Data Carpentry & Software Carpentry workshops

The Research Data Service hosted back-to-back two-day workshops in the Main Library this week, run by the Software Sustainability Institute (SSI) to train University of Edinburgh researchers in basic data science and research computing skills.

Learners at Data Carpentry workshop

Software Carpentry (SC) is a popular global initiative originating in the US, aimed at training researchers in good practice in writing, storing and sharing code. Both SC and its newer offshoot, Data Carpentry, teach methods and tools that help researchers make their science reproducible. The SSI, based at Edinburgh Parallel Computing Centre (EPCC), organises workshops for both throughout the UK.

Martin Callaghan, University of Leeds

Each workshop is taught by instructors trained by the SC organisation, using proven methods of delivery. Learners work on their own laptops, with plenty of support from knowledgeable helpers. Instructors at our workshops were from Leeds and EPCC. Comments from the learners (staff and postgraduate students from a range of schools) included, ‘Variety of needs and academic activities/disciplines catered for. Useful exercises and explanations,’ and ‘Very powerful tools.’

Lessons can vary between different workshops, depending on the level of the learners and their requirements, as determined by a pre-workshop survey. The Data Carpentry workshop on Monday and Tuesday included:

  • Using spreadsheets effectively
  • OpenRefine
  • Introduction to R
  • R and visualisation
  • Databases and SQL
  • Using R with SQLite
  • Managing Research & Data Management Plans

The Software Carpentry workshop was aimed at researchers who write their own code, and covered the following topics:

  • Introduction to the Shell
  • Version Control
  • Introduction to Python
  • Using the Shell (scripts)
  • Version Control (with GitHub)
  • Open Science and Open Research

Software Carpentry learners

The workshops were clearly valued by learners and very worthwhile. The team will consider how it can offer similar workshops in the future at a similarly low cost; your ideas are welcome!

Robin Rice
EDINA and Data Library


Analytics platform trial

Information Services is evaluating a new collaborative platform for data science and analytics as part of its expanding portfolio of services for researchers. We are looking for researchers with suitable problems who expect to achieve results within the one-year trial. We will be able to work closely with a small number of projects to help them get the most out of the platform, and training will be available. In addition, we encourage other researchers to use the platform with less formal support.

The Aridhia AnalytiXagility Platform

AnalytiXagility is a purpose-built, user-friendly, collaborative platform for data science and analytics. It allows your team to easily create, discuss, modify and share analyses in a single, secure system accessed conveniently through a web browser.

The platform handles routine data management tasks such as confidentiality, availability, integrity and audit, reducing time to insight and discovery. In particular, it is ideally suited for:

  • Exploring, comparing and linking structured datasets including data quality profiling
  • Supporting data management, accountability and provenance
  • Processing large datasets that do not fit in memory

Bring your team

Project members collaborate through a private workspace configured with compute, storage and analytical tools. Embedded social media tools allow teams to post and share questions, updates, comments and insights, building an active record of the research undertaken.

Bring your data

Users import their datasets using the secure and reliable file transfer mechanism, SFTP. Working files (documents, images, analysis scripts) can be uploaded directly through the web interface, and tagged for easy management and retrieval by the team.

Bring your analysis

AnalytiXagility provides an analysis platform, based on R, which can be accessed through a web browser. Combining R with an SQL database and an associated access library allows researchers to analyse their data in a faster and more scalable way than with R alone.
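To illustrate the general idea of letting the database do the heavy lifting, here is a minimal sketch. It is written in Python with the standard-library `sqlite3` module purely for illustration; it is not the platform's own R access library, and the `readings` table, its columns and the `mean_by_site` helper are hypothetical. The aggregation runs inside the SQL engine, so only a few summary rows are pulled into the analysis session rather than the whole dataset.

```python
import sqlite3

# Hypothetical in-memory database standing in for a project's SQL store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (site TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings (site, value) VALUES (?, ?)",
    [("A", 1.0), ("A", 3.0), ("B", 10.0), ("B", 20.0), ("B", 30.0)],
)


def mean_by_site(connection):
    """Ask the database for per-site means and counts.

    The GROUP BY is evaluated by the SQL engine; only the summary
    rows (one per site) are fetched into Python.
    """
    cursor = connection.execute(
        "SELECT site, AVG(value), COUNT(*) "
        "FROM readings GROUP BY site ORDER BY site"
    )
    return cursor.fetchall()


summary = mean_by_site(conn)
print(summary)
```

With a real dataset the table would live on disk or on a database server, and the same pattern would keep memory use in the analysis session proportional to the size of the summary, not the raw data.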

Generate your output

The platform supports generation of PDF reports for communication and publication using LaTeX templates, such as those provided by many leading journals. Users can embed active analytical scripts in these templates to auto-generate images and tabular data within the report at runtime.

More information

If you are interested in participating in the trial, please email with the subject “XAP Trial”.

Further information can be found at:

Steve Thorn
Research Services
IT Infrastructure


Dancing with Data

I went to an interesting talk yesterday by Prof Chris Speed called “Dancing with Data”, on how our interactions and relationships with each other, with the objects in our lives and with companies and charities are changing as a result of the data that is now being generated by those objects (particularly smartphones, but increasingly by other objects too). New phenomena such as 3D printing, Airbnb, Foursquare and iZettle are giving us choices we never had before, but also leading to things being done with our data which we might not have expected or known about. The relationships between individuals and our data are being redefined as we speak. Prof Speed challenged us to think about the position of designers in this new world where push-to-pull markets are being replaced by new models. He also told us about his research collaborations with Oxfam, looking at how technology might enhance the value of the second-hand objects they sell by allowing customers to hear their stories from their previous owners.

Logo for the Tales of Things project

All very thought-provoking, but what about the implications for academic research, aside from those working in the fields of Design, Economics or Sociology who must now develop new models to reflect this changing landscape? Well, the question arises, if all this data is being generated and collected by companies, are the academics (and indeed the charity sector) falling behind the curve? Here at the University of Edinburgh, my colleagues in Informatics are doing Data Science research, looking into the infrastructure and the algorithms used to analyse the kind of commercial Big Data flowing out of the smartphones in our pockets, while Prof Speed and his colleagues are looking at how design itself is being affected. But perhaps academics in all disciplines need to be tuning their antennae to this wavelength and thinking seriously about how their research can adapt to and be enhanced by the new ways we are all dancing with data.

For more about the University of Edinburgh’s Design Informatics research and forthcoming seminars see Prof Chris Speed tweets @ChrisSpeed.

Pauline Ward is a Data Library Assistant working at the University of Edinburgh and EDINA.