Data Carpentry & Software Carpentry workshops

The Research Data Service hosted back-to-back two-day workshops in the Main Library this week, run by the Software Sustainability Institute (SSI), to train University of Edinburgh researchers in basic data science and research computing skills.

Learners at Data Carpentry workshop

Software Carpentry (SC) is a popular global initiative originating in the US, aimed at training researchers in good practice in writing, storing and sharing code. Both SC and its newer offshoot, Data Carpentry, teach methods and tools that help researchers make their science reproducible. The SSI, based at Edinburgh Parallel Computing Centre (EPCC), organises workshops for both throughout the UK.

Martin Callaghan, University of Leeds, introduces goals of Data Carpentry workshop.

Each workshop is taught by instructors trained by the SC organisation, using proven methods of delivery. Learners work on their own laptops, with plenty of support from knowledgeable helpers. Instructors at our workshops were from Leeds and EPCC. Comments from the learners, who were staff and postgraduate students from a range of schools, included, ‘Variety of needs and academic activities/disciplines catered for. Useful exercises and explanations,’ and ‘Very powerful tools.’

Lessons can vary between different workshops, depending on the level of the learners and their requirements, as determined by a pre-workshop survey. The Data Carpentry workshop on Monday and Tuesday included:

  • Using spreadsheets effectively
  • OpenRefine
  • Introduction to R
  • R and visualisation
  • Databases and SQL
  • Using R with SQLite
  • Managing Research & Data Management Plans

The Software Carpentry workshop was aimed at researchers who write their own code, and covered the following topics:

  • Introduction to the Shell
  • Version Control
  • Introduction to Python
  • Using the Shell (scripts)
  • Version Control (with GitHub)
  • Open Science and Open Research

Software Carpentry learners

The workshops were clearly valued by learners and very worthwhile. The team will consider how it can offer similar workshops in the future at a similarly low cost; your ideas are welcome!

Robin Rice
EDINA and Data Library


Sustainable software for research

In an earlier blog post (October 2013), Stuart Lewis discussed the four aspects of software preservation detailed in a paper by Matthews et al., A Framework for Software Preservation, namely:

  1. Storage: is the software stored somewhere?
  2. Retrieval: can the software be retrieved from wherever it is stored?
  3. Reconstruction: can the software be reconstructed (executed)?
  4. Replay: when executed, does the software produce the same results as it did originally?
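
As a rough illustration of the fourth aspect, replay, the Python sketch below reruns a program and compares a hash of its output with a digest recorded from an earlier run. It is only a minimal sketch and is not taken from the Matthews et al. framework: the function names output_digest and replays_identically are hypothetical, and it assumes the software is deterministic and writes its results to standard output.

```python
import hashlib
import subprocess
import sys


def output_digest(command):
    """Run a command and return the SHA-256 hex digest of its standard output."""
    result = subprocess.run(command, capture_output=True, check=True)
    return hashlib.sha256(result.stdout).hexdigest()


def replays_identically(command, recorded_digest):
    """Return True if rerunning the command reproduces the recorded output digest."""
    return output_digest(command) == recorded_digest


if __name__ == "__main__":
    # Hypothetical stand-in for a piece of research software: a one-line Python program.
    command = [sys.executable, "-c", "print(sum(range(10)))"]
    # Digest that would have been stored at the time of the original run.
    recorded = output_digest(command)
    print(replays_identically(command, recorded))
```

In practice the recorded digest would be kept alongside the stored software, so that a later run on a reconstructed environment can be checked against it.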

It is with these thoughts in mind that, on 1 December 2014, colleagues from across IS (Applications Division, EDINA, Research and Learning Services, DCC, IT Infrastructure) met with Neil Chue Hong, Director of the Software Sustainability Institute (SSI), to discuss how the University of Edinburgh could move forward on the thorny issue of software preservation.

The take-home message agreed by all at the meeting was that it will be easier to look after software in the future if it is managed well now.

On how to take this forward, there were more questions than answers.

Matters to investigate include:

  • defining what we mean by research software: a spectrum from single R analysis scripts through to large software platforms
  • capturing descriptions of locally created research software products in the Pure Data Asset Registry
  • understanding the number of local research projects that are creating software
  • creating high-level guidance around software development and licensing (with links to SSI and OSS Watch)
  • providing skills and training for early career researchers (such as through the Software Carpentry initiative)
  • tools to measure software uptake/usage in local research
  • institutional use of GitLab and other software development tools
  • ascertaining instances and spend on GitHub across the University

“It’s impossible to conduct research without software, say 7 out of 10 UK researchers”, or so says an SSI report surveying software generation as part of the research process in Russell Group institutions. Published in Times Higher Education (THE), the report and the data that underpin it are now available.

Much food for thought and further discussion!

Stuart Macdonald
RDM Service Coordinator