Software – a data scientist’s workbench
Posted August 8, 2017
Read our latest blog, introducing the IEA’s new Data Analyst, Freja Hunt, who attended a 2-day Python course run by Software Carpentry.
When we think of science we might think of test tubes, microscopes, complex machines, satellites and telescopes. The reality for many scientists, though, is that most cutting-edge discoveries happen in front of a computer, once the enormous amount of data collected from those test tubes, microscopes, complex machines, satellites or telescopes has been processed, manipulated and analysed using software.
In fact the Software Research Group at the University of Southampton found that 69% of the scientists they surveyed felt that software was fundamental to their results, with 56% of them writing their own.
Maybe when we think of software we think of giants such as Microsoft or high-tech geeks talking in acronyms and obscure terms we don’t understand, but in fact those data analysis routines we write in R or Python are just as much a piece of software.
Scientists from most disciplines have picked up coding along the way, as they’ve run into increasingly complex problems or increasingly large datasets, but without any formal training in best practices such as modularisation, version control and unit testing.
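To make two of those terms concrete, here is a minimal sketch in Python of what modularisation and unit testing can look like in an analysis script. The function name `mean_anomaly` and the sample readings are hypothetical, invented purely for illustration; they are not taken from any IEA or Software Carpentry material.

```python
# Modularisation: the calculation lives in one small, named function
# instead of being copied into every analysis script.
# NOTE: mean_anomaly and the sample values are hypothetical examples.
def mean_anomaly(readings):
    """Return each reading's deviation from the mean of all readings."""
    if not readings:
        raise ValueError("readings must not be empty")
    average = sum(readings) / len(readings)
    return [value - average for value in readings]


# Unit testing: small checks that pin down the expected behaviour,
# so a later change that breaks the function is caught immediately.
def test_anomalies_are_centred_on_zero():
    assert mean_anomaly([1.0, 2.0, 3.0]) == [-1.0, 0.0, 1.0]


def test_empty_input_is_rejected():
    try:
        mean_anomaly([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

The test functions can be called directly, or collected by a test runner such as pytest. Keeping the file under version control then records every change to both the function and its tests.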
I picked up some of these good habits by observing the software developers during my time as a data analyst in adtech, where software and big data are the backbone of the business and, yes, where some of those high-tech geeks talking in acronyms and obscure terms live! But my ad hoc learning still left gaps. Joining the IEA, where collaboration is key and high-quality, robust outputs come as standard, I needed a better understanding of good software practices so my analytical work could interact smoothly with my colleagues’ technical work.
Software development workshops aimed specifically at scientists are a great place to start, and that is exactly where I headed. I learned how to automate routine tasks in Bash, brushed up on my Python, and learned version control and collaboration in Git.
The emphasis on good practice goes beyond my chosen language; it will stand me in good stead whatever applications the IEA tackles.
In fact, in the autumn the IEA is running its own such course, over two blocks of two days, helping scientists get the best results from their coding. Not only will your colleagues and collaborators thank you for version control and more readable, more maintainable code, you will also have the gold standard of scientific endeavour: reproducibility. Even better, once your scientific breakthroughs reach the commercial world, perhaps via an organisation such as the IEA helping businesses implement those ideas, integration into commercial standards will be a breeze.
Book up for your next software training course
Our most popular training course, Software Development for Scientists, is running again in the autumn. Register now.