These days a laboratory information management system (LIMS) seems to feature in every scientist’s life, and for some small and medium-size labs, an open source LIMS is the way forward. In fact, businesses have grown up around helping labs implement open source LIMS and learn to make modifications in house.
A bridge too far for a nonprofessional? Not according to Greg Wilson, who believes that most scientists can easily learn enough to slip into coding when necessary.
For the last sixteen years Greg has been teaching researchers the equivalent of basic lab skills for scientific computing. Since January 2012, the project he leads, Software Carpentry, has run over 200 two-day workshops in 21 countries free of charge, and its corps of over 200 volunteer instructors is on course to run three or more a week in 2015.
“Scientists spend an increasing amount of time building and using software,” Greg and colleagues wrote in a recent PLOS article, “Best Practices for Scientific Computing” [1]. “However, most scientists are never taught how to do this efficiently. As a result, many are unaware of tools and practices that would allow them to write more reliable and maintainable code with less effort.”
Greg teaches bench scientists and others a handful of core software development practices such as writing maintainable code, using version control, automating repetitive tasks, and managing data. “We believe that software is just another kind of experimental apparatus and should be built, checked, and used as carefully as any physical apparatus. However, while most scientists are careful to validate their laboratory and field equipment, most do not know how reliable their software is. This can lead to serious errors, impacting the central conclusions of published research.”
Without further ado, here is Software Carpentry’s list of best practices:
Summary of Best Practices
Write programs for people, not computers.
- A program should not require its readers to hold more than a handful of facts in memory at once.
- Make names consistent, distinctive, and meaningful.
- Make code style and formatting consistent.
Let the computer do the work.
- Make the computer repeat tasks.
- Save recent commands in a file for re-use.
- Use a build tool to automate workflows.
Make incremental changes.
- Work in small steps with frequent feedback and course correction.
- Use a version control system.
- Put everything that has been created manually in version control.
Don’t repeat yourself (or others).
- Every piece of data must have a single authoritative representation in the system.
- Modularize code rather than copying and pasting.
- Re-use code instead of rewriting it.
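To make the "don't repeat yourself" items concrete, here is a minimal Python sketch. The constant and function names (FREEZING_POINT_K, kelvin_to_celsius, mean, mean_celsius) are hypothetical and chosen only for illustration; they do not come from the Software Carpentry paper. The conversion offset is defined once, and each calculation lives in one small function that other code calls rather than copies.

```python
# Hypothetical example: one authoritative definition of a value,
# and small reusable functions instead of copied-and-pasted arithmetic.

FREEZING_POINT_K = 273.15  # defined once; every conversion refers to it


def kelvin_to_celsius(kelvin):
    """Convert a temperature from kelvin to degrees Celsius."""
    return kelvin - FREEZING_POINT_K


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def mean_celsius(readings_kelvin):
    """Average a list of kelvin readings and report the result in Celsius."""
    # Re-uses the two helpers above rather than repeating their logic.
    return mean([kelvin_to_celsius(k) for k in readings_kelvin])
```

If the offset or the averaging rule ever has to change, it changes in one place, and every caller picks up the fix.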
Plan for mistakes.
- Add assertions to programs to check their operation.
- Use an off-the-shelf unit testing library.
- Turn bugs into test cases.
- Use a symbolic debugger.
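As a rough illustration of the first three "plan for mistakes" items, the sketch below uses Python's standard unittest library. The function normalize and the bug it guards against are hypothetical, invented for this example. The assertions check the function's own operation, and the second test shows how a reported bug (here, an all-zero input) can be turned into a permanent test case.

```python
import unittest


def normalize(values):
    """Scale a list of non-negative numbers so that they sum to 1."""
    total = sum(values)
    # Assertion on the input: fail loudly instead of dividing by zero.
    assert total > 0, "values must have a positive sum"
    result = [v / total for v in values]
    # Assertion on the output: the result must sum to 1 (within rounding).
    assert abs(sum(result) - 1.0) < 1e-9
    return result


class TestNormalize(unittest.TestCase):
    def test_equal_values(self):
        self.assertEqual(normalize([2, 2]), [0.5, 0.5])

    def test_all_zero_input_is_rejected(self):
        # Regression test for a hypothetical bug report:
        # an all-zero input must be rejected, not silently mishandled.
        with self.assertRaises(AssertionError):
            normalize([0, 0])
```

Saved in a file, these tests can be run with `python -m unittest`. One caveat: Python skips assert statements when started with the -O flag, so checks that must always run are better written as explicit exceptions.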
Optimize software only after it works correctly.
- Use a profiler to identify bottlenecks.
- Write code in the highest-level language possible.
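One way to find bottlenecks before optimizing is Python's built-in cProfile module. The sketch below is only an illustration: slow_sum_of_squares and profile_report are hypothetical names, and the workload is a stand-in for real analysis code. It runs a function under the profiler and returns a text report of where the time went.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately plain loop that stands in for real analysis code."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_report(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_report(slow_sum_of_squares, 100_000)
print(report)
```

The report lists each function with its call count and time, which shows whether the slow part is really where you assumed it was.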
Document design and purpose, not mechanics.
- Document interfaces and reasons, not implementations.
- Re-factor code in preference to explaining how it works.
- Embed the documentation for a piece of software in that software.
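In Python, embedding documentation in the software usually means writing a docstring. The sketch below uses a hypothetical function, stock_volume_needed, to show a docstring that documents the interface and the reasoning (what goes in, what comes out, which relationship is used) rather than narrating each line of code.

```python
def stock_volume_needed(stock_conc, final_conc, final_volume):
    """Return the volume of stock solution needed for a dilution.

    Based on C1 * V1 = C2 * V2. Both concentrations must be in the
    same unit; the result is in the same unit as final_volume.

    Raises ValueError if stock_conc is not positive, because the
    relationship is undefined for an empty or negative stock.
    """
    if stock_conc <= 0:
        raise ValueError("stock_conc must be positive")
    return final_conc * final_volume / stock_conc
```

Because the text travels with the code, `help(stock_volume_needed)` displays it in an interactive session, and documentation tools can extract it automatically.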
Collaborate.
- Use pre-merge code reviews.
- Use pair programming when bringing someone new up to speed and when tackling particularly tricky problems.
- Use an issue tracking tool.
Reference
- Wilson, G., Aruliah, D. A., Brown, C. T., et al. Best Practices for Scientific Computing. PLOS Biology. January 7, 2014. http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001745
Additional Reading
- Perkel, Jeffrey M. Coding Your Way Out of a Problem. Nature Methods. Vol. 8 No.7 July 2011 pp 541-543. http://www.nature.com/nmeth/journal/v8/n7/full/nmeth.1631.html
This article originally appeared in ALN Magazine. Helen Kelly is Contributing Editor, International. She may be reached at HelenKellyLtd@aol.com