Biocode LIMS to Be Made Freely Available to DNA Barcoding Community
The Laboratory Information Management System (LIMS) developed for the Moorea Biocode Project is to be made publicly available as a free beta version. The Biocode Project is an ambitious undertaking to create the first comprehensive inventory of all species larger than a microbe in a complex tropical ecosystem (Moorea in French Polynesia). Funded through a $5.2 million grant to UC Berkeley from the Gordon and Betty Moore Foundation, and based out of the American and French research stations on Moorea, Biocode brings together the Smithsonian Institution, the CNRS of France, and other partners.
The Biocode LIMS and data analysis components of the project were developed by Biomatters in collaboration with the Biocode Project researchers as a plug-in for Biomatters’ Geneious Pro sequence analysis software. The Geneious Biocode LIMS will give biologists around the world a best practice tool to use in their own research as well as to freely access the Moorea project’s final database. An accompanying “Biocode Genbank Submission” plug-in will allow researchers to upload their own sequence data from inside Geneious Pro directly to Genbank, the world’s largest public DNA sequence database.
The unique database of Moorea’s coral reef and terrestrial biodiversity will be publicly shared as a resource for ecologists and evolutionary biologists around the world. The Biocode LIMS has tracked over 24,000 specimens from over 30 phyla of algae, fungi, plants and animals in the first two years of the project.
Christopher Meyer, from the Smithsonian Institution and Director of the Biocode Project, says, “The Moorea Biocode Project was created with the intention of providing a model system for similar comprehensive genetic inventories. In addition to tracking down all the biodiversity in this tropical island ecosystem, one of the promised deliverables has been an informatics tool to allow easy access to that data and to aid other genetic barcoding initiatives.”
“We see no sense in reinventing the wheel. We want to share our best practices from this ambitious project with anyone, from single researchers, to a principal investigator’s lab, to large scale initiatives like our own.”
The Biocode LIMS provides an informatics pipeline for batch processing of samples from DNA extraction through to sequencing, identifying and re-running failed reactions, and identifying systematic errors that can be strategically addressed. It integrates with Geneious Pro’s existing sequence assembly tools and with various Field Information Management Systems (FIMS), including those accessed through the TAPIR standard access protocol.
Once a specimen reaches the end of the pipeline, the Biocode Genbank Submission plug-in automates the submission of completed contigs to make the DNA sequences publicly available. As a quality control mechanism, the reaction data from the LIMS database is combined with the field metadata from the FIMS database, including each specimen’s completely tracked history.
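As an illustration only, the combination of LIMS reaction data with FIMS field metadata described above can be sketched in a few lines of Python. This is not the plug-in's code: the `Reaction` record, the `build_submission_records` function, and the rule that a specimen is ready when the latest attempt at every step has passed are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Reaction:
    """Hypothetical LIMS record: one attempt at one lab step for a specimen."""
    specimen_id: str
    step: str       # e.g. "extraction", "pcr", "sequencing"
    passed: bool


def build_submission_records(reactions, field_metadata):
    """Join lab reactions to field metadata by specimen id.

    A specimen is kept only if it has field metadata and the most recent
    attempt at each step passed (assumed rule). Reactions are expected in
    chronological order.
    """
    history = {}
    for reaction in reactions:
        history.setdefault(reaction.specimen_id, []).append(reaction)

    records = []
    for specimen_id, attempts in history.items():
        metadata = field_metadata.get(specimen_id)
        if metadata is None:
            continue  # no field record, so the specimen cannot be verified
        latest = {}
        for attempt in attempts:
            latest[attempt.step] = attempt.passed  # later attempts overwrite earlier ones
        if all(latest.values()):
            records.append({
                "specimen_id": specimen_id,
                "metadata": metadata,
                # full tracked history, including failed attempts
                "history": [(a.step, a.passed) for a in attempts],
            })
    return records
```

Keeping the failed attempts in the history, rather than only the final result, mirrors the release's emphasis on a completely tracked workflow for each specimen.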
Neil Davies, Director of UC Berkeley’s Gump South Pacific Research Station and Principal Investigator of the Biocode Project, says, “This is the first freely available, broadly applicable software tool to assist tracking materials through the DNA barcoding pipeline. No other freely available program allows the level of tracking and data quality assurance through a lab system. Importantly, it goes beyond DNA barcoding to accommodate multiple genetic markers for use in a broad range of biodiversity and ecogenomic studies.”
“The plate workflow approach taken has greatly simplified the process of identifying reaction failures, and setting them up to be run again. It manages data that used to be spread across multiple individuals and notebooks so we can search it, report success, or look for patterns. It has significantly reduced the human error that has been problematic in large-scale sequencing projects such as this in the past.”
Biomatters CEO Candace Toner says, “We’ve been working with the Moorea Biocode Project for over two years now and have developed the software under demanding real world circumstances. Most sequencing projects focus on one or two species, while Biocode focuses on an entire ecosystem and involves researchers from multiple international institutions. This creates challenges to organise the vast amounts of information produced and manage the sheer volume of human handling involved. For reproducibility and transparency, the Biocode LIMS needs to store a full, publicly available lab workflow for each specimen.”
Beta versions of the free Geneious Biocode LIMS and Geneious Biocode Genbank Submission plug-ins are available from http://software.mooreabiocode.org. Users of the free Geneious Basic software will be able to access and view the Biocode database upon completion of the project, but a commercial copy of Geneious Pro is required for data creation and analysis.
Biocode LIMS key features
• Connect to field databases (FIMS) in Excel or TAPIR format
• Search and retrieve data into Geneious from FIMS
• Workflow tracking in tubes and 48-, 96- or 384-well plates
• Bulk tracking for DNA extractions, PCR and DNA sequencing reactions
• Mark reactions as passed/failed
• Cherry-pick passed/failed reactions to create new plates
• Bin sequences based on quality
• Verify taxonomy
• Upload to Genbank
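To make the cherry-picking feature concrete, here is a minimal Python sketch of collecting failed reactions from several plates into new 96-well plates for re-running. It is an illustration under stated assumptions, not the plug-in's implementation: the plate representation (a dictionary of well name to sample and pass flag), the function names `well_names` and `cherry_pick`, and the row-by-row fill order are all hypothetical.

```python
import string


def well_names(rows=8, cols=12):
    """Well labels in row-major order: A1, A2, ..., A12, B1, ... (96-well by default)."""
    return [
        f"{string.ascii_uppercase[r]}{c}"
        for r in range(rows)
        for c in range(1, cols + 1)
    ]


def cherry_pick(plates, want_passed=False, rows=8, cols=12):
    """Collect passed or failed reactions from source plates into new plates.

    plates: {plate_name: {well: (sample_id, passed)}}  (assumed layout)
    Returns a list of new plates, each {well: (sample_id, source_plate, source_well)},
    so every re-run reaction stays traceable to where it came from.
    """
    picked = [
        (sample_id, plate_name, well)
        for plate_name, wells in plates.items()
        for well, (sample_id, passed) in wells.items()
        if passed == want_passed
    ]
    names = well_names(rows, cols)
    new_plates = []
    # Fill one destination plate at a time, starting a new plate when one is full.
    for start in range(0, len(picked), len(names)):
        chunk = picked[start:start + len(names)]
        new_plates.append(dict(zip(names, chunk)))
    return new_plates
```

Calling `cherry_pick(plates)` gathers failures for a re-run, while `cherry_pick(plates, want_passed=True)` gathers successes, for example to move passed PCR products on to sequencing. Passing `rows=16, cols=24` would target a 384-well layout instead.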
A 20-minute introductory video on how to use the major features of the Geneious Biocode LIMS can be found at http://www.biomatters.com/assets/demonstrations/biocode.html