
The lab of the future is an automated one, and the scientific community is already building it. With lab benches replaced by self-enclosed robotic hoods and liquid handlers, scientists can focus on more complex tasks at their computers or remotely from a central office space. As research organizations adopt automation, the synthesis of large molecules becomes more factory-like: output grows as equipment such as shaking incubators, liquid dispensers, sterilization stations, purification instruments, centrifuges and vial cappers/decappers is linked together. These automated labs can produce thousands of purified plasmids, proteins or antibodies each year, which is ideal for high-throughput processes.
While it might seem that the ghost accelerating these iterative processes is the robotic arms, it is actually the software that drives the data workflows: tracking every sample, capturing data at each step and analyzing results. Often, it now also makes decisions automatically, choosing to discard a sample, repeat a step, move to the next step in the workflow, or alert the human scientist for input when it detects an ambiguous result. Adding a further layer of intricacy, multiple software systems often work together, exchanging information throughout the process. So, when building complex automated workflows for high-throughput processes in biopharma R&D, what should be considered?
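For illustration only, a minimal sketch of this kind of decision logic might look like the following. The fields, thresholds and actions are hypothetical assumptions for the example, not any particular vendor's implementation.

```python
# Hypothetical sketch of automated sample routing: advance, repeat, discard,
# or alert a scientist when the readout is ambiguous. All thresholds and
# field names are illustrative assumptions.
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DISCARD = "discard"
    REPEAT = "repeat"
    ADVANCE = "advance to next workflow step"
    ALERT = "alert scientist for review"


@dataclass
class AnalyticalResult:
    sample_id: str
    purity_pct: float       # e.g., from an analytical purity assay
    yield_mg: float
    signal_quality: float   # 0-1 confidence in the measurement itself


def route_sample(result: AnalyticalResult) -> Action:
    """Decide the next step for a sample based on its analytical result."""
    if result.signal_quality < 0.6:
        # Ambiguous readout: a human scientist should take a look.
        return Action.ALERT
    if result.purity_pct >= 95.0 and result.yield_mg >= 1.0:
        return Action.ADVANCE
    if result.purity_pct >= 80.0:
        # Marginal sample: worth re-purifying or re-testing once.
        return Action.REPEAT
    return Action.DISCARD


if __name__ == "__main__":
    print(route_sample(AnalyticalResult("AB-0042", purity_pct=97.2,
                                        yield_mg=2.4, signal_quality=0.95)))
```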
Expedite processes with automation of data workflows
Drug discovery campaigns rely on high-throughput screening to quickly identify novel biotherapeutic candidates. For this type of research stream, hundreds to thousands of individual large molecules need to be produced. Automated processes allow workflows to run continuously, including overnight and on weekends, but a robust software system is key.
One of our partners, an automation specialist at a leading biopharma organization, underscored the staggering scale of data generated in a single automated molecular cloning batch: working in a team of two, she can have about 400 purified proteins or antibodies synthesized in a 10-day span. Without software to track every sample, from the DNA in transfection plates onward, together with all the associated metadata, including cell line information, passage information, analytical results and everything else from protein production, it would be virtually impossible to keep track of the data. Integrating robust software systems is crucial to unlocking these efficiencies, allowing for smoother data capture and better reproducibility across experiments.
Well-designed automation software should understand each step of the process and the type of data captured, and should know when an operation was not performed as planned. Scientists can be alerted when there is a problem and when remote troubleshooting is required, including from home and in the middle of the night. Such automation saves biopharma time and resources by accomplishing these complicated tasks at high volume while reducing the work hours previously required from CROs and other teams of scientists to obtain these products. Today's software analyzes raw data as soon as it is captured and can automatically decide whether samples meet the quality criteria for the next step in the workflow or need to be re-tested. Finally, these systems also perform automated long-term monitoring of how assays perform plate-to-plate or week-to-week, comparing reference samples and catching problems long before the human eye can detect them.
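As a toy example of such reference-sample monitoring, a simple control-chart rule could flag a drifting assay. Real systems use more sophisticated statistics; the rule, data and limits below are assumptions for illustration.

```python
# Minimal sketch of plate-to-plate reference-sample monitoring, assuming a
# simple control-chart rule: flag readings beyond 3 standard deviations of
# the historical mean. Values and thresholds are illustrative only.
from statistics import mean, stdev


def check_reference_drift(history: list[float], new_value: float,
                          n_sigma: float = 3.0) -> bool:
    """Return True if the new reference-sample reading is out of control."""
    if len(history) < 5:
        return False  # not enough history to establish control limits
    mu, sigma = mean(history), stdev(history)
    return abs(new_value - mu) > n_sigma * sigma


# Example: weekly reference-antibody readings (mg/L) from the same assay.
history = [102.0, 99.5, 101.2, 100.4, 98.9, 100.8, 101.5]
print(check_reference_drift(history, 100.9))  # False: within control limits
print(check_reference_drift(history, 112.0))  # True: flag for investigation
```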
Turning up the volume on data
Automating a task does not alleviate a process bottleneck if the data collected is not managed correctly. The automated production of exponentially more biomacromolecules from high-throughput systems results in exponentially more data. Automation software platforms, like Genedata's, play a vital role in capturing, parsing and associating the relevant information with the right batches and molecules across these huge workflows.
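To make the capture-parse-associate step concrete, here is a hypothetical sketch that attaches instrument results to registered samples by barcode. The file format, field names and registry structure are assumptions for illustration; in practice this role is filled by a LIMS or data platform.

```python
# Hypothetical illustration of capturing an instrument export and associating
# each result with the registered sample and batch it belongs to.
import csv
from io import StringIO

# Minimal in-memory sample registry (in practice, a LIMS or database).
registry = {
    "PLT01-A01": {"molecule": "mAb-0198", "batch": "B2024-117", "results": []},
    "PLT01-A02": {"molecule": "mAb-0199", "batch": "B2024-117", "results": []},
}

# Simulated instrument export; a real file would come from the analyzer.
instrument_export = StringIO(
    "barcode,assay,value,unit\n"
    "PLT01-A01,SEC_monomer,98.6,%\n"
    "PLT01-A02,SEC_monomer,91.3,%\n"
)

for row in csv.DictReader(instrument_export):
    sample = registry.get(row["barcode"])
    if sample is None:
        # Unknown barcode: surface it rather than silently dropping the result.
        print(f"WARNING: unregistered barcode {row['barcode']}")
        continue
    sample["results"].append(
        {"assay": row["assay"], "value": float(row["value"]), "unit": row["unit"]}
    )

print(registry["PLT01-A01"]["results"])
```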
Another biopharma leader described a similar scale of data in their organization, where a full end-to-end run of a single automated molecular cloning batch comprises 1,152 samples and creates over 1,100,000 unique metadata and data points, with an average of 22 per sample. Without a data backbone providing fully integrated, high-quality dataset visualization and analytical tools for expedited decision-making, drug discovery campaigns would not be successful. Automation amplifies throughput, but without intelligent data management, it can also amplify bottlenecks.
Conclusion
Previously, researchers were limited to testing only as many large molecules as they could afford to synthesize and analyze. Today, with automated production methods, we can explore a much broader range of molecules, significantly expand our scientific understanding and develop clinically valuable treatments faster and at lower cost. It is well understood, however, that automating complex biopharma processes involves more than running liquid handlers for aliquoting and dosing. The success of these high-throughput processes depends on selecting the right software, especially for data workflow automation, to drive the entire operation. This starts with a centralized, structured system that can be initiated by a production request for novel large molecules and that spans the entire synthetic process. Additionally, for traceability and audit reporting, the software must automatically track every step of the process and autonomously upload analytical results back to the corresponding samples. With the right software as the ghost in the machine, data workflow automation becomes a fully integrated driver of discovery.
Arielle Mann, Ph.D., Scientific Communications Associate at Genedata.
Jana Hersch, Ph.D., Head of Corporate Scientific Engagement at Genedata.



