
As in many other industries, big data has become critical to R&D because of its potential to accelerate and streamline the development of novel therapeutics. The quality of big data and the analytical rigor applied are vital to deriving useful insights from multiple and diverse data sources. Petabytes of data are meaningless unless researchers have the access, capabilities and tools to efficiently aggregate, mine and extract information that is most relevant and useful to their workflows. These tools need to be user-friendly and produce accurate results that minimize false positives and negatives.
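To make that accuracy requirement concrete, the sketch below scores a hypothetical screening tool by precision (how many flagged hits are real) and recall (how many real hits were flagged) – the two measures behind "minimizing false positives and negatives". All of the counts are invented for illustration.

```python
# Minimal sketch: evaluating a screening tool's accuracy.
# Precision penalizes false positives; recall penalizes false negatives.

def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Precision: fraction of flagged hits that are real.
    Recall: fraction of real hits that were flagged."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall

# Hypothetical run: 120 flagged hits, 90 of them real, 30 genuine actives missed.
p, r = precision_recall(true_pos=90, false_pos=30, false_neg=30)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.75
```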
New R&D models
Life science organizations are rapidly adopting new technologies that generate vast amounts of information and contribute to the expanding pool of big data. New R&D models are reshaping drug discovery and development. The following transformative processes are contributing to these changes:
- Disease understanding: ‘-Omics’ data is replacing trial-and-error research, better explaining the complex molecular relationships that underlie a disease
- Target identification: Predictive software that models the structure of a target is complementing the identification of known protein targets and molecular structures
- Target validation: Simulation models of drug bioactivity that predict human response are enhancing animal models designed to understand human disease
- Molecule discovery: Genetic engineering of living systems to target specific diseases is augmenting compound identification and candidate molecule synthesis via high-throughput screening
- Lead optimization: Novel structural mapping technologies that allow for rapid alterations are improving the customization of molecular structures to fit their targets
- Preclinical testing: In vitro testing of living human tissue for safety and efficacy studies is adding to the understanding of physiological drug responses gained from animal models
In later-stage drug development, information harnessed in early-stage R&D needs to be translated into the clinic. Additional data needs to be accessed, structured, stored and analyzed across the R&D value chain to better inform patient selection for clinical trials, to anticipate and mitigate risk through better pharmacovigilance, and to identify patterns in real-world evidence. All of these requirements demand better and more efficient information solutions and capabilities.
Medicine and diagnostics are converging with evolving R&D models. Personalized medicine technologies such as next-generation sequencing and whole-genome profiling are becoming increasingly affordable. This convergence is creating the need for more sophisticated data extraction, aggregation and integration to enable actionable insights. In addition, accurate data interpretation requires access to high-quality, expert-curated content so that scientists and healthcare providers can quickly and reliably assess the most up-to-date information about variants and associated phenotypes to better inform their clinical decisions.
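As a rough illustration of what such variant interpretation looks like in practice, the sketch below checks a patient's variants against a tiny curated knowledge base. The table, gene names and annotations are hypothetical stand-ins; a real system would query a maintained, expert-curated resource rather than a hard-coded dictionary.

```python
# Minimal sketch of variant interpretation against expert-curated content.
# CURATED_KB is a hypothetical placeholder for a real curated database.

CURATED_KB = {
    ("EGFR", "L858R"): {"phenotype": "NSCLC", "evidence": "responds to EGFR TKIs"},
    ("BRAF", "V600E"): {"phenotype": "melanoma", "evidence": "responds to BRAF inhibitors"},
}

def interpret(variants):
    """Return curated annotations for known variants; flag the rest for review."""
    annotated, unknown = [], []
    for v in variants:
        record = CURATED_KB.get(v)
        (annotated if record else unknown).append((v, record))
    return annotated, unknown

annotated, unknown = interpret([("EGFR", "L858R"), ("TP53", "R175H")])
for (gene, change), record in annotated:
    print(f"{gene} {change}: {record['evidence']}")
for (gene, change), _ in unknown:
    print(f"{gene} {change}: no curated record - needs manual review")
```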
To illustrate this paradigm shift, the Lung Master Protocol trial (Lung-MAP), a multi-drug, multi-arm, biomarker-driven squamous cell lung cancer clinical trial launched in June, uses Foundation Medicine’s genomic profiling platform to match patients to one of several investigational treatments. This large-scale screening and clinical registration protocol is an unprecedented public-private collaboration among government, not-for-profit and for-profit organizations, and it may serve as a prototype for future drug registration trials. In the U.K., the National Lung Matrix trial is a similar initiative for patients with non-small cell lung cancer that is scheduled to start later this year.
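The matching step at the heart of such umbrella trials can be sketched simply: assign each patient to the first sub-study whose biomarker appears in their genomic profile, with a default arm for non-matches. The arm definitions and biomarkers below are hypothetical, not Lung-MAP’s actual protocol.

```python
# Minimal sketch of biomarker-driven arm assignment in an umbrella trial.
# Arm names and biomarkers are hypothetical placeholders.

TRIAL_ARMS = {
    "arm_A": {"biomarker": "PIK3CA"},
    "arm_B": {"biomarker": "FGFR1"},
    "arm_C": {"biomarker": "CDK4"},
}
DEFAULT_ARM = "arm_D"  # for patients with no qualifying alteration

def assign_arm(patient_alterations: set[str]) -> str:
    """Match a patient's genomic alterations to the first qualifying arm."""
    for arm, spec in TRIAL_ARMS.items():
        if spec["biomarker"] in patient_alterations:
            return arm
    return DEFAULT_ARM

print(assign_arm({"FGFR1", "TP53"}))  # arm_B
print(assign_arm({"KRAS"}))           # arm_D
```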
Earlier this year, IBM Watson announced a collaboration with the New York Genome Center. This partnership focuses on 25 patients with glioblastoma multiforme, an aggressive and difficult-to-treat brain cancer. Leveraging the supercomputing power of IBM Watson, which can process terabytes of patient data and mine entire databases, each of the 25 patients’ genomes is being analyzed for the precise mutations it harbors. These genomic alterations are then compared with large volumes of data from other patients, as well as with published medical data, to identify and recommend the most potentially effective experimental treatments.
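One simple way to picture that comparison step is as an evidence-overlap ranking: score each experimental treatment by how many of the patient’s mutations have published support. The gene and drug names below are hypothetical placeholders, and this sketch is not Watson’s actual method.

```python
# Minimal sketch: ranking experimental treatments by how many of a
# patient's mutations have supporting evidence. All names are hypothetical.

from collections import Counter

# Hypothetical evidence table: drug -> genes with supporting published data.
EVIDENCE = {
    "drug_X": {"EGFR", "PTEN"},
    "drug_Y": {"IDH1"},
    "drug_Z": {"EGFR", "IDH1", "NF1"},
}

def rank_treatments(patient_mutations: set[str]) -> list[tuple[str, int]]:
    """Score each drug by overlap with the patient's mutations, best first."""
    scores = Counter({drug: len(genes & patient_mutations)
                      for drug, genes in EVIDENCE.items()})
    return scores.most_common()

for drug, score in rank_treatments({"EGFR", "IDH1"}):
    print(drug, score)  # drug_Z scores 2; drug_X and drug_Y score 1 each
```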
These collaborative examples illustrate how big data is being aggregated and integrated into research projects. The ability to prospectively extract actionable insights and unlock the potential of personalized medicine is an exciting prospect. The data deluge in life sciences, combined with the growing capacity to analyze that data, is leading to new approaches in R&D. These approaches will make decision making increasingly driven by the available data and by the analytics solutions that unlock it – we’ll look more closely at this in part two.