Sophisticated, genomics-based tools have the potential to transform cancer care within the next decade. Among these will be biomarkers that can distinguish between tumors based on their molecular characteristics, enabling ever-greater personalization of cancer treatments.
For example, data presented at the 2014 American Society of Clinical Oncology (ASCO) annual meeting included promising new targeted agents for common, hard-to-treat cancers, including advanced differentiated thyroid cancer, relapsed chronic lymphocytic leukemia, advanced non-small cell lung cancer, and recurrent ovarian cancer. More recently, researchers at the University of California, San Diego School of Medicine identified a new biomarker that predicts whether glioblastoma patients will respond to chemotherapy with temozolomide. Circulating tumor DNA has also recently been shown to have potential as a non-invasive way to detect cancer: because somatic mutations are present only in tumor cell DNA, they provide an extremely specific biomarker.
The challenge of diversity and complexity in oncology
Well-characterized biomarkers hold the promise of distinguishing, at the molecular level, between the diverse range of tumor types and complex pathways involved in cancer, allowing therapies to be tailored accordingly. For broad utility, such biomarkers must be reliable and easily accessible. Nucleic acids meet both criteria: all cells contain genomic DNA and transcribed RNA, so nucleic acids are stable, readily available biomarkers in tissue samples. Next-generation sequencing (NGS) can identify and routinely analyze these nucleic acid-based biomarkers.
NGS determines the exact order of nucleotides in DNA, RNA and microRNA. The Human Genome Project, a $3 billion, 13-year endeavor completed in 2003, used first-generation Sanger sequencing to determine the order of nucleotides in human chromosomes. The second-generation approach, NGS, uses massively parallel techniques to decode the order of nucleotides in billions of DNA fragments, making it possible to sequence an entire genome in less than one day. This advance has dramatically cut the cost of analysis, to a current figure of around $1,000 per human genome. Experts in the field are already discussing $100 genomes that may be interrogated in less than a day. Since 2008, sequencing cost reductions have outpaced Moore’s Law, the observation from the IT sector that the number of transistors in an integrated circuit doubles about every two years. As illustrated in Figure 1, the cost to sequence a million bases fell from around $5,000 in 2001 to $0.10 by 2012. Nonetheless, NGS still requires major investments in the infrastructure needed to acquire, process and interpret the data.
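To put the cost figures above in perspective, the following Python sketch compares the quoted decline with a Moore's-Law pace. It is illustrative only and rests on two assumptions that are not in the text: that the decline was a smooth exponential across 2001 to 2012, and that Moore's Law can be read as a cost that halves every two years. The function and constant names are hypothetical.

```python
import math

# Figures quoted in the text: cost per million bases, in US dollars.
COST_2001 = 5000.0
COST_2012 = 0.10
YEARS = 2012 - 2001

# Assumption: read Moore's Law as a cost that halves every two years.
MOORE_HALVING_YEARS = 2.0


def halving_time(start_cost: float, end_cost: float, years: float) -> float:
    """Years per halving of cost, assuming a constant exponential decline."""
    halvings = math.log2(start_cost / end_cost)
    return years / halvings


def moore_projection(start_cost: float, years: float) -> float:
    """Cost after `years` if it had only halved every MOORE_HALVING_YEARS."""
    return start_cost / 2 ** (years / MOORE_HALVING_YEARS)


observed_halving = halving_time(COST_2001, COST_2012, YEARS)
moore_cost_2012 = moore_projection(COST_2001, YEARS)

print(f"Observed halving time: {observed_halving:.2f} years")
print(f"Moore's-Law-paced cost in 2012: ${moore_cost_2012:,.2f} per million bases")
print(f"Quoted cost in 2012: ${COST_2012:,.2f} per million bases")
```

Under these assumptions the cost halved roughly every 0.7 years on average, whereas a two-year halving would have left it at about $110 per million bases in 2012. This is a whole-period average, so it does not show the acceleration after 2008 that the text describes.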
Commercially available NGS platforms, offered by companies such as Illumina, Life Technologies (now Thermo Fisher Scientific) and Pacific Biosciences, bring NGS technology to a broad range of users in discovery and development. NGS can be carried out in several ways: sequencing by synthesis, ion-detection sequencing and single-molecule real-time sequencing. As discussed below, each technique has pros and cons.
The impact of NGS on scientific and medical research has increased sharply as costs have fallen, with the number of publications citing this technique increasing exponentially from only six in 2005, when NGS became available, to 3,709 in 2013 (Figure 2). The number of oncology publications involving genomics has also increased sharply over this period (Figure 3).
The key to applying NGS successfully in medical research is a thorough understanding of the capabilities of the various platforms. For example, sequencing by synthesis offers throughput of as many as 300 million reads, with each run taking approximately 27 hours. Depending on the chemistry used, high-quality reads of 150 base pairs may be obtained in both orientations of the nucleic acid. Ion-based sequencing provides fewer reads per day, around 160 million, but works with longer stretches of nucleic acid (250 base pairs) and takes just four hours per instrument run. Single-molecule real-time sequencing is the quickest platform, taking only 0.5–3 hours per run, and produces the longest reads (up to 15,000 base pairs), but delivers only two million reads per day. Other important considerations are the time needed for sample preparation, bioinformatics infrastructure requirements, and the instrument’s footprint. The required applications need to be matched with appropriate sample preparation and NGS technology platforms. Looking ahead, as this technology evolves rapidly, outsourcing these processes to specialist providers is likely to become increasingly cost-effective and less risky.
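The platform figures above can be put on a common footing with a short Python sketch that converts each one into reads per day and an upper bound on bases per day. It is a rough comparison under stated assumptions, not a vendor specification: instruments are assumed to run back to back with no downtime, every read is assumed to reach the quoted length (for single-molecule real-time sequencing, 15,000 base pairs is the quoted maximum), and the 300 million figure is assumed to count individual reads. The `Platform` class and its fields are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    name: str
    reads: float          # reads per period, as quoted in the text
    period_hours: float   # length of that period in hours
    read_length_bp: int   # quoted read length (a maximum for SMRT)

    def reads_per_day(self) -> float:
        # Assumption: back-to-back runs with no downtime.
        return self.reads * 24.0 / self.period_hours

    def gigabases_per_day(self) -> float:
        # Upper bound: assumes every read reaches the quoted length.
        return self.reads_per_day() * self.read_length_bp / 1e9


PLATFORMS = [
    Platform("sequencing by synthesis", 300e6, 27.0, 150),
    Platform("ion-based sequencing", 160e6, 24.0, 250),
    Platform("single-molecule real-time", 2e6, 24.0, 15_000),
]

for p in PLATFORMS:
    print(
        f"{p.name:28s} {p.reads_per_day() / 1e6:8.1f} M reads/day  "
        f"{p.gigabases_per_day():6.1f} Gb/day (upper bound)"
    )
```

With these inputs the three ceilings come out at roughly 40, 40 and 30 gigabases per day, which suggests that read count alone is a poor basis for comparison. Read length, run time and the intended application matter as much as raw throughput, and real yields will fall below these bounds.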