A huge catalog of human DNA is helping researchers find tiny glitches that cause disease, in part by pointing out some false leads.
The database, with genetic codes from more than 60,000 people, is aimed at researching rare diseases that are generally caused by a single malfunctioning gene. Most of these diseases are so uncommon that the general public has never heard of them, but there are thousands of such conditions, and as a group they affect about 1 percent of births.
Better accuracy in identifying the genetic cause of a person’s disease provides a “clear and direct benefit to patients,” said Daniel MacArthur of the Broad Institute in Cambridge, Massachusetts, and Massachusetts General Hospital in Boston.
He is senior author of an analysis published Wednesday in the journal Nature by researchers who compiled the database, which draws on DNA data from more than two dozen disease studies. It went online in 2014 and has since been consulted more than 5 million times, he said.
For rare diseases, doctors try to find the genetic cause by analyzing the patient’s DNA. But everybody carries tens of thousands of minute differences from the standard DNA code, and the goal is to find which one or two of them is making the person sick.
Researchers frequently do that through guilt by association. If a variation shows up in a patient but is never seen, or is extremely rare, in others, it may be fingered as the cause of disease.
The challenge is getting enough DNA from the general public. If the sampling is too small or not comprehensive enough for diverse populations, a variant may be wrongly blamed as the cause of the patient’s problems. A better sampling might show the variant actually appears in healthy people often enough that it’s clearly not making anybody sick.
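The filtering logic described above can be sketched in a few lines of Python. This is only an illustration, not the method used by the ExAC researchers: the function name, the variant labels, the counts and the 0.01 percent cutoff are all made-up assumptions, and the sketch counts people rather than individual gene copies to keep things simple.

```python
from typing import Dict, List


def candidate_variants(
    patient_variants: List[str],
    reference_counts: Dict[str, int],
    reference_size: int,
    max_frequency: float = 0.0001,  # hypothetical cutoff, chosen for illustration
) -> List[str]:
    """Return the patient's variants that are absent or rare in the reference sample."""
    candidates = []
    for variant in patient_variants:
        # A variant never seen in the reference sample counts as zero carriers.
        carriers = reference_counts.get(variant, 0)
        frequency = carriers / reference_size
        if frequency <= max_frequency:
            candidates.append(variant)
    return candidates


patient = ["var_A", "var_B"]  # hypothetical variant labels

# Small, narrow reference sample: var_A never shows up, so it looks suspicious.
small_sample = candidate_variants(patient, {"var_B": 300}, reference_size=1000)

# Larger, more diverse sample: var_A turns out to appear in 1 percent of people.
large_sample = candidate_variants(
    patient, {"var_A": 600, "var_B": 18000}, reference_size=60000
)

print(small_sample)
print(large_sample)
```

With the small sample, the made-up variant var_A is flagged as a possible cause. With the larger sample it is not, because it appears too often in the general population to be a plausible culprit. That is the kind of false lead a broader catalog is meant to expose.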
False leads can harm patient care, in some cases causing patients to miss out on treatments, MacArthur said.
An example of such a “genetic misdiagnosis” was presented Wednesday in an unrelated study in the New England Journal of Medicine. It focused on an inherited disease called hypertrophic cardiomyopathy, which thickens heart muscle and can interfere with pumping blood.
Examining three years of records from a testing lab, the study found that seven patients were told they carried one of two DNA variants that had been linked to the heart disease. Both variants were later reclassified as benign.
At least five of the patients were of African ancestry. If the original studies of those variants had included enough black Americans in their samples, they probably would not have reached the wrong conclusion, the Harvard researchers said.
They said the new DNA catalog, dubbed ExAC, is well-equipped to avoid such errors. It provides a far more comprehensive collection of DNA variations than has been available in the past. The roughly 10 million tiny variations listed are from people of European, African, South Asian, East Asian and Latino ancestry.
In a separate study, released Wednesday by the journal Genetics in Medicine, British authors used ExAC data to cast doubt on recent reports that implicated some genes in hypertrophic cardiomyopathy and a second heart muscle disease.
The analysis in Nature provided another example of how ExAC can point out past mistakes. Researchers looked at 192 DNA variations that had been linked to diseases, but which ExAC showed were actually so common in the general population that those links appeared spurious.
The researchers found that under standard criteria, at least 163 should be reclassified as benign or probably benign. As of last December, 126 had been reclassified, the researchers reported. That probably reflects the influence of the ExAC database, MacArthur said.
But databases even bigger than ExAC will be needed to decisively link genetic variations to disease or rule them out, he said. “We have a long way to go.”
Jay Shendure, a genetics expert at the University of Washington in Seattle who didn’t participate in the new work, said the ExAC catalog offers advantages over previous databases. There’s “little doubt” that it will accelerate and refine the search for disease causes and genetic aspects of patient care, he wrote in a Nature commentary.