Cancer is usually attributed to faulty genes, but growing
evidence from the field of cancer epigenetics indicates a key role for the gene “silencing” proteins that stably turn genes off inside the cell nucleus. A new
study from Rice University and Baylor College of
Medicine (BCM) promises to speed research in the field by rapidly identifying
the genes that epigenetic proteins can target for silencing.
The study, which appears in Nucleic Acids Research, shows how a new computer program called
EpiPredictor can search any genome to identify specific genes affected by epigenetic
proteins. The research includes detailed experimental findings that verify
EpiPredictor’s results. The research was funded in part by the Cancer
Prevention Research Institute of Texas (CPRIT).
“Cancer epigenetics is a new field, and we are still
struggling with the basics,” said lead investigator Jianpeng Ma, professor of bioengineering
at Rice and the Lodwick T. Bolin Professor of Biochemistry at BCM. “It’s
something like a board game. Until now, we’ve understood some of the rules and
seen a few of the pieces, but the game board itself has been mostly blank.
EpiPredictor lets everyone see the board. It really changes things.”
While many cancers have been linked to mutations in the DNA
sequence of particular genes, epigenetic changes do not involve genetic
mutations. Instead, epigenetics allows two cells with identical DNA sequences
to behave in wholly different ways. Epigenetic proteins effectively edit the
genome by turning off genes that are not needed. This editing process is what
allows human beings to have specialized cells, such as nerve cells, bone cells and
blood cells, that look and behave differently even though they share the same DNA.
The key epigenetic players in cancer are a family of
proteins called polycomb-group (PcG) proteins. PcGs are found deep inside the
nucleus of cells, in the chamber where DNA is stored. Studies have found abnormally
high levels of PcGs in some of the most aggressive forms of breast and prostate cancer.
PcGs are generalists that can be called upon to silence any
one of several hundred to several thousand genes. They are recruited to this
task by polycomb response elements (PREs), segments of DNA that are located
next to the genes the proteins subsequently silence. This is where the playing
board goes blank: though scientists know there are hundreds to
thousands of potential PREs in any given genome, from those of
simple insects to that of human beings, only a few PREs have ever been found.
“So far, only two PREs have been experimentally verified in
mammals—one in mice and one in humans,” said EpiPredictor creator Jia Zeng, a
BCM postdoctoral research associate. “We suspect there are so many of them, but
finding them has been difficult.”
Zeng, a computer scientist, had no formal biology training
when she joined Ma’s laboratory under a CPRIT-funded training program for
computational cancer research.
“One of the biggest challenges since the completion of the
Human Genome Project has been how to dig useful information out of the enormous
amount of genomic data,” Ma said.
Ma said Zeng’s new method for zeroing in on PRE sequences is
broadly applicable for genomic data mining in areas beyond cancer research.
“Determining the function of a gene based solely on sequence
data is virtually impossible,” Ma said. “Recognizing this, Jia applied some
advanced tools from computer science to create a learning program that could be
trained to look for PRE sequences based upon the scant experimental data that
In tests on the genome of the fruit fly Drosophila melanogaster, the EpiPredictor program found almost 300
epigenetic target genes. Experimental research by Ma’s longtime collaborator,
BCM biochemist Qinghua Wang, verified that the EpiPredictor predictions were accurate.
“We are now working on using the method to scan the human
genome to search for potential genes that play a role in cancer epigenetics,”
said Wang, assistant professor of biochemistry and molecular biology. “We also
hope that others will explore how this new method may help to identify the
location and function of genes beyond the realm of epigenetics.”