Some 3.5 billion years ago, life on Earth evolved to have just four “letters” in its genetic code. These letters are the DNA bases G, C, A and T—and they spell out the instructions for making proteins in every organism on Earth.
But scientists in a lab at The Scripps Research Institute (TSRI) have been working on something new. They’ve designed a bacterium with two unnatural bases, called X and Y, which could someday help them produce new molecules for medical therapies.
In a study published today in the journal Nature, the researchers announced that their “semi-synthetic” strain of E. coli is the first to both contain the unnatural bases in its DNA and use the bases to instruct cells to make a new protein.
“I would not call this a new lifeform—but it’s the closest thing anyone has ever made,” said TSRI Professor Floyd Romesberg, Ph.D., who led the study. “This is the first time ever a cell has translated a protein using something other than G, C, A or T.”
The new research builds on the Romesberg Lab’s previous efforts to expand the limited “alphabet” of natural DNA. Until now, all organisms have used only the four DNA bases to code for 20 amino acids. With the addition of X and Y, an organism could code for up to 152 new amino acids. The researchers hope these amino acids could become building blocks for new medicines.
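The figure of 152 is consistent with simple codon counting. The short Python sketch below is an illustration rather than anything taken from the study: it assumes that genes are read in three-base "words" (codons), as in natural DNA, and that every codon containing at least one X or Y could in principle be assigned its own new amino acid. The variable and function names are made up for this example.

```python
from itertools import product

NATURAL_BASES = "GCAT"                   # the four natural DNA letters
EXPANDED_BASES = NATURAL_BASES + "XY"    # plus the two unnatural bases
CODON_LENGTH = 3                         # assumption: three-base codons


def codons(alphabet, length=CODON_LENGTH):
    """Return every possible codon of the given length over the alphabet."""
    return {"".join(letters) for letters in product(alphabet, repeat=length)}


natural = codons(NATURAL_BASES)      # 4 * 4 * 4 = 64 codons
expanded = codons(EXPANDED_BASES)    # 6 * 6 * 6 = 216 codons
new_codons = expanded - natural      # codons containing at least one X or Y

print(len(natural), len(expanded), len(new_codons))  # prints: 64 216 152
```

Under these assumptions, a six-letter alphabet yields 216 possible codons, 152 more than the 64 available with four letters. That is an upper bound on new coding capacity, not a count of amino acids that have actually been encoded.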
Synthorx, Inc., founded on research from the Romesberg Lab, is leading the effort to develop protein therapeutics based on X and Y.
Cells Can Decode New Bases to Make Protein
Romesberg and his team worked toward this breakthrough for 20 years. Their research took a huge step forward in 2014, when the team announced the creation of a semi-synthetic organism that could copy X and Y in its DNA. Earlier this year, the researchers also found that they could get bacteria to stably store the information and pass on the unnatural bases to daughter cells as they divide.
But just storing these bases isn’t enough. To really be useful, these bases need to be “read,” or transcribed, into RNA molecules and translated into proteins.
Romesberg and his colleagues achieved these important steps by embedding their unnatural bases in genes that also contained A, C, G and T. They found that within the semi-synthetic organism these genes could be successfully transcribed into RNA molecules also containing the unnatural bases, and that the cells could use these RNA molecules at their ribosomes to direct the incorporation of unnatural amino acids into proteins.
The protein produced in this process was a variant of green fluorescent protein (GFP), a naturally glowing marker often used in genetic experiments, which contained different unnatural amino acids incorporated at a selected site.
“This was the smallest possible change we could make to the way life works—but it is the first ever,” said Romesberg.
Testing Hydrogen Bonds
The study is also the first to show that hydrogen bonds—thought to be crucial to the DNA-decoding process—may not be as important as scientists thought. “It would be very easy to say complementary hydrogen bonds are what define DNA and RNA,” said Romesberg. “But we’ve found that forces other than hydrogen bonding can productively participate in every step of information storage and retrieval.”
The scientists designed part of X and Y to be hydrophobic, so that they pair only with each other and repel the natural, hydrogen-bonding bases. This keeps X and Y from accidentally pairing with A, T, C or G.
It turns out that a lack of complementary hydrogen bonds doesn’t really bother cells. As the scientists found, X and Y were successfully transcribed and translated anyway.
“What is remarkable about our findings is not just the fact that the cells are able to transcribe and translate these hydrophobic unnatural bases, but that they do so very efficiently,” said TSRI Graduate Student Yorke Zhang, first author of the study. “We were able to achieve purities of desired amino acid incorporation above 98 percent, which demonstrates how seamlessly our synthetic bases can be integrated into the natural processes for encoding and decoding genetic information.”
This finding shows that perhaps the forces nature uses aren’t the only ones possible. “It’s very hard to ask questions about the origins of life. It’s hard to ask questions about why we are the way we are, why we are built the way we are, because we have nothing out there to compare ourselves to,” Romesberg said. “We’ve now given the field a comparison. It’s a small step, but it’s the first successful step.”
One reason this is just a small step is that every X or Y had to be sandwiched between natural bases during replication. While Romesberg said that the scientists are unlikely to be able to replicate long stretches of only unnatural bases, he emphasized that the bases in this study weren’t just along for the ride. “These hydrophobic forces aren’t just bystanders either—they are actively participating,” he said.
The scientists added that it is impossible for this semi-synthetic organism to live outside the lab, because no lifeform can produce X and Y on its own. The bases are available to the cells only when scientists supply the right chemicals.
In addition to Romesberg and Zhang, authors of the study, “A Semi-Synthetic Organism that Stores and Retrieves Increased Genetic Information,” were Emil C. Fischer and Aaron W. Feldman of TSRI; and Kristine San Jose, Carolina E. Caffaro, Jerod L. Ptacin, Hans R. Aerni and Court R. Turner of Synthorx, Inc.
The study was supported by the National Institutes of Health (grants GM060005 and GM118178) and a National Science Foundation Graduate Research Fellowship (grant NSF/DGE-1346837).