Why some genes are more error-prone: Scientists uncover hidden rule in DNA transcription

Study offers map for improving synthetic and therapeutic RNA

Every living cell must interpret its genetic code — a sequence of chemical letters that governs countless cellular functions. A new study by researchers from the Center for Theoretical Biological Physics at Rice University has uncovered the mechanism by which the identity of the letters following a given nucleotide in DNA affects the likelihood of mistakes during transcription, the process by which DNA is copied into RNA. The discovery offers new insight into hidden factors that influence transcription accuracy.

Anatoly Kolomeisky. Photo by Jeff Fitlow/Rice University.

The study, authored by Tripti Midha, Anatoly Kolomeisky and Oleg Igoshin and published in the Proceedings of the National Academy of Sciences on July 9, shows why genetic sequences are not all equally prone to errors: the identity of the two nucleotides immediately downstream of a site significantly alters the error rate during transcription. The discovery builds on the same authors' earlier work on enzymatic proofreading mechanisms by factoring in the distinct kinetics of different nucleotide additions.

“It’s not just the letter itself that matters but its downstream neighbors,” said Igoshin, professor of bioengineering, chemistry and biosciences.

Kinetic speed and sequence dependence

Cells rely on RNA polymerases to transcribe DNA into RNA with high fidelity. Although error rates are generally low, occasional mistakes can disrupt protein function or regulation. Until now, how the local DNA context affects these errors has not been well understood.

The research team developed a theoretical framework that links transcription fidelity to the speed of nucleotide incorporation. Their model indicates that faster-incorporating bases, such as adenine (A) and guanine (G), reduce the time available for error-correction (proofreading), thus increasing error rates. In contrast, slower-incorporating bases, such as cytosine (C) and uracil (U), allow for more time to correct errors.
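The trade-off between speed and proofreading can be illustrated with a toy calculation. The Python sketch below is not the authors' model. It assumes a simple two-pathway competition in which, after a wrong base is added, the polymerase either removes it or adds the next nucleotide and locks the error in. The rate values and the names `INCORPORATION_RATE`, `PROOFREADING_RATE` and `escape_probability` are hypothetical; only the ordering (A and G faster than C and U) comes from the text.

```python
# Toy kinetic competition: after a misincorporation, the polymerase either
# excises the wrong base (proofreading) or adds the next nucleotide, which
# locks the error in. All numbers here are made-up illustrations.

# Hypothetical relative incorporation rates (arbitrary units). Only the
# ordering, A and G faster than C and U, follows the article.
INCORPORATION_RATE = {"A": 4.0, "G": 3.0, "C": 1.0, "U": 0.8}

# Hypothetical proofreading rate in the same arbitrary units.
PROOFREADING_RATE = 10.0


def escape_probability(next_base: str,
                       proofreading_rate: float = PROOFREADING_RATE) -> float:
    """Chance that adding `next_base` happens before proofreading does.

    For two competing first-order processes, the probability that one
    fires first is its rate divided by the sum of both rates.
    """
    k_ext = INCORPORATION_RATE[next_base]
    return k_ext / (k_ext + proofreading_rate)


if __name__ == "__main__":
    for base in "AGCU":
        print(base, round(escape_probability(base), 3))
```

Under these assumed numbers, an error followed by a fast base such as A is more likely to escape correction than one followed by a slow base such as C, which is the qualitative effect the model describes.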

The researchers tested their model against a suite of recently published experimental datasets and found strong agreement across various genomic contexts.

“The kinetic principles we developed can predict regions where errors are likely to occur, expanding on previous models of transcription fidelity that did not uncover the long-range sequence dependence,” Igoshin said.

Implications for genetic disease risk

To understand the implications, the study focused on the BRCA1 gene, which plays a critical role in preventing breast and ovarian cancer. By analyzing the nucleotide sequence of BRCA1, the team discovered that the sequence dependence of errors affects the likelihood of premature stop codons. A premature stop codon can truncate the BRCA1 protein, impairing its function in DNA repair and elevating cancer risk.
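One ingredient of such an analysis, finding positions where a single wrong base would create a stop codon, can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the study's pipeline: the function name `stop_prone_sites` is hypothetical, the input is assumed to be an in-frame RNA coding sequence, and the example sequence is a made-up nine-letter string, not BRCA1. It also ignores the sequence-dependent error rates that are the study's actual contribution.

```python
# The three stop codons of the standard genetic code, in RNA letters.
STOP_CODONS = {"UAA", "UAG", "UGA"}
BASES = "ACGU"


def stop_prone_sites(coding_rna: str):
    """List single-base substitutions that turn a sense codon into a stop.

    Returns tuples of (position, wrong_base, original_codon, stop_codon),
    where position is the 0-based index into `coding_rna`.
    Assumes the sequence starts in frame; trailing partial codons are skipped.
    """
    hits = []
    usable_length = len(coding_rna) - len(coding_rna) % 3
    for start in range(0, usable_length, 3):
        codon = coding_rna[start:start + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon, nothing to create
        for offset in range(3):
            for base in BASES:
                if base == codon[offset]:
                    continue
                mutated = codon[:offset] + base + codon[offset + 1:]
                if mutated in STOP_CODONS:
                    hits.append((start + offset, base, codon, mutated))
    return hits


if __name__ == "__main__":
    # Made-up example: AUG UGG GCU (Met, Trp, Ala).
    for hit in stop_prone_sites("AUGUGGGCU"):
        print(hit)
```

In the made-up example, only the tryptophan codon UGG is one substitution away from a stop codon (UAG or UGA). A fuller analysis would weight each such site by its predicted error rate.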

“Elevated rates of premature termination caused by sequence-dependent transcriptional errors in important genes like BRCA1 reveal a previously unrecognized layer of genetic vulnerability, deepening our understanding of disease mechanisms and inherited risk,” said Kolomeisky, professor of chemistry.

“Researchers now have a tool to better map and predict where harmful transcription errors might occur,” Kolomeisky said.

Toward predictive and preventive strategies

By clarifying how the DNA sequence affects transcription accuracy, the study offers a new perspective on transcription fidelity, suggesting that errors do not occur at random locations but are instead influenced by the kinetic rates of nucleotide incorporation.

“Biotechnologists could use this model to engineer gene sequences with inherently lower error rates, potentially improving the error-free fractions of synthetic and therapeutic RNA,” said Midha, postdoctoral fellow at the Center for Theoretical Biological Physics and first author of the study.

This research was supported by the National Science Foundation and the Welch Foundation.
