Researchers tune in to protein pairs

Rice University team quantifies how mutations affect cell signaling in bacteria

Rice University scientists have created a way to interpret interactions among pairs of task-oriented proteins that relay signals. The goal is to learn how the proteins avoid crosstalk and whether they can be tuned for better performance.

Each cell contains thousands of these two-component signaling proteins, which often act as sensors and trigger the cell to act.


Rice University researchers have developed a computational method to evaluate the effects of single mutations in the genomic sequences responsible for the ability of two-component signaling proteins to communicate. At top, a mutation enhances the signal transfer, giving the protein pair derived from a bacterium a positive Direct Information Score (DIS). At bottom, a different mutation to the same sequence results in little or no signal, and a negative DIS. Graphic by Ryan Cheng and Faruck Morcos

The new research has significance for bioengineers who try to understand and modify signaling pathways to treat disease or carry out tasks. A paper on the research done at the Center for Theoretical Biological Physics (CTBP) at Rice’s BioScience Research Collaborative appears online this month in the Proceedings of the National Academy of Sciences.

The research team led by Rice physicist José Onuchic and bioengineer Herbert Levine used the predictive power of their pioneering direct coupling analysis statistical method to compare the genomic roots of thousands of protein pairs collected from many different bacterial organisms. Their product is a new metric to judge how mutations affect the way the pairs work.

Two-component systems are usually disconnected proteins that transmit signals to trigger many types of actions within cells. In bacteria, for instance, the first component is a histidine kinase (HK) that senses conditions outside a cell and triggers the creation of a signal in a process called “autophosphorylation.” The signal (that is, the phosphoryl group) generated on the kinase can then be passed to a response regulator (RR) protein, akin to a baton in a relay race. The regulator takes the baton and then generates a physiological response through the activation or repression of genes.

A central question for the researchers was how these kinases and regulators coevolve to recognize each other in a crowd, and why there’s so little crosstalk leading to mismatched pairs.

“If we are going to figure out how cells are able to compute, we need to understand the specificity for interactions between different biomolecules,” Onuchic said. “In the specific case of this paper, that would be why a particular kinase interacts to one response regulator much more than another one.

“This is a real challenge, since all these response regulators are very similar and the same chemistry takes place. The molecular/genomic approach is creating the tools to quantitatively understand these processes.”

Because there can be tens of thousands of distinct signaling proteins in a single bacterium, this kind of analysis has been nearly impossible until recent times. First, a database of enough sequences of these HK/RR pairings had to be available to make a survey valid. Second, the researchers needed enough computational power to make sense of them.

The acceleration in recent years of genome sequencing gave Ryan Cheng and Faruck Morcos, both Rice postdoctoral researchers and lead authors of the study, plenty of raw material to work with as they chose several families of bacterial signaling systems to analyze. Rice’s DAVinCI and SUGAR supercomputing systems were instrumental in proving their Direct Information Score (DIS) method of quantifying the effects of mutations on the signaling capability of protein pairs.

“Nature uses the same template for all these proteins, so their structures are very similar,” Morcos said, “but they still manage to avoid crosstalk. There’s something in the interfaces and the way they interact that allows them to recognize each other.”

“We were motivated to address the question of how these proteins interact with their partners and not some other signaling network,” Cheng added.

They applied DIS to the genomic data of known signaling partners for Bacillus subtilis, bacteria found both in soil and in humans that sense environmental distress and control the process known as sporulation, by which the mother cell dies and forms spores. They also applied it to 22 versions with mutations that altered the proteins’ signaling process.

Rice researchers have created a method to analyze two-component signaling proteins and how mutations change their ability to communicate. Clockwise from left, postdoctoral researchers Faruck Morcos and Ryan Cheng and professors Herbert Levine and José Onuchic. Courtesy of the Center for Theoretical Biological Physics

The DIS values that resulted from their simulations showed the degree to which sporulation was either enhanced or decreased, depending on the specific mutation. Enhanced sporulation gave the mutants a positive DIS; decreased sporulation gave them a negative DIS. (Non-mutant, “wild-type” signaling partners were given a DIS of 0.) The scores, they found, attached specific numbers to the results of previous experiments by others.
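
The sign convention can be illustrated with a toy calculation. The sketch below is not the team's Direct Information Score; it assumes a made-up coupling table (`TOY_COUPLINGS`) and two hypothetical helpers (`pair_score`, `relative_score`), and it shows only how scoring a mutant relative to the wild type places the wild type at 0, with better-matched mutants above it and worse-matched mutants below it.

```python
from typing import Dict, Tuple

# Hypothetical coupling table, invented for illustration only:
# (kinase position, kinase residue, regulator position, regulator residue) -> weight
Couplings = Dict[Tuple[int, str, int, str], float]

TOY_COUPLINGS: Couplings = {
    (0, "A", 0, "L"): 0.5,   # wild-type contact
    (0, "G", 0, "L"): 1.5,   # mutation that strengthens the contact
    (0, "P", 0, "L"): -1.0,  # mutation that weakens the contact
    (1, "T", 1, "E"): 0.25,  # second contact, unchanged by the mutations
}


def pair_score(hk: str, rr: str, couplings: Couplings) -> float:
    """Sum the coupling weights over every kinase/regulator position pair."""
    total = 0.0
    for i, a in enumerate(hk):
        for j, b in enumerate(rr):
            total += couplings.get((i, a, j, b), 0.0)
    return total


def relative_score(mutant_hk: str, wild_hk: str, rr: str,
                   couplings: Couplings) -> float:
    """Score a mutant kinase relative to the wild type (wild type scores 0)."""
    return pair_score(mutant_hk, rr, couplings) - pair_score(wild_hk, rr, couplings)


if __name__ == "__main__":
    for kinase in ("AT", "GT", "PT"):
        print(kinase, relative_score(kinase, "AT", "LE", TOY_COUPLINGS))
```

In this toy setup, the wild-type kinase `"AT"` scores 0 by construction, the `"GT"` mutant scores above 0 and the `"PT"` mutant scores below 0, which mirrors the positive and negative values described above.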

“We can take any kinase and response regulator sequence and compute a score for it,” Cheng said. “We argue that the higher the score for a new mutant, the more effectively it will interact with its partner. Negative scores hurt the interaction – and sometimes that’s what you want to happen.”

After confirming similar experiments with genomic data from Escherichia coli protein pairs, Morcos and Cheng turned the tables. In new simulations, they scrambled thousands of protein pairs to make their matching capabilities truly random. Those that still found a relay partner led them to build a “null” DIS.

“The null model is a different concept,” Cheng said. “We were looking for very general properties of signaling, and we scrambled the proteins so we no longer had any information as to whether they were paired. They could even be from other organisms.”

“If the results showed us a signal, then it had to be a general signal about the binding mode that all pairs share,” Morcos said. He explained that subtracting those generalities from known binding pairs leaves sequence information that is specific only to those pairs. “We end up with sequence data that only refers to the two proteins that detect or recognize each other,” he said. That should allow experimentalists to improve their ability to make mutations for specific purposes, he noted.
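
The idea of subtracting a scrambled baseline can be sketched in a few lines. This is an illustration under stated assumptions, not the researchers' code: it swaps in plain mutual information between one kinase position and one regulator position in place of direct coupling analysis, and the function names (`mutual_information`, `scrambled_baseline`, `specific_signal`) and the toy data are hypothetical.

```python
import math
import random
from collections import Counter
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]  # (kinase residue, regulator residue) at one position pair


def mutual_information(pairs: Sequence[Pair]) -> float:
    """Mutual information (in nats) between the two residue columns."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) / (p(a) * p(b)) simplifies to count * n / (left[a] * right[b])
        mi += (count / n) * math.log(count * n / (left[a] * right[b]))
    return mi


def scrambled_baseline(pairs: Sequence[Pair], trials: int = 200,
                       seed: int = 0) -> float:
    """Average mutual information after randomly re-pairing the partners."""
    rng = random.Random(seed)
    hk = [a for a, _ in pairs]
    rr = [b for _, b in pairs]
    total = 0.0
    for _ in range(trials):
        rng.shuffle(rr)  # destroys the knowledge of who is paired with whom
        total += mutual_information(list(zip(hk, rr)))
    return total / trials


def specific_signal(pairs: Sequence[Pair], trials: int = 200,
                    seed: int = 0) -> float:
    """Signal in the true pairing minus the signal that survives scrambling."""
    return mutual_information(pairs) - scrambled_baseline(pairs, trials, seed)


# Toy data: two families whose partners always carry matching residues.
toy_pairs: List[Pair] = [("K", "E")] * 10 + [("D", "R")] * 10

if __name__ == "__main__":
    print("paired:", mutual_information(toy_pairs))
    print("scrambled baseline:", scrambled_baseline(toy_pairs))
    print("pair-specific:", specific_signal(toy_pairs))
```

Whatever remains after the subtraction is attributable to the true pairing rather than to features that persist when partners are assigned at random, which is the role the “null” model plays in the passage above.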

“All of this is very useful for people who want to re-engineer bacterial systems,” said Cheng. Long-term potential uses include engineering proteins to design biosensors and to better break down contaminants at the site of an environmental spill, he said.

“It has been a challenge to use the molecular biophysics of specific proteins to predict how mutations in those proteins change the cellular phenotype,” Levine said. “Here we succeed in showing that the intelligent combination of physical simulation and informatics-based methods can crack this tough problem.”

The National Science Foundation (NSF) and the Cancer Prevention and Research Institute of Texas supported the research.

Onuchic is Rice’s Harry C. and Olga K. Wiess Professor of Physics and Astronomy. Levine is the Karl F. Hasselmann Professor in Bioengineering. Onuchic and Levine are co-directors of the CTBP. The researchers utilized the Data Analysis and Visualization Cyberinfrastructure (DAVinCI) supercomputer supported by the NSF and the Shared University Grid at Rice (SUGAR) supercomputer, both administered by Rice’s Ken Kennedy Institute for Information Technology.

About Mike Williams

Mike Williams is a senior media relations specialist in Rice University's Office of Public Affairs.