Proteins fold into unique native structures stabilized by thousands of weak interactions that collectively overcome the entropic cost of folding. Although these forces are “encoded” in the thousands of known protein structures, “decoding” them is challenging because of the complexity of natural proteins that have evolved for function, not stability. We combined computational protein design, next-generation gene synthesis, and a high-throughput protease susceptibility assay to measure folding and stability for more than 15,000 de novo designed mini proteins, 1000 natural proteins, 10,000 point mutants, and 30,000 negative control sequences. This analysis identified more than 2500 stable designed proteins in four basic folds—a number sufficient to enable us to systematically examine how sequence determines folding and stability in uncharted protein space. Iteration between design and experiment increased the design success rate from 6% to 47%, produced stable proteins unlike those found in nature for topologies where design was initially unsuccessful, and revealed subtle contributions to stability as designs became increasingly optimized. Our approach achieves the long-standing goal of a tight feedback cycle between computation and experiment and has the potential to transform computational protein design into a data-driven science.
My takeaways:
1. This is the first successful case of researchers designing proteins rationally using data-driven science. Over 4 iterations of design improvements, they identified more than 2500 stable designed proteins across 4 basic folds. For reference, a 44 amino acid protein has 20^44 possible sequences, roughly 1.8 × 10^57 (a 58-digit number).
2. The knowledge gained from the technique they developed is going to be critical to designing therapeutic formulations for various diseases. Additionally, their methods and what they learned about how protein structure affects stability will allow scientists to create more stable proteins for a wide array of applications in medicine, electronics, sensing, energy, and more.
3. This type of protein design is really only possible with large-scale computing, in their case the Hyak supercomputer at the University of Washington. So while the methods they developed are an important first step, there are still many hurdles to designing functional proteins.
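The sequence-count figure in the first takeaway can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes each of the 44 positions can be any of the 20 standard amino acids (the alphabet size is my assumption, not something stated in the abstract), and the function name is made up for illustration.

```python
import math

# Assumption: 20 standard amino acids are allowed at every position.
AMINO_ACIDS = 20
# Protein length used in the takeaway above.
LENGTH = 44


def sequence_space(length: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct sequences of the given length (exact integer)."""
    return alphabet ** length


total = sequence_space(LENGTH)

# Order of magnitude: the power of ten in scientific notation.
exponent = math.floor(math.log10(total))

# Stable designs reported in the abstract, as a fraction of the full space.
stable_designs = 2500
fraction = stable_designs / total

print(f"possible sequences: {total:.2e}")
print(f"digits in the full number: {len(str(total))}")
print(f"fraction covered by 2500 stable designs: {fraction:.1e}")
```

The first line printed is 1.76e+57, so "1 with 57 zeros after it" is the right order of magnitude, and the 2500 stable designs are a vanishingly small sample of that space.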
Science 2017, Vol 357, p 168-175
David Baker, Department of Biochemistry and Institute for Protein Design, University of Washington
Funding is from the Howard Hughes Medical Institute