The paper describes the development of Protein2PAM, a deep learning framework designed to predict and engineer the protospacer-adjacent motif (PAM) specificity of CRISPR–Cas enzymes. By training on a massive dataset of over 45,000 sequences known as the CRISPR–Cas Atlas, the model identifies critical protein-DNA interactions without requiring complex structural data. Researchers used this tool to perform in silico mutagenesis, successfully designing Nme1Cas9 variants with customized or broadened recognition capabilities. These engineered enzymes achieved up to a 50-fold increase in cleavage rates compared to wild-type versions, significantly improving genomic targeting flexibility. Ultimately, this machine learning approach offers a rapid, scalable alternative to labor-intensive experimental methods for optimizing gene editing tools.
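
The in silico mutagenesis step mentioned above can be illustrated with a minimal sketch: enumerate every single amino-acid substitution at chosen positions, score each mutant with a predictor, and rank the results. The summary does not describe Protein2PAM's actual interface, so everything named here is an assumption: `single_mutants`, `rank_mutants`, and `toy_score` are hypothetical helpers, and `toy_score` is a biologically meaningless placeholder standing in for a trained model.

```python
from typing import Callable, List, Tuple

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_mutants(seq: str, positions: List[int]) -> List[Tuple[str, str]]:
    """Return (label, mutant_sequence) for every single substitution
    at the given 0-based positions. Labels use 1-based numbering, e.g. 'A2K'."""
    mutants = []
    for pos in positions:
        wild_type = seq[pos]
        for aa in AMINO_ACIDS:
            if aa == wild_type:
                continue  # skip the unchanged residue
            label = f"{wild_type}{pos + 1}{aa}"
            mutants.append((label, seq[:pos] + aa + seq[pos + 1:]))
    return mutants


def rank_mutants(
    seq: str,
    positions: List[int],
    score_fn: Callable[[str], float],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Score every single mutant with score_fn and return the top_k by score."""
    scored = [(label, score_fn(mut)) for label, mut in single_mutants(seq, positions)]
    scored.sort(key=lambda item: item[1], reverse=True)  # highest score first
    return scored[:top_k]


def toy_score(seq: str) -> float:
    """HYPOTHETICAL placeholder for a trained PAM predictor.
    Returns the fraction of basic residues (K/R); not biologically meaningful."""
    return sum(seq.count(aa) for aa in "KR") / len(seq)


if __name__ == "__main__":
    toy_protein = "MAAFKPNSINYILGLD"  # made-up sequence, for illustration only
    for label, score in rank_mutants(toy_protein, [1, 2, 3], toy_score, top_k=3):
        print(label, round(score, 3))
```

In a real workflow, `score_fn` would wrap a trained model that scores how well each mutant's predicted PAM matches a target motif. The top-ranked candidates would still need experimental validation, as with the Nme1Cas9 variants described above.
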
References:
- Nayfach S, Bhatnagar A, Novichkov A, et al. Customizing CRISPR–Cas PAM specificity with protein language models[J]. Nature Biotechnology, 2026: 1-10.