Integrative Structure Prediction of Biomolecules and Their Complexes

Published: 2025-11-13 | Source: ShanghaiTech University | Editor: System Administrator

Deep learning has revolutionized protein structure prediction, yet challenges remain in modeling difficult proteins with few homologs and multi-domain complexes. To address this, we developed DeepMSA2, a pipeline that constructs unified multiple sequence alignments (MSAs) through iterative searches of genomic databases. These MSAs significantly enhance prediction accuracy for both monomers and complexes. Building on DeepMSA2, we created D-I-TASSER, which integrates deep learning constraints with statistical energy functions. Benchmarking shows that D-I-TASSER outperforms AlphaFold2/3 in single- and multi-domain prediction, enabling high-accuracy modeling for 81% of human proteome domains and 73% of full-length sequences. For complexes, DMFold, which integrates DeepMSA2 with AlphaFold2, excels in CASP15 tasks and in modeling large assemblies. Our methods, ranked top in CASP13-16, have demonstrated practical utility in antibody screening against monkeypox virus, structure determination of hemorrhagic fever virus nucleoprotein, and nanobody selection for pathogen detection. These advances expand the applications of protein structure prediction in disease research, drug development, and public health, paving the way for broader interdisciplinary innovation.