Accurate and Rapid Ranking of Protein–Ligand Binding Affinities Using Density Matrix Fragmentation and Physics‐Informed Machine Learning Dispersion Potentials
Provided by Wiley-VCH
Two efficient methods, generalized many-body expansion for building density matrices (GMBE-DM) and D3-ML, are introduced for ranking protein–ligand binding affinities. GMBE-DM delivers quantum-accurate results within minutes, while D3-ML achieves even higher accuracy in under one second per complex. Both methods show strong correlation with experimental data, enabling fast and scalable applications in drug discovery.
The generalized many-body expansion for building density matrices (GMBE-DM), truncated at the one-body level and combined with a purification scheme, is applied to rank protein–ligand binding affinities across two cyclin-dependent kinase 2 (CDK2) datasets and one Janus kinase 1 (JAK1) dataset, totaling 28 ligands. This quantum fragmentation-based method achieves strong correlation with experimental binding free energies (R² = 0.84), while requiring less than 5 min per complex without extensive parallelization, making it highly efficient for rapid drug screening and lead prioritization. In addition, our physics-informed, machine learning-corrected dispersion potential (D3-ML) demonstrates even stronger ranking performance (R² = 0.87), effectively capturing binding trends through favorable cancelation of non-dispersion, solvation, and entropic contributions, emphasizing the central role of dispersion interactions in protein–ligand binding. With sub-second runtime per complex, D3-ML offers exceptional speed and accuracy, making it ideally suited for high-throughput virtual screening. By comparison, the deep learning model Sfcnn shows lower transferability across datasets (R² = 0.57), highlighting the limitations of broadly trained neural networks in chemically diverse systems. Together, these results establish GMBE-DM and D3-ML as robust and scalable tools for protein–ligand affinity ranking, with D3-ML emerging as a particularly promising candidate for large-scale applications in drug discovery.
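The ranking quality quoted above is reported as R² against experimental binding free energies. As a minimal sketch of how such a comparison could be scored, the Python snippet below computes a squared Pearson correlation and a rank order. It assumes R² means the squared Pearson coefficient, which the abstract does not state. The function names and the numerical values are hypothetical placeholders, not data or code from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def r_squared(computed, experimental):
    """Squared Pearson correlation (assumed definition of R^2)."""
    return pearson_r(computed, experimental) ** 2


def rank_order(scores):
    """Ligand indices sorted from most to least favorable (most negative first)."""
    return sorted(range(len(scores)), key=lambda i: scores[i])


if __name__ == "__main__":
    # Hypothetical values in kcal/mol, for illustration only.
    computed = [-10.0, -8.0, -9.0, -7.0]
    experimental = [-9.5, -8.2, -9.1, -7.4]
    print("R^2:", round(r_squared(computed, experimental), 3))
    print("computed ranking:", rank_order(computed))
    print("experimental ranking:", rank_order(experimental))
```

Because the squared Pearson coefficient is unchanged by any non-constant linear rescaling of the scores, a method can rank ligands well by this measure even if its raw energies are offset or scaled relative to experiment.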




