This notebook implements a typical protocol for docking ligands to a target protein. It uses RDKit (http://www.rdkit.org) to generate a number of reasonable conformations for each ligand and then uses SMINA (https://sourceforge.net/projects/smina/) to do the docking. Two methods of docking are implemented: the first docks into a rigid receptor, and the second sets the protein side chains around the active site to be flexible. Bear in mind that flexible docking will be much slower. In the optional final step the resulting docked poses are rescored using a random forest model described in https://www.nature.com/articles/srep46710.
import sys
from collections import defaultdict
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import PandasTools
import pandas as pd
IPythonConsole.ipython_3d=True
%pylab inline
import py3Dmol
Populating the interactive namespace from numpy and matplotlib
First we need the location of the input file of structures you want to dock; replace "asinexSelectionexport.sdf" with your file. You may also want to rename the output file for the conformations and the output file containing the docked structures.
The SDF file needs to have the molecule name in the first line of each molecule record, for example:
AEM 10028511
MOE2019 2D
22 24 0 0 0 0 0 0 0 0999 V2000
7.2040 -6.7290 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.3790 -6.7290 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
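The title-line requirement can also be checked on the raw file before RDKit reads it. The sketch below is a minimal illustration that assumes records are terminated by a `$$$$` line, as in standard SD files; the helper name `blank_title_records` and the two-record example text are hypothetical.

```python
def blank_title_records(sdf_text):
    """Return 0-based indices of SD records whose first (title) line is blank."""
    blank = []
    record_index = 0
    title = None  # first line of the record currently being read
    for line in sdf_text.splitlines():
        if title is None:
            title = line
        if line.strip() == "$$$$":  # end of the current record
            if not title.strip():
                blank.append(record_index)
            record_index += 1
            title = None
    return blank


# Hypothetical two-record file: the second record has no name.
example = (
    "AEM 10028511\n  MOE2019 2D\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
    "\n  MOE2019 2D\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
)
print(blank_title_records(example))  # [1]
```

To check the real input, pass `open(sdfFilePath).read()` to the function.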
# File locations
sdfFilePath = 'asinexSelectionexport.sdf' # The input file of structures to generate conformations from
ConfoutputFilePath = 'asinexSelectionForDocking.sdf' # Output file containing conformations for docking
inputMols = [x for x in Chem.SDMolSupplier(sdfFilePath,removeHs=False)]
len(inputMols) # Check how many structures were read
10
# Check that every molecule was read and has a name
for i, mol in enumerate(inputMols):
    if mol is None:
        print('Warning: Failed to read molecule %s in %s' % (i, sdfFilePath))
    elif not mol.GetProp('_Name'):
        print('Warning: No name for molecule %s in %s' % (i, sdfFilePath))
We next generate conformations. This uses parallelisation code from http://www.rdkit.org/docs/Cookbook.html contributed by Andrew Dalke. We don't use all the cores on a desktop machine, or it might become unresponsive. If you are running on a cluster you should modify this.
import multiprocessing
# concurrent.futures is in the Python 3 standard library; the backport at http://pypi.python.org/pypi/futures is only needed on Python 2
from concurrent import futures
# conda install progressbar
import progressbar
# Find the number of cores available; leave two free or the system might become unresponsive
numcores = multiprocessing.cpu_count()
max_workers = max(1, numcores - 2)  # always keep at least one worker
# Knowledge-based torsion generator (ETKDG): http://pubs.acs.org/doi/abs/10.1021/acs.jcim.5b00654
ps = AllChem.ETKDG()
ps.pruneRmsThresh = 0.5  # discard conformers within 0.5 Å RMSD of one already kept
ps.numThreads = 0  # 0 lets RDKit use all available threads for embedding
# Edit for the number of conformations desired, e.g. n = 5
n = 5
# This function is called in the subprocess.
# The parameters (molecule, number of conformers and name) are pickled and passed to the worker process.
def generateconformations(m, n, name):
    m = Chem.AddHs(m)
    ids = AllChem.EmbedMultipleConfs(m, n, ps)
    for conf_id in ids:
        AllChem.UFFOptimizeMolecule(m, confId=conf_id)
    # EmbedMultipleConfs returns a Boost-wrapped type which
    # cannot be pickled. Convert it to a Python list, which can.
    return m, list(ids), name
writer = Chem.SDWriter(ConfoutputFilePath)
with futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
    # Submit a set of asynchronous jobs
    jobs = []
    for mol in inputMols:
        if mol:
            name = mol.GetProp('_Name')
            job = executor.submit(generateconformations, mol, n, name)
            jobs.append(job)
    widgets = ["Generating conformations; ", progressbar.Percentage(), " ",
               progressbar.ETA(), " ", progressbar.Bar()]
    pbar = progressbar.ProgressBar(widgets=widgets, maxval=len(jobs))
    # Results arrive in completion order, so restore each name before writing
    for job in pbar(futures.as_completed(jobs)):
        mol, ids, name = job.result()
        mol.SetProp('_Name', name)
        for conf_id in ids:
            writer.write(mol, confId=conf_id)
writer.close()
Generating conformations; 100% Time: 0:00:00 |################################|
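The submit / as_completed pattern used for conformer generation can be seen in isolation in the sketch below. It is an illustration only: `work` is a hypothetical stand-in for `generateconformations`, and a `ThreadPoolExecutor` replaces the `ProcessPoolExecutor` so that the example runs in any environment. Because results come back in completion order rather than submission order, each job returns its name alongside its result. Passing the name explicitly is also a safe habit because RDKit molecules can lose properties such as `_Name` when pickled with default settings.

```python
from concurrent import futures


def work(name, n):
    """Hypothetical stand-in for generateconformations: returns n fake conformer ids."""
    return name, list(range(n))


def run_jobs(names, n, max_workers=2):
    """Submit one job per name and collect the results as they complete."""
    results = {}
    with futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        jobs = [executor.submit(work, name, n) for name in names]
        for job in futures.as_completed(jobs):
            name, ids = job.result()  # the name travels with the result
            results[name] = ids
    return results


print(run_jobs(["lig_a", "lig_b", "lig_c"], n=3))
```

For CPU-bound work such as embedding, a process pool as used in the notebook is the better choice; the collection loop stays the same.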
ms = [x for x in Chem.SDMolSupplier(ConfoutputFilePath,removeHs=False)]
# Assign atomic chirality based on the structures:
for m in ms: Chem.AssignAtomChiralTagsFromStructure(m)
len(ms) # check how many conformations
43
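Fewer conformations are written than molecules × n (10 × 5 = 50 here) because, presumably, pruneRmsThresh discards near-duplicate conformers and any molecule that failed to load is skipped. A small sketch for finding which ligands ended up with fewer than the requested number is shown below; the function name `conformer_shortfall` and the example names are hypothetical.

```python
from collections import Counter


def conformer_shortfall(names, requested):
    """Map each name that has fewer than `requested` conformers to its count."""
    counts = Counter(names)
    return {name: count for name, count in counts.items() if count < requested}


# Hypothetical names, one entry per conformer written to the SDF.
names = ["lig_a"] * 5 + ["lig_b"] * 3 + ["lig_c"] * 5
print(conformer_shortfall(names, requested=5))  # {'lig_b': 3}
```

In the notebook the names would come from `[m.GetProp('_Name') for m in ms]`.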
After generating the conformations we can do the docking. This example uses smina, which can be downloaded from https://sourceforge.net/projects/smina/; you will need to know where smina has been installed. The protein and ligand examples provided are taken from https://fragalysis.diamond.ac.uk/viewer/react/preview/target/MURD (MURD-x0373).
Docking using smina
The docking step needs:
the protein minus the ligand, in PDB format;
the ligand extracted from the binding site, in PDB format;
the conformations to be docked, as the SDF written by the conformation generation above.
DockedFilePath = 'All_Docked.sdf.gz' is the output file for the docked structures.
ProteinForDocking = 'protein_minus_ligand.pdb'
LigandFromProtein = '373ligand_only.pdb'
DockedFilePath = 'All_Docked.sdf.gz'
FlexibleDockedFilePath = 'FlexDocked.sdf.gz'
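Before launching a long docking run it can help to confirm that the input files are where the variables say they are. The sketch below is one way to do that with the standard library; the helper name `missing_inputs` is hypothetical.

```python
from pathlib import Path


def missing_inputs(paths):
    """Return the paths (as strings) that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]


# File names as defined in this notebook; the result depends on the working directory.
print(missing_inputs(["protein_minus_ligand.pdb",
                      "373ligand_only.pdb",
                      "asinexSelectionForDocking.sdf"]))
```

An empty list means every input was found.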
!'/usr/local/bin/smina.osx' --cpu 10 --seed 0 --autobox_ligand '{LigandFromProtein}' -r '{ProteinForDocking}' -l '{ConfoutputFilePath}' -o '{DockedFilePath}'
smina is based off AutoDock Vina. Please cite appropriately.
Weights Terms
-0.035579 gauss(o=0,_w=0.5,_c=8)
-0.005156 gauss(o=3,_w=2,_c=8)
0.840245 repulsion(o=0,_c=8)
-0.035069 hydrophobic(g=0.5,_b=1.5,_c=8)
-0.587439 non_dir_h_bond(g=-0.7,_b=0,_c=8)
1.923 num_tors_div
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.6 0.000 0.000
2 -6.4 1.691 6.549
3 -5.2 2.064 6.893
4 -5.1 3.766 6.709
5 -5.0 3.613 6.298
6 -4.8 3.740 6.411
7 -4.6 4.676 8.225
8 -4.6 5.892 8.673
9 -4.5 3.176 5.524
Refine time 7.423
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.7 0.000 0.000
2 -6.3 1.876 6.678
3 -4.9 3.628 6.174
4 -4.9 3.733 6.342
5 -4.8 3.711 6.584
6 -4.5 3.799 6.532
7 -4.4 4.702 8.174
8 -4.3 4.226 7.838
9 -4.1 1.771 6.292
Refine time 7.661
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.5 0.000 0.000
2 -5.5 3.965 6.740
3 -5.2 3.556 6.398
4 -5.2 1.770 6.106
5 -5.2 4.479 7.492
6 -5.1 3.709 6.573
7 -4.9 3.940 6.593
8 -4.8 4.903 8.323
9 -4.7 4.174 9.040
Refine time 7.611
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.4 0.000 0.000
2 -4.6 3.707 7.904
3 -4.6 3.299 6.242
4 -4.4 3.029 7.644
5 -4.4 3.723 6.810
6 -4.2 2.701 3.547
7 -4.2 4.196 8.036
8 -4.1 4.284 6.331
9 -3.9 3.557 7.935
Refine time 8.006
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.3 0.000 0.000
2 -4.9 1.904 3.218
3 -4.4 3.904 5.046
4 -4.3 3.449 4.725
5 -4.2 5.612 8.383
6 -4.1 4.318 5.982
7 -4.1 3.204 6.450
8 -3.9 3.179 6.959
9 -3.9 5.434 8.021
Refine time 8.963
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.2 0.000 0.000
2 -4.8 2.639 6.521
3 -4.6 1.809 2.894
4 -4.3 2.701 4.702
5 -4.3 2.804 4.904
6 -4.2 3.069 5.465
7 -4.0 3.480 5.641
8 -4.0 4.415 8.249
9 -4.0 4.247 6.132
Refine time 8.820
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.1 0.000 0.000
2 -4.3 3.937 4.927
3 -4.3 3.455 4.618
4 -4.2 4.211 6.015
5 -4.0 4.521 5.983
6 -3.9 5.542 8.312
7 -3.9 3.130 6.166
8 -3.8 3.191 6.780
9 -3.5 4.205 5.721
Refine time 8.783
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.1 0.000 0.000
2 -4.3 3.902 5.022
3 -4.2 4.219 5.998
4 -4.1 4.430 5.675
5 -4.1 2.654 3.833
6 -4.0 1.885 3.109
7 -3.9 5.564 8.342
8 -3.9 3.222 6.896
9 -3.9 3.142 6.161
Refine time 9.297
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.6 0.000 0.000
2 -6.4 2.262 7.803
3 -5.9 3.647 4.488
4 -5.9 3.091 7.882
5 -5.6 2.940 7.976
6 -5.3 4.377 8.572
7 -5.1 2.801 8.087
8 -5.1 4.518 6.633
9 -5.0 7.305 10.294
Refine time 8.994
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.5 0.000 0.000
2 -5.4 1.923 2.555
3 -5.3 1.492 1.565
4 -4.9 1.537 2.259
5 -4.9 3.787 4.803
6 -4.7 4.300 6.234
7 -4.7 2.877 8.280
8 -4.6 4.159 5.546
9 -4.6 4.707 5.818
Refine time 8.440
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.6 0.000 0.000
2 -5.6 1.408 1.814
3 -5.5 0.044 1.083
4 -5.5 2.205 2.994
5 -5.5 2.177 2.807
6 -5.0 4.199 6.010
7 -4.8 2.593 3.914
8 -4.7 4.627 5.658
9 -4.7 4.181 5.584
Refine time 8.045
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.8 0.000 0.000
2 -5.5 2.150 3.452
3 -5.3 4.201 8.009
4 -5.3 2.249 3.595
5 -4.9 3.191 4.088
6 -4.8 1.074 1.280
7 -4.7 5.563 6.404
8 -4.5 2.441 3.553
9 -4.5 3.472 6.748
Refine time 8.813
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.5 0.000 0.000
2 -5.3 1.458 2.016
3 -5.3 2.002 2.718
4 -4.9 1.647 2.567
5 -4.9 2.086 3.031
6 -4.8 4.019 5.463
7 -4.8 2.472 3.756
8 -4.7 2.506 3.984
9 -4.6 4.144 5.995
Refine time 9.316
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -4.0 0.000 0.000
2 -3.7 2.331 4.470
3 -3.7 2.490 4.267
4 -3.6 2.224 5.213
5 -3.4 2.288 5.585
6 -3.3 2.925 4.924
7 -3.3 2.217 4.151
8 -3.3 2.772 4.609
9 -3.3 3.480 5.495
Refine time 10.502
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.1 0.000 0.000
2 -4.7 3.888 6.146
3 -4.4 3.805 7.865
4 -4.3 3.856 7.945
5 -4.1 5.802 8.541
6 -4.1 4.510 8.611
7 -4.0 4.410 6.761
8 -3.9 3.877 5.808
9 -3.8 4.090 7.813
Refine time 10.866
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -3.9 0.000 0.000
2 -3.8 2.012 5.782
3 -3.7 2.442 5.258
4 -3.6 2.804 4.802
5 -3.6 2.335 4.407
6 -3.5 3.883 5.980
7 -3.4 2.337 5.880
8 -3.4 2.097 5.783
9 -3.3 2.753 4.622
Refine time 10.721
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -3.9 0.000 0.000
2 -3.7 3.478 6.514
3 -3.6 1.881 5.596
4 -3.5 3.624 7.243
5 -3.3 5.842 8.953
6 -3.2 1.218 2.258
7 -3.1 2.803 6.151
8 -3.0 3.761 7.265
9 -3.0 1.693 2.835
Refine time 11.194
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -3.9 0.000 0.000
2 -3.8 2.433 4.703
3 -3.7 2.099 5.214
4 -3.5 3.085 5.900
5 -3.5 3.197 4.969
6 -3.5 2.215 5.389
7 -3.4 2.376 4.058
8 -3.4 2.470 4.285
9 -3.3 2.243 4.548
Refine time 11.177
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -4.8 0.000 0.000
2 -4.8 0.096 2.825
3 -4.5 5.037 8.390
4 -4.4 4.278 6.491
5 -4.4 4.527 7.232
6 -4.3 3.878 6.665
7 -4.2 3.935 6.812
8 -4.1 4.658 7.202
9 -4.0 6.043 9.424
Refine time 13.318
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.5 0.000 0.000
2 -5.4 0.099 2.948
3 -4.2 3.680 6.965
4 -4.0 3.791 7.178
5 -3.9 5.718 7.478
6 -3.9 4.339 7.655
7 -3.9 4.728 6.965
8 -3.9 6.445 8.935
9 -3.9 6.369 7.408
Refine time 14.264
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -4.8 0.000 0.000
2 -4.7 3.826 6.535
3 -4.6 3.832 6.610
4 -4.5 4.368 6.895
5 -4.5 4.473 7.248
6 -4.3 4.095 6.250
7 -4.2 5.528 8.424
8 -4.2 5.509 8.377
9 -4.1 3.854 6.149
Refine time 13.555
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.0 0.000 0.000
2 -4.9 0.399 2.807
3 -4.8 3.834 6.947
4 -4.6 3.773 7.379
5 -4.3 3.684 6.460
6 -3.9 3.556 6.813
7 -3.7 4.060 7.817
8 -3.7 4.289 8.313
9 -3.3 4.349 7.594
Refine time 14.435
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.3 0.000 0.000
2 -5.0 2.021 3.148
3 -4.8 1.431 1.862
4 -4.8 3.316 4.530
5 -4.7 1.860 2.884
6 -4.6 2.737 4.031
7 -4.5 1.948 2.547
8 -4.1 3.735 6.157
9 -4.0 3.349 5.121
Refine time 2.640
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.9 0.000 0.000
2 -5.8 2.538 7.374
3 -5.7 2.830 7.585
4 -5.7 1.955 2.747
5 -5.6 2.190 3.060
6 -5.2 3.447 8.188
7 -5.1 1.458 7.546
8 -5.1 2.641 4.252
9 -4.8 2.932 5.756
Refine time 16.686
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.2 0.000 0.000
2 -6.1 2.342 3.158
3 -6.0 2.784 4.461
4 -5.9 2.237 8.958
5 -5.9 2.102 8.979
6 -5.7 2.973 8.601
7 -5.4 3.626 8.572
8 -5.4 2.278 2.786
9 -5.3 1.378 2.287
Refine time 15.911
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.1 0.000 0.000
2 -5.8 1.890 2.161
3 -5.8 2.582 3.757
4 -5.6 2.349 8.772
5 -5.4 3.179 8.866
6 -5.3 3.129 8.301
7 -5.1 1.215 2.177
8 -5.0 3.947 8.235
9 -4.9 2.979 8.857
Refine time 18.045
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.7 0.000 0.000
2 -6.4 1.929 2.166
3 -6.1 2.612 3.310
4 -6.0 3.095 4.498
5 -6.0 2.352 8.926
6 -5.8 2.419 3.690
7 -5.6 3.461 8.996
8 -5.6 4.207 6.145
9 -5.4 2.986 9.026
Refine time 16.864
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.5 0.000 0.000
2 -6.3 2.260 9.007
3 -6.0 2.010 8.877
4 -5.4 2.636 9.309
5 -5.3 2.775 9.131
6 -5.3 2.249 2.909
7 -5.2 3.493 9.264
8 -5.2 1.208 2.035
9 -5.2 3.028 4.741
Refine time 16.226
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.4 0.000 0.000
2 -4.2 1.346 2.092
3 -4.1 2.875 5.018
4 -3.7 2.289 3.732
5 -3.6 4.313 6.113
Refine time 11.858
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.4 0.000 0.000
2 -5.7 0.853 1.697
3 -5.0 2.074 2.759
4 -4.4 5.210 7.187
5 -4.1 4.804 6.798
6 -3.9 8.163 9.391
7 -3.8 8.047 9.344
8 -3.7 3.877 5.947
9 -3.6 7.624 8.890
Refine time 11.817
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.8 0.000 0.000
2 -6.5 1.645 1.984
3 -5.2 3.516 7.885
4 -4.7 2.297 2.809
5 -4.5 3.515 7.783
6 -4.3 4.411 6.442
7 -4.1 3.623 7.777
8 -4.0 3.549 7.870
9 -4.0 7.143 8.364
Refine time 12.829
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.3 0.000 0.000
2 -5.1 2.145 2.485
3 -4.7 5.468 6.890
4 -4.5 3.190 5.280
5 -4.4 3.862 5.724
6 -4.0 3.690 5.439
7 -3.6 4.824 6.870
8 -3.6 5.174 7.539
9 -3.4 5.748 7.555
Refine time 11.488
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.8 0.000 0.000
2 -6.2 1.407 2.096
3 -5.8 0.919 1.550
4 -4.8 3.586 8.107
5 -4.5 3.176 7.297
6 -4.2 3.263 5.653
7 -4.0 8.095 9.449
Refine time 11.948
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.6 0.000 0.000
2 -5.3 3.436 7.643
3 -5.2 3.142 7.597
4 -5.0 3.614 8.092
5 -4.7 4.293 7.210
6 -4.6 3.402 7.598
7 -4.6 4.817 7.360
8 -4.5 2.843 7.303
9 -4.5 4.464 8.411
Refine time 12.134
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.0 0.000 0.000
2 -5.6 2.847 7.464
3 -5.6 3.379 7.947
4 -5.4 1.687 2.743
5 -4.9 4.924 7.305
6 -4.8 4.097 8.664
7 -4.6 3.247 7.733
8 -4.4 2.870 7.333
9 -4.2 6.987 8.341
Refine time 11.037
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.9 0.000 0.000
2 -5.8 3.356 7.974
3 -5.6 3.444 7.782
4 -5.3 3.180 7.673
5 -5.1 1.839 2.329
6 -5.1 3.874 7.900
7 -5.0 3.792 8.250
8 -5.0 4.176 7.200
9 -4.6 3.088 7.109
Refine time 10.830
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.9 0.000 0.000
2 -5.9 1.873 2.456
3 -5.9 2.736 7.260
4 -5.8 2.875 4.898
5 -5.6 1.514 2.173
6 -5.5 1.306 1.928
7 -5.4 2.903 7.145
8 -5.2 2.662 6.754
9 -4.8 4.631 6.788
Refine time 11.063
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -5.8 0.000 0.000
2 -5.7 3.193 7.879
3 -5.6 2.008 2.519
4 -5.4 1.817 2.477
5 -5.1 2.540 4.830
6 -4.9 3.336 7.243
7 -4.9 4.218 6.669
8 -4.7 3.703 5.466
9 -4.7 4.511 6.265
Refine time 10.760
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -7.5 0.000 0.000
2 -7.4 3.069 8.903
3 -7.3 1.460 2.155
4 -7.3 3.196 9.082
5 -7.3 1.264 1.755
6 -6.8 3.609 9.044
7 -6.3 2.979 8.871
8 -5.6 2.063 2.826
9 -5.6 4.362 8.302
Refine time 13.736
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -7.3 0.000 0.000
2 -7.2 3.087 8.602
3 -6.8 3.333 4.691
4 -6.7 3.353 4.444
5 -6.5 3.247 4.993
6 -6.4 3.517 5.365
7 -6.4 2.708 8.566
8 -6.4 2.299 3.012
9 -6.2 3.778 8.728
Refine time 15.729
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -7.1 0.000 0.000
2 -7.0 1.462 1.829
3 -6.8 4.202 5.849
4 -6.8 2.874 8.696
5 -6.6 3.292 4.423
6 -6.3 3.568 8.710
7 -6.3 3.063 3.589
8 -6.1 3.797 8.361
9 -5.7 3.835 5.329
Refine time 15.496
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -7.2 0.000 0.000
2 -7.1 1.337 2.098
3 -6.9 2.851 8.675
4 -6.6 2.696 8.442
5 -6.5 3.291 4.634
6 -6.5 3.684 8.290
7 -6.4 1.369 1.904
8 -6.4 1.745 2.343
9 -6.2 3.608 8.574
Refine time 13.342
Using random seed: 0
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -7.5 0.000 0.000
2 -7.4 1.950 3.279
3 -7.3 1.431 2.036
4 -7.0 3.596 8.083
5 -6.6 3.353 8.189
6 -6.6 2.766 8.477
7 -6.5 3.140 7.997
8 -6.3 3.157 8.683
9 -6.3 3.005 7.988
Refine time 13.318
Loop time 516.905
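The log above is long, and the top-ranked affinity of each run is usually the number of interest. The sketch below pulls the mode 1 affinity out of each results table with a regular expression. It assumes the row layout shown in the log (mode, affinity, two RMSD columns) and that runs are reported in input order; the function name `best_affinities` is hypothetical, and the sample text reuses the first rows of the first two runs.

```python
import re

# A results row: mode number, affinity, rmsd lower bound, rmsd upper bound.
ROW = re.compile(
    r"^\s*(\d+)\s+(-?\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)\s*$"
)


def best_affinities(log_text):
    """Return the mode 1 affinity (kcal/mol) of every docking run in the log."""
    best = []
    for line in log_text.splitlines():
        match = ROW.match(line)
        if match and int(match.group(1)) == 1:
            best.append(float(match.group(2)))
    return best


sample = """\
mode | affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -6.6 0.000 0.000
2 -6.4 1.691 6.549
Refine time 7.423
Using random seed: 0
mode | affinity | dist from best mode
-----+------------+----------+----------
1 -6.7 0.000 0.000
2 -6.3 1.876 6.678
Refine time 7.661
"""
print(best_affinities(sample))  # [-6.6, -6.7]
```

The same values are stored more precisely as minimizedAffinity in the docked SDF, so this is mainly useful for a quick look at a saved log.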
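Flexible docking: the side chains listed with --flexres around the active site are treated as flexible. As an alternative to listing residues, smina's --flexdist option sets all side chains within a specified distance of the --flexdist_ligand ligand to flexible. This will take an order of magnitude longer. The command is currently disabled; to enable it, remove the leading #.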
#!'/usr/local/bin/smina.osx' --cpu 10 --seed 0 --autobox_ligand '{LigandFromProtein}' --autobox_add 5 -r '{ProteinForDocking}' --flexres A:124,A:132,A:143,A:147,A:311,A:312,A:434,A:435 -l '{ConfoutputFilePath}' -o '{FlexibleDockedFilePath}'
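The rigid and flexible smina calls differ only in two options, so they can be assembled from one helper instead of being edited by hand. The sketch below is a hypothetical convenience function, `smina_command`, that uses only the flags already shown in this notebook (--cpu, --seed, --autobox_ligand, --autobox_add, -r, --flexres, -l, -o). It builds the argument list and does not run anything.

```python
def smina_command(receptor, ligands, autobox_ligand, out_path,
                  smina="/usr/local/bin/smina.osx", cpu=10, seed=0,
                  flexres=None, autobox_add=None):
    """Build the smina argument list; flexres is a list such as ['A:124', 'A:132']."""
    cmd = [smina, "--cpu", str(cpu), "--seed", str(seed),
           "--autobox_ligand", autobox_ligand]
    if autobox_add is not None:
        cmd += ["--autobox_add", str(autobox_add)]
    cmd += ["-r", receptor]
    if flexres:
        cmd += ["--flexres", ",".join(flexres)]
    cmd += ["-l", ligands, "-o", out_path]
    return cmd


rigid = smina_command("protein_minus_ligand.pdb", "asinexSelectionForDocking.sdf",
                      "373ligand_only.pdb", "All_Docked.sdf.gz")
flexible = smina_command("protein_minus_ligand.pdb", "asinexSelectionForDocking.sdf",
                         "373ligand_only.pdb", "FlexDocked.sdf.gz",
                         flexres=["A:124", "A:132", "A:143", "A:147"], autobox_add=5)
print(" ".join(rigid))
print(" ".join(flexible))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with file names that contain spaces.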
Optional: rescore using the random forest model described in https://www.nature.com/articles/srep46710
Download it from https://github.com/oddt/rfscorevs; you will need the path to the binary.
TargetProtein is the path to the protein containing the ligand, in PDB format (here protein_plus_373ligand from Diamond).
scoreResults is the file in which to store the rescored results.
TargetProtein = 'protein_plus_373ligand.pdb'
scoreResults = 'DockedRescored.csv'
!/usr/local/bin/rf-score-vs --receptor '{TargetProtein}' '{DockedFilePath}' -o csv -O '{scoreResults}' --field name --field RFScoreVS_v2
docked_df = PandasTools.LoadSDF(DockedFilePath,molColName='Molecule', removeHs=False)
docked_df.head(n=5)
| | minimizedAffinity | ID | Molecule |
|---|---|---|---|
| 0 | -6.63499 | ASN 13983556 | |
| 1 | -6.39234 | ASN 13983556 | |
| 2 | -5.19878 | ASN 13983556 | |
| 3 | -5.12981 | ASN 13983556 | |
| 4 | -4.96488 | ASN 13983556 | |
scores_df = pd.read_csv(scoreResults)
#scores_df.head(n=5)
results_df = pd.concat([docked_df, scores_df], axis=1)
results_df.head(5)
| | minimizedAffinity | ID | Molecule | name | RFScoreVS_v2 |
|---|---|---|---|---|---|
| 0 | -6.63499 | ASN 13983556 | | ASN 13983556 | 6.128915 |
| 1 | -6.39234 | ASN 13983556 | | ASN 13983556 | 6.115107 |
| 2 | -5.19878 | ASN 13983556 | | ASN 13983556 | 6.067220 |
| 3 | -5.12981 | ASN 13983556 | | ASN 13983556 | 6.128303 |
| 4 | -4.96488 | ASN 13983556 | | ASN 13983556 | 6.093961 |
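Joining the two tables with pd.concat(axis=1) pairs rows by index, so it is only correct if the rescoring tool wrote one row per docked pose in the same order. The ID and name columns give a cheap way to verify that. The sketch below is a minimal check in plain Python; the function name `misaligned_rows` and the example values are hypothetical.

```python
def misaligned_rows(docked_ids, scored_names):
    """Return the positions where the docked ID and the rescored name disagree."""
    docked_ids = list(docked_ids)
    scored_names = list(scored_names)
    if len(docked_ids) != len(scored_names):
        raise ValueError(
            "row counts differ: %d docked vs %d rescored"
            % (len(docked_ids), len(scored_names))
        )
    return [i for i, (a, b) in enumerate(zip(docked_ids, scored_names)) if a != b]


# Hypothetical example in which the second row is out of step.
docked = ["ASN 13983556", "ASN 13983556", "AEM 10028511"]
scored = ["ASN 13983556", "AEM 10028511", "AEM 10028511"]
print(misaligned_rows(docked, scored))  # [1]
```

In the notebook this could be called as `misaligned_rows(results_df["ID"], results_df["name"])`; an empty list indicates the rows line up.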
Now combine the rescored file with the docked structure file and export to "Alldata.sdf.gz". This is a big file, so it is exported compressed.
combinedResults = 'Alldata.sdf.gz'
PandasTools.WriteSDF(results_df, combinedResults, molColName="Molecule", idName="ID", properties=list(results_df.columns))
results_df.sort_values(["RFScoreVS_v2"], axis=0, ascending=False, inplace=True) #or sort by scoring function
results_df.head(5)
| | minimizedAffinity | ID | Molecule | name | RFScoreVS_v2 |
|---|---|---|---|---|---|
| 188 | -4.10178 | ASN 10790639 | | ASN 10790639 | 6.274990 |
| 346 | -7.22396 | AEM 10028511 | | AEM 10028511 | 6.253807 |
| 270 | -4.49270 | ART 13967891 | | ART 13967891 | 6.253746 |
| 348 | -6.68619 | AEM 10028511 | | AEM 10028511 | 6.253265 |
| 338 | -7.34026 | AEM 10028511 | | AEM 10028511 | 6.243394 |
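results_df["minimizedAffinity"] = pd.to_numeric(results_df["minimizedAffinity"]) # SD-file properties may load as strings
results_df.sort_values(["minimizedAffinity"], axis=0, ascending=True, inplace=True) # or sort by minimizedAffinity, most negative (best) first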
results_df.head(5)
| | minimizedAffinity | ID | Molecule | name | RFScoreVS_v2 |
|---|---|---|---|---|---|
| 336 | -7.47479 | AEM 10028511 | | AEM 10028511 | 6.205563 |
| 372 | -7.46385 | AEM 10028511 | | AEM 10028511 | 6.171796 |
| 337 | -7.43328 | AEM 10028511 | | AEM 10028511 | 6.175466 |
| 373 | -7.36845 | AEM 10028511 | | AEM 10028511 | 6.203558 |
| 338 | -7.34026 | AEM 10028511 | | AEM 10028511 | 6.243394 |
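One thing to watch when ranking by minimizedAffinity: properties read from an SD file are text, and if the column holds strings the sort is lexicographic rather than numeric (checking `results_df.dtypes` shows which case applies). For single-digit negative values the two orders can happen to agree, but they diverge as soon as a value reaches -10 or becomes positive. The sketch below shows the difference with hypothetical values; `rank_by_affinity` is an illustrative helper, not part of the notebook.

```python
def rank_by_affinity(values):
    """Order affinity strings numerically, most negative (best) first."""
    return sorted(values, key=float)


affinities = ["-7.47479", "-10.2", "-3.9", "0.5"]  # hypothetical values

# Lexicographic, descending: compares character by character.
print(sorted(affinities, reverse=True))  # ['0.5', '-7.47479', '-3.9', '-10.2']

# Numeric, ascending: the strongest predicted binder comes first.
print(rank_by_affinity(affinities))      # ['-10.2', '-7.47479', '-3.9', '0.5']
```

Converting the column to numbers before sorting removes the ambiguity.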
selectedPose = 'selectedpose.sdf'
selectedPoseH = 'selectedposeH.sdf'
PandasTools.WriteSDF(results_df.head(5), selectedPose, molColName="Molecule", idName="ID", properties=list(results_df.columns))
selecteddocked_df = PandasTools.LoadSDF(selectedPose,molColName='Molecule', removeHs=False)
selecteddocked_df
| | minimizedAffinity | name | RFScoreVS_v2 | ID | Molecule |
|---|---|---|---|---|---|
| 0 | -7.47479 | AEM 10028511 | 6.205563 | AEM 10028511 | |
| 1 | -7.46385 | AEM 10028511 | 6.171796 | AEM 10028511 | |
| 2 | -7.43328 | AEM 10028511 | 6.175466 | AEM 10028511 | |
| 3 | -7.36845 | AEM 10028511 | 6.203558 | AEM 10028511 | |
| 4 | -7.34026 | AEM 10028511 | 6.243394 | AEM 10028511 | |
!obabel -isdf 'selectedpose.sdf' -osdf -h -O 'selectedposeH.sdf'
5 molecules converted
mH = Chem.MolFromMolFile(selectedPoseH, removeHs=False) #View first structure, Hydrogens present
mH
You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol
mols = [m for m in Chem.SDMolSupplier(selectedPoseH)]
def DrawDocking(protein, ligand):
    complex_pl = Chem.MolToPDBBlock(protein)
    docked_pdb = Chem.MolToPDBBlock(ligand)
    viewer = py3Dmol.view(width=800, height=800)
    viewer.addModel(complex_pl, 'pdb')
    viewer.addModel(docked_pdb, 'pdb')
    prot = {'resn': ["DMS", "UNL", "SO4", "LIG", "HOH", "Cl"], 'invert': 1}  # define prot as everything except this list
    viewer.setStyle(prot, {'cartoon': {'colorscheme': 'ssPyMol'}})  # colour by secondary structure
    Lig_373 = {'resn': 'LIG'}  # original ligand in the PDB file
    MyLig = {'resn': 'UNL'}  # ligand added from docking
    viewer.addSurface(py3Dmol.VDW, {'opacity': 0.7, 'color': 'white'}, prot)
    # Show selected binding-site residues as sticks
    viewer.setStyle({'resi': '132'}, {'stick': {'colorscheme': 'blueCarbon'}})
    viewer.setStyle({'resi': '147'}, {'stick': {'colorscheme': 'blueCarbon'}})
    viewer.setStyle({'resi': '311'}, {'stick': {'colorscheme': 'blueCarbon'}})
    viewer.setStyle(Lig_373, {'stick': {'colorscheme': 'whiteCarbon', 'radius': .1}})
    viewer.setStyle(MyLig, {'stick': {'colorscheme': 'greenCarbon'}})
    viewer.zoomTo(MyLig)
    return viewer
receptor = Chem.MolFromPDBFile(TargetProtein)
DrawDocking(receptor,mols[4])
<py3Dmol.view at 0x7fb0abbb4cf8>