A while back I published two scripts that use UniChem, a web resource provided by the EBI. UniChem is a ‘Unified Chemical Identifier’ system, designed to assist in the rapid cross-referencing of chemical structures, and their identifiers, between multiple databases.
Chambers, J., Davies, M., Gaulton, A., Hersey, A., Velankar, S., Petryszak, R., Hastings, J., Bellis, L., McGlinchey, S. and Overington, J.P. UniChem: A Unified Chemical Structure Cross-Referencing and Identifier Tracking System. Journal of Cheminformatics 2013, 5:3 (January 2013). DOI: http://dx.doi.org/10.1186/1758-2946-5-3
The first script uses the ChEMBL ID to search for other identifiers; the second script allows more flexible searching using any of the identifiers available within UniChem. One of the identifiers returned is from the PDBe (Protein Data Bank in Europe) and represents the ID of the ligand in the PDB. Whilst this is interesting, it would also be very useful to have the identity of the crystal structures that contain the ligand. Fortunately, PDBe provide a series of web services that can be used to interrogate the database, together with a really useful page to help build the calls.
For our needs the query format is:

```
http://www.ebi.ac.uk/pdbe/api/pdb/compound/in_pdb/0D1
```
and the data is returned as:

```
{"0D1":["3uuc"]}
```
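Outside Vortex, the same call can be sketched in plain Python 3. This is only an illustration: the helper names `build_query_url` and `fetch_pdb_codes` are my own, and it assumes the service still answers at the URL quoted above and returns JSON keyed by the ligand ID.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Endpoint copied from the query format shown above.
BASE_URL = "http://www.ebi.ac.uk/pdbe/api/pdb/compound/in_pdb/"


def build_query_url(ligand_id):
    """Return the in_pdb query URL for one PDB ligand ID (hypothetical helper)."""
    return BASE_URL + quote(ligand_id.strip(), safe="")


def fetch_pdb_codes(ligand_id, timeout=30):
    """Query the service and return the list of PDB codes for the ligand.

    Needs network access. Assumes the response is a JSON object keyed by
    the ligand ID, as in the example response above.
    """
    with urlopen(build_query_url(ligand_id), timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    # An ID missing from the response gives an empty list rather than an error.
    return payload.get(ligand_id.strip(), [])


print(build_query_url("0D1"))
# prints: http://www.ebi.ac.uk/pdbe/api/pdb/compound/in_pdb/0D1
```

Quoting the ID guards against stray spaces or odd characters in a workspace cell ending up in the URL.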
The first part of the script creates a dialog box allowing the user to identify the column containing the PDB ligand ID, then we work through the rows in the workspace to generate the query string for each one. This is then submitted to the web service, which returns the data.
The data is returned in JSON format, as shown above. Since a ligand may be present in many PDB structures, the result can consist of a list of PDB codes. We use “join” to convert this to a comma-separated list, then create and populate a new column.
```python
j = json.loads(molecule_record)
pdbstring = ', '.join(j[PDBligand_id])
colPDBID = vtable.findColumnWithName('PDB_ID', 1)
colPDBID.setValueFromString(r, pdbstring)
```
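The parse-and-join step can be tried on its own using the example response from earlier. This is a minimal sketch in standard Python 3 rather than Jython. The function name `pdb_codes_as_text` is hypothetical, and returning an empty string for a missing ID is my own choice, not something the Vortex script does.

```python
import json


def pdb_codes_as_text(response_text, ligand_id):
    """Turn the JSON response into a comma-separated string of PDB codes."""
    record = json.loads(response_text)
    # Assumption: if the ligand ID is not a key in the response,
    # return an empty string instead of raising KeyError.
    codes = record.get(ligand_id, [])
    return ", ".join(codes)


print(pdb_codes_as_text('{"0D1":["3uuc"]}', "0D1"))
# prints: 3uuc
```

A ligand found in several structures would come back as a longer list, and the same join would give one cell value with the codes separated by commas.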
The Vortex Script
```python
#Script to get a list of PDB entries that contain the compound defined in the PDB Chemical Component Dictionary, (from unichem search)

# Python imports
import urllib2
import urllib
from com.xhaus.jyson import JysonCodec as json

# Vortex imports
import com.dotmatics.vortex.util.Util as Util
import com.dotmatics.vortex.mol2img.jni.genImage as genImage
import com.dotmatics.vortex.mol2img.Mol2Img as mol2Img
import jarray
import binascii
import string
import os

input_label = swing.JLabel("PDB (for input)")
input_cb = workspace.getColumnComboBox()

panel = swing.JPanel()
layout.fill(panel, input_label, 0, 0)
layout.fill(panel, input_cb, 1, 0)

ret = vortex.showInDialog(panel, "Choose PDB ligand column")

if ret == vortex.OK:
    input_idx = input_cb.getSelectedIndex()
    if input_idx == 0:
        vortex.alert("you must choose a column")
    else:
        chosen_col = vtable.getColumn(input_idx - 1)
        rows = vtable.getRealRowCount()

        for r in range(0, int(rows)):
            PDBligand_id = chosen_col.getValueAsString(r)
            #http://www.ebi.ac.uk/pdbe/api/pdb/compound/in_pdb/0D1
            api_url = 'http://www.ebi.ac.uk/pdbe/api/pdb/compound/in_pdb/%s' % PDBligand_id
            try:
                molecule_record = urllib2.urlopen(api_url).read()
            except urllib2.HTTPError:
                continue

            j = json.loads(molecule_record)
            pdbstring = ', '.join(j[PDBligand_id])
            colPDBID = vtable.findColumnWithName('PDB_ID', 1)
            colPDBID.setValueFromString(r, pdbstring)

        vtable.fireTableStructureChanged()
```
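For anyone wanting to test the row-by-row logic without Vortex, here is a rough Python 3 sketch of the loop. It is not the script above: `annotate_rows` is a hypothetical helper, the lookup is passed in as a function so it can be swapped for a stub, and a failed lookup leaves an empty string where the Vortex script simply skips the row.

```python
def annotate_rows(ligand_ids, fetch_codes):
    """Return one comma-separated PDB code string per input ligand ID.

    fetch_codes is any callable that takes a ligand ID and returns a list
    of PDB codes, e.g. a function that calls the PDBe web service.
    """
    results = []
    for ligand_id in ligand_ids:
        try:
            codes = fetch_codes(ligand_id)
        except (OSError, ValueError, KeyError):
            # Network/HTTP errors (OSError), bad JSON (ValueError) or an
            # unknown ID (KeyError): leave this row's value empty.
            codes = []
        results.append(", ".join(codes))
    return results


# Stub lookup built from the example response shown earlier.
sample = {"0D1": ["3uuc"]}


def stub_lookup(ligand_id):
    return sample[ligand_id]


print(annotate_rows(["0D1", "not-a-ligand"], stub_lookup))
# prints: ['3uuc', '']
```

Keeping the lookup separate from the loop means the join and error handling can be checked offline before pointing it at the live service.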
The script can be downloaded from here
Last Updated 31 May 2017