Vortex script for Accessing ChEMBL web services

ChEMBL is a manually curated chemical database of bioactive molecules. It is maintained by the European Bioinformatics Institute (EBI), of the European Molecular Biology Laboratory (EMBL), based at the Wellcome Trust Genome Campus, Hinxton, UK. The database currently contains over 1.4 million unique structures with associated activity data for 10,579 different targets. It also acts as a repository for Open Access primary screening and medicinal chemistry data directed at neglected diseases.

Whilst the database can be downloaded, the data can also be accessed via a web interface (shown below) and a series of web services.

The currently available web services are:

General Methods

Check API status

Compound Methods

Get compound by ChEMBLID
Get compound by Standard InChiKey
Get list of compounds matching Canonical SMILES
Get list of compounds matching Canonical SMILES using HTTP POST
Get list of compounds containing the substructure represented by a given Canonical SMILES
Get list of compounds containing the substructure represented by a given Canonical SMILES using HTTP POST
Get list of compounds similar to the one represented by a given Canonical SMILES, at a given cutoff percentage
Get list of compounds similar to the one represented by a given Canonical SMILES, at a given cutoff percentage using HTTP POST
Get image of a ChEMBL compound by ChEMBLID
Get individual compound bioactivities
Get alternative compound forms (e.g. parent and salts) of a compound
Get mechanism of action details for compound (where compound is a drug)

Target Methods

Get all targets
Get target by ChEMBLID
Get target by UniProt Accession Identifier
Get individual target bioactivities
Get approved drugs for target

Assay Methods

Get assay by ChEMBLID
Get individual assay bioactivities

We can use these web services to access ChEMBL data from within Vortex. The following scripts illustrate some of the ways to do this.

UniprotID to ChEMBL target information.

When reading interesting results in the literature it is often useful to find out more about a particular target. This script uses the UniProt ID to query ChEMBL via the “Get target by UniProt Accession Identifier” web service and bring back target information. Because we can’t be sure what the column containing the UniProt IDs will be called (e.g. Uniprot ID, uniprot_id, UNIPROTid), the first part of the script pops up a dialog asking the user to select the desired column.

We then construct the query string for the appropriate web service and pull back the data. There is a little error trapping because some UniProt IDs may not be in ChEMBL.
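The query-and-trap step might look something like the sketch below. It is plain Python rather than the original Vortex script. The `targets/uniprot/<accession>.json` path is an assumption modelled on the compound URL pattern used elsewhere on this page, and `build_target_url` and `fetch_target` are hypothetical helper names. The `opener` argument is there only so the function can be exercised without a network connection.

```python
import json

try:  # Python 3
    from urllib.request import urlopen
    from urllib.parse import quote
except ImportError:  # Python 2 / Jython
    from urllib2 import urlopen
    from urllib import quote

# Assumed base URL, taken from the compound example on this page.
BASE_URL = "https://www.ebi.ac.uk/chemblws"


def build_target_url(uniprot_id):
    """Build the (assumed) 'target by UniProt accession' query URL."""
    return "%s/targets/uniprot/%s.json" % (BASE_URL, quote(uniprot_id.strip()))


def fetch_target(uniprot_id, opener=urlopen):
    """Return the parsed JSON for one accession, or None if the lookup fails."""
    try:
        response = opener(build_target_url(uniprot_id))
        return json.loads(response.read())
    except (IOError, ValueError):
        # IOError covers HTTP and network errors (e.g. an ID not in ChEMBL);
        # ValueError covers a reply that is not valid JSON.
        return None
```

Returning `None` rather than raising lets the calling loop leave that row blank and carry on with the next ID.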

The data is returned in JSON format as shown below.

The last part of the script parses the data and populates the table.
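As a rough illustration of the parsing step, the sketch below pulls a chosen set of fields out of a JSON reply. The sample reply, its `target` wrapper key and the field names are all made up for the example and may not match what ChEMBL actually returns; `target_to_values` is a hypothetical helper name.

```python
import json

# Hypothetical reply; the real wrapper key and field names may differ.
SAMPLE = ('{"target": {"chemblId": "CHEMBL0000", '
          '"preferredName": "Example kinase", "organism": "Homo sapiens"}}')


def target_to_values(reply_text, wanted):
    """Return one string per requested field, using '' where a field is absent."""
    record = json.loads(reply_text).get("target", {})
    return [str(record.get(name, "")) for name in wanted]


wanted = ["chemblId", "preferredName", "organism", "description"]
values = target_to_values(SAMPLE, wanted)
```

In a Vortex script each entry in `values` would then be written into the matching column of the current row. Using `.get` with a default means a field missing from the reply gives an empty cell instead of stopping the script.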

The Vortex Script

Getting ChEMBL Target Data

After pulling back the target information associated with a particular UniProt ID we may want to find out more about the compounds that have been tested against this target. The table now contains the ChEMBLID (highlighted in red) for the target, and we can use this to query ChEMBL for all molecules that have been tested against it.

To capture the desired ChEMBL ID we need to know the column and the particular cell containing the ID. To do this we can use an action from the user right-clicking on a cell to capture the contents.

We also capture the text in the “preferred_name” column to use as the label for a new workspace that will contain the results.

We then construct the URL needed to access the web service and then pull back the data.

The data in JSON format looks like this.

The last part of the script parses the data into a CSV string and then creates the column headers.
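One way to build such a CSV string is sketched below in Python 3 (the Jython interpreter inside Vortex may need small changes). The `bioactivities` key and the field names in `FIELDS` are assumptions about the layout of the reply, and `bioactivities_to_csv` is a hypothetical helper name.

```python
import csv
import io

# Assumed field names for each bioactivity record.
FIELDS = ["parent_cmpd_chemblid", "target_name", "bioactivity_type",
          "operator", "value", "units", "assay_description",
          "organism", "reference"]


def bioactivities_to_csv(parsed, fields=FIELDS):
    """Return a CSV string: one header line, then one line per bioactivity."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)  # column headers
    for ba in parsed.get("bioactivities", []):
        # Missing fields become empty cells rather than raising KeyError.
        writer.writerow([str(ba.get(name, "")) for name in fields])
    return buffer.getvalue()
```

Using the `csv` module instead of joining with commas by hand matters here because assay descriptions and references often contain commas, which need to be quoted.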

We then create a new workspace using all the items we created in the script.

The result is shown below: a new workspace showing all molecules that have been assayed against that target.

You need to put this script in the “context” folder, which is inside the “Vortex_Add-ons” folder.

The Vortex Script

ChEMBLID to SMILES script

Whilst the table above contains the textual information associated with an assay, it does not include the chemical structure. This script uses the parent_cmpd_chemblid field and the https://www.ebi.ac.uk/chemblws/compounds/CHEMBL1.json web service to access the chemical data.

The data in JSON format looks like this.

By parsing the data we can pull out the SMILES string and populate the table; Vortex then renders the SMILES to display the structure. It is also possible to modify the script to access the calculated properties and add them to the table.
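The SMILES extraction could be sketched as follows. The URL pattern is the one quoted above; the `compound` wrapper key, the `smiles` field name and the sample reply are assumptions made for illustration, and `compound_url` and `extract_smiles` are hypothetical helper names.

```python
import json


def compound_url(chembl_id):
    """Build the compound lookup URL for one ChEMBL ID."""
    return "https://www.ebi.ac.uk/chemblws/compounds/%s.json" % chembl_id


def extract_smiles(reply_text):
    """Return the SMILES string from a compound reply, or '' if it is absent."""
    record = json.loads(reply_text).get("compound", {})
    return record.get("smiles", "")


# Made-up reply used only to exercise the parser.
sample = '{"compound": {"chemblId": "CHEMBL0000", "smiles": "CCO"}}'
smiles = extract_smiles(sample)
```

The same pattern extends to calculated properties: read further keys from `record` and write each one to its own column.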

The Vortex Script

Getting ChEMBL Compound Data Search

Now that we have a workspace containing all the molecules tested against a particular target, the next step in the analysis might be to select a particularly interesting molecule and see whether ChEMBL holds any more biological data for it.

To capture the desired ChEMBL ID we need to know the column and the particular cell containing the ID. To do this we can use an action from the user right-clicking on a cell to capture the contents.

We also capture the text in the “preferred_name” column to use as the label for a new workspace that will contain the result.

The data is returned in this format and can be parsed to populate a new workspace.

The result is shown below.

The Vortex Script

    # j is assumed to hold the parsed JSON reply from the bioactivities web service
    rows = []
    for ba in j['bioactivities']:
        values = [ba['parent_cmpd_chemblid'], ba['target_name'], ba['bioactivity_type'],
                  ba['operator'], ba['value'], ba['units'], ba['assay_description'],
                  ba['organism'], ba['reference']]
        row = [str(i) for i in values]
        rows.append(row)

The four scripts can be downloaded from here.

These two scripts need to be added to the scripts folder.

ChEMBLid2SMILES.vpy
ChEMBLtargetfromUniprot.vpy

These two scripts need to be stored in the “context” folder, which is inside the “Vortex_Add-ons” folder.

ChEMBLTargetDataV1.vpy
ChEMBLCompoundDataV1.vpy

Page Updated 31 October 2014
