UniChem is a new web resource provided by the EBI. It is a ‘Unified Chemical Identifier’ system, designed to assist in the rapid cross-referencing of chemical structures, and their identifiers, between databases. UniChem currently contains data from 19 different databases:
- ChEMBL
- DrugBank
- PDBe (Protein Data Bank Europe)
- International Union of Basic and Clinical Pharmacology
- PubChem (‘Drugs of the Future’ subset)
- KEGG (Kyoto Encyclopedia of Genes and Genomes) Ligand
- ChEBI (Chemical Entities of Biological Interest)
- NIH Clinical Collection
- ZINC
- eMolecules
- IBM strategic IP insight platform and the National Institutes of Health
- Gene Expression Atlas
- IBM strategic IP insight platform and the National Institutes of Health
- FDA/USP Substance Registration System (SRS)
- SureChem
- PharmGKB
- Human Metabolome Database (HMDB)
- Selleck
- PubChem (‘Thomson Pharma’ subset)
UniChem’s primary function is to maintain cross-references between EBI chemistry resources. These include primary chemistry resources (ChEMBL, ChEBI and PDBeChem), and other resources where the main focus is not small molecules, but which may nevertheless contain some small molecule information (e.g. Gene Expression Atlas).
Chambers, J., Davies, M., Gaulton, A., Hersey, A., Velankar, S., Petryszak, R., Hastings, J., Bellis, L., McGlinchey, S. and Overington, J.P. UniChem: A Unified Chemical Structure Cross-Referencing and Identifier Tracking System. Journal of Cheminformatics 2013, 5:3 (January 2013). DOI.
Searches can use the source compound ID, the InChI or the InChI Key, and the search URL has the following format:
```
https://www.ebi.ac.uk/unichem/frontpage/results?queryText=SMWDFEZZVXVKRB-UHFFFAOYSA-N&kind=InChIKey&sources=&incl=exclude
```
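To make the URL format concrete, here is a minimal sketch that assembles the same address from its parts. It is written in Python rather than AppleScript so it can be tried on any machine. The function name `unichem_search_url` is hypothetical, the parameter names (`queryText`, `kind`, `sources`, `incl`) are simply those visible in the example URL, and only `kind=InChIKey` is confirmed by that example; any other value passed for `kind` is an assumption.

```python
from urllib.parse import urlencode

# Base address and parameter names are copied from the example URL in the post.
UNICHEM_RESULTS = "https://www.ebi.ac.uk/unichem/frontpage/results"


def unichem_search_url(query_text, kind="InChIKey", sources="", incl="exclude"):
    """Build a UniChem results-page URL (hypothetical helper, not an official API)."""
    params = [
        ("queryText", query_text),  # the identifier to search for
        ("kind", kind),             # only "InChIKey" is confirmed by the example URL
        ("sources", sources),       # left empty, as in the example URL
        ("incl", incl),             # "exclude", as in the example URL
    ]
    # urlencode percent-encodes any characters that are not URL-safe.
    return UNICHEM_RESULTS + "?" + urlencode(params)


if __name__ == "__main__":
    print(unichem_search_url("SMWDFEZZVXVKRB-UHFFFAOYSA-N"))
```

An InChI Key contains only upper-case letters and hyphens, so it passes through the encoding step unchanged and the result is identical to the example URL.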
Since ChemBioDraw can generate InChI Keys, I thought it might be interesting to write an AppleScript that accesses this service. The InChIKey is a short, fixed-length character signature based on a hash code of the InChI string. By definition, hashing is a one-way conversion procedure and the original structure cannot be restored from the InChIKey, allowing confidential searching.
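Because the key has a fixed length, it is easy to check that a piece of text at least looks like an InChIKey before sending it to a search service. The sketch below is in Python rather than AppleScript and assumes the standard 27-character layout: 14 upper-case letters, a hyphen, 10 upper-case letters, a hyphen, and one final letter. The function name `looks_like_inchikey` is hypothetical. The check only tests the shape of the string; it cannot tell whether the key corresponds to a real structure.

```python
import re

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(text):
    """Return True if text has the shape of a standard InChIKey (hypothetical helper)."""
    # Surrounding whitespace is ignored, since clipboard text often carries a newline.
    return INCHIKEY_PATTERN.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    print(looks_like_inchikey("SMWDFEZZVXVKRB-UHFFFAOYSA-N"))
    print(looks_like_inchikey("CCO"))
```

A check like this could be applied to the clipboard contents to catch the case where something other than an InChI Key was copied.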
The Script
The script is shown below. The first part scripts the menu and submenus of ChemBioDraw to get the InChI Key; we then construct the URL and open it in the default web browser.
```applescript
-- Example search URL:
-- https://www.ebi.ac.uk/unichem/frontpage/results?queryText=SMWDFEZZVXVKRB-UHFFFAOYSA-N&kind=InChIKey&sources=&incl=exclude

set theInchikey to ""

tell application "CS ChemBioDraw Ultra"
	activate
	-- If nothing is selected, select the whole drawing first
	if not (enabled of menu item "Copy") then do menu item "Select All" of menu "Edit"
	-- Copy the selection to the clipboard as an InChI Key
	if enabled of menu item "Copy" then
		do menu item "InChI Key" of menu "Copy As" of menu "Edit"
		set theInchikey to the clipboard
	end if
	--display dialog theInchikey
end tell

-- Stop here if there was nothing to copy (for example, an empty document)
if theInchikey is "" then return

set the_encode_theInchikey to encode_text(theInchikey, true, false)
--display dialog the_encode_theInchikey

set unichem_url to "https://www.ebi.ac.uk/unichem/frontpage/results?queryText=" & the_encode_theInchikey & "&kind=InChIKey&sources=&incl=exclude"

-- Pressing "Cancel" stops the script, so only "Search" needs handling
display dialog "Search UniChem" buttons {"Search", "Cancel"} default button 1
set the button_pressed to the button returned of the result
if the button_pressed is "Search" then
	-- Open the results page in the default web browser
	open location unichem_url
end if

-- Percent-encode every character of this_text that is not in the acceptable set
on encode_text(this_text, encode_URL_A, encode_URL_B)
	set the standard_characters to "abcdefghijklmnopqrstuvwxyz0123456789"
	set the URL_A_chars to "$+!'/?;&@=#%><{}[]\"~`^\\|*"
	set the URL_B_chars to ".-_:"
	set the acceptable_characters to the standard_characters
	if encode_URL_A is false then set the acceptable_characters to the acceptable_characters & the URL_A_chars
	if encode_URL_B is false then set the acceptable_characters to the acceptable_characters & the URL_B_chars
	set the encoded_text to ""
	repeat with this_char in this_text
		if this_char is in the acceptable_characters then
			set the encoded_text to (the encoded_text & this_char)
		else
			set the encoded_text to (the encoded_text & encode_char(this_char)) as string
		end if
	end repeat
	return the encoded_text
end encode_text

-- Convert a single character to its %XX hexadecimal form
on encode_char(this_char)
	set the ASCII_num to (the ASCII number this_char)
	set the hex_list to {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F"}
	set x to item ((ASCII_num div 16) + 1) of the hex_list
	set y to item ((ASCII_num mod 16) + 1) of the hex_list
	return ("%" & x & y) as string
end encode_char
```
The script can be downloaded from here