Vortex script using MOE to calculate properties

MOE from Chemical Computing Group is probably best known as a graphical user interface to a suite of computational chemistry tools. Whilst this is undoubtedly how many users interact with the program, it is worth finding out about the command-line tools that are available. These are often called from pipeline tools such as Knime to allow rapid processing of large files. CCG provides four very useful command-line tools:

  • sdwash prepares SD files by carrying out a number of operations on the molecular data field, which include 2D depiction layout, hydrogen correction, salt and solvent removal, chirality and bond type normalization, tautomer generation, adjustment and enumeration of protonation states, and expansion of fragment abbreviations. 
  • sdfilter performs selective filtering of SD files, removing molecules which do not meet certain criteria, such as druglike/leadlike characteristics, or have calculated properties which fall outside of a specified range; e.g., acceptor/donor count, rotatable bonds, molecular weight, log P, etc. 
  • sdsort sorts SD files according to the molecular structure or by a data field, and can remove duplicates (taking tautomers into account) or compute differences between SD files. 
  • sddesc allows the calculation of some or all of the MOE molecular descriptors for each molecular entry, with the results stored in corresponding SD file data fields.

It is the last of these we will be using to add descriptors to a Vortex table.

Typing 

in a Terminal window gives a list of the options available.

Usage:

So while the command is designed to work with SD files, it can also generate ASCII output as either comma-delimited or tab-delimited text. After a little experimentation I found this command gave the desired result

Note you have to include “-ascii” and “-tab”. In the above example I’ve only calculated the molecular weight, but MOE can calculate many more descriptors. For a full list of the 300+ molecular descriptors, both 2D and 3D, available for calculation in MOE, contact Chemical Computing Group through their website, www.chemcomp.com. Extra custom descriptors are very straightforward to code in MOE’s Scientific Vector Language platform. It is important to note that if you submit a 2D structure file, any 3D descriptors calculated from it will not be meaningful.

When I first tried this command in a Vortex script I got no output and a number of cryptic error messages. I then included the full path to sddesc

But I still got no output, and the following error message appeared in the console,

After generous help from Matt, Dotmatics and CCG I worked out what was wrong. It seems that line 3 in $MOE/bin/sddesc is

which will open a MOE/batch session, and “run” $MOE/bin/sddesc as an SVL file, using the arguments that were sent when $MOE/bin/sddesc was launched. The problem is that the program is running in a shell that does not have access to all the environment variables defined in my .bash_profile. We can define the environment variables needed by moebatch thus

The command to run sddesc then becomes
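As a rough sketch of the general idea, not the code from the downloadable script, the environment can be built explicitly in Python and handed to the child process so that nothing depends on .bash_profile. The helper names `build_moe_env` and `run_sddesc` are hypothetical. The MOE install directory is supplied by the caller. Setting only `MOE` and `PATH` is an assumption, since moebatch may need further variables on a given installation.

```python
import os
import subprocess


def build_moe_env(moe_root, base_env=None):
    """Return a copy of the environment with MOE set and MOE's bin directory first on PATH.

    Assumption: MOE and PATH are sufficient; add any other variables your
    installation needs (e.g. licensing) to the returned dictionary.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MOE"] = moe_root
    bin_dir = os.path.join(moe_root, "bin")
    old_path = env.get("PATH", "")
    env["PATH"] = bin_dir + os.pathsep + old_path if old_path else bin_dir
    return env


def run_sddesc(moe_root, args):
    """Run sddesc by its full path with an explicit environment.

    `args` is the list of command-line arguments chosen by the caller,
    for example the "-ascii" and "-tab" options plus descriptor and file
    arguments. Returns (stdout, stderr) from the process.
    """
    cmd = [os.path.join(moe_root, "bin", "sddesc")] + list(args)
    proc = subprocess.Popen(cmd, env=build_moe_env(moe_root),
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.communicate()
```

Passing the environment through `env=` keeps the fix local to the one call that needs it, rather than altering the environment of the whole Vortex session.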

The remainder of the script parses the data, adds columns and headers, and then inserts the data. Again the beauty of this approach is that more descriptors can be added to the list for calculation and they will be automatically added to the Vortex table.
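The parsing step might look something like the following minimal sketch. It assumes the tab-delimited output starts with a header row naming each column, which should be checked against the real sddesc output. The function name `parse_tab_output` is hypothetical, and the Vortex calls that create the columns and insert the values are deliberately left out.

```python
def parse_tab_output(text):
    """Split tab-delimited text into (headers, columns).

    Assumes the first line holds the column names. `columns` is a list of
    lists, one per header and in the same order, so new descriptors appear
    automatically as extra columns.
    """
    lines = text.splitlines()
    if not lines:
        return [], []
    headers = lines[0].split("\t")
    columns = [[] for _ in headers]
    for line in lines[1:]:
        fields = line.split("\t")
        # Pad short rows with empty strings so every column keeps one value
        # per molecule and row order stays aligned with the input file.
        fields += [""] * (len(headers) - len(fields))
        for column, value in zip(columns, fields):
            column.append(value)
    return headers, columns
```

Because the headers are read from the output rather than hard-coded, adding another descriptor to the sddesc command needs no change to the parsing code.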

The Vortex Script

The script can be downloaded here: CCGcolumns.vpy.zip

Last Updated 7 Feb 2012
