MOE from Chemical Computing Group (CCG) is probably best known as a graphical user interface to a suite of computational chemistry tools. Whilst this is undoubtedly how many users will interact with the program, it is worth finding out about the command-line tools that are also available. These tools are often called from pipelining tools such as Knime to allow rapid processing of large files. CCG provides four very useful command-line tools:

sdwash prepares SD files by carrying out a number of operations on the molecular data field, which include 2D depiction layout, hydrogen correction, salt and solvent removal, chirality and bond type normalization, tautomer generation, adjustment and enumeration of protonation states, and expansion of fragment abbreviations.
sdfilter performs selective filtering of SD files, removing molecules which do not meet certain criteria, such as druglike/leadlike characteristics, or have calculated properties which fall outside of a specified range; e.g., acceptor/donor count, rotatable bonds, molecular weight, log P, etc.
sdsort sorts SD files according to the molecular structure or by a data field, and can remove duplicates (taking tautomers into account) or compute differences between SD files.
sddesc allows the calculation of some or all of the MOE molecular descriptors for each molecular entry, with the results stored in corresponding SD file data fields.

It is the last of these we will be using to add descriptors to a Vortex table.

Typing

sddesc -help

in a Terminal window gives a list of the options available.

Usage:

sddesc [options...] [infiles...] [-o outfile]

infile                     name of input file  (- for stdin)
outfile                    name of output file (- for stdout, . for null)

Options:
-help                      prints helpful information
-verbose                   enable information printing
-quiet                     disable information printing
-records    range          process only given range of records
-sdf                       output SD file (default)
-ascii                     output ascii comma separated files with SMILES
-keepfield  field          SD field to transfer to ASCII output file
-comma                     comma/quote separated ASCII output (default)
-tab                       tab separated ASCII output
-calc       code_list      calculate descriptors (comma separated)
-nocalc     skip_list      skip a set of descriptors (comma separated)
-class      class          calculate descriptors in class
-forcefield filename       use given forcefield file for 3D descriptors

Range Syntax:
range = n                  equal to n
range = n-                 less than or equal to n
range = n+                 greater than or equal to n
range = n,m                n through m (inclusive)

So although the command is designed to work with SD files, it can also generate ASCII output as either comma-delimited or tab-delimited text. After a little experimentation I found that this command gave the desired result:

sddesc -ascii -tab -calc Weight /Users/username/Desktop/ChemicalStructures/acetophenones.sdf

Note that you have to include both “-ascii” and “-tab”. In the above example I’ve only calculated the molecular weight, but MOE can calculate many, many more descriptors. For a full list of the 300+ molecular descriptors, both 2D and 3D, available for calculation in MOE, contact Chemical Computing Group through their website, www.chemcomp.com. Extra, custom descriptors are very straightforward to code up in MOE’s Scientific Vector Language platform. It is important to note that if you submit a 2D structure file to the calculations, any 3D descriptors generated will not be meaningful.
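Since the descriptor codes are passed as a single comma-separated argument, the command line is easy to assemble from a list. Here is a minimal sketch that uses only flags shown in the help output above; the function name build_sddesc_command and its parameters are my own hypothetical names, not part of MOE.

```python
def build_sddesc_command(sddesc_path, descriptors, sdf_file, tab=True):
    """Return the argument list for an sddesc run that writes ASCII output.

    Hypothetical helper: only the flags (-ascii, -tab, -comma, -calc) come
    from the sddesc help text; the function itself is illustrative.
    """
    if not descriptors:
        raise ValueError("at least one descriptor code is required")
    separator_flag = '-tab' if tab else '-comma'
    # -calc takes one comma-separated list, so join the codes without spaces
    return [sddesc_path, '-ascii', separator_flag,
            '-calc', ','.join(descriptors), sdf_file]


cmd = build_sddesc_command('/Applications/moe2011/bin/sddesc',
                           ['Weight'],
                           'acetophenones.sdf')
print(' '.join(cmd))
```

Returning a list rather than a single string means the result can be handed straight to subprocess without any shell quoting.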

When I first tried this command in a Vortex script I got no output and a number of cryptic error messages. I then included the full path to sddesc:

 /Applications/moe2011/bin/sddesc -ascii -tab -calc Weight /Users/username/Desktop/ChemicalStructures/acetophenones.sdf

This still gave no output, and the following error message appeared in the console:

Vortex: /Applications/moe2011/bin/sddesc: line 3: /bin/moebatch: No such file or directory

After generous help from Matt, Dotmatics and CCG I worked out what was wrong. It seems that line 3 in $MOE/bin/sddesc is

$MOE/bin/moebatch -run $0 $*

which opens a MOE/batch session and “runs” $MOE/bin/sddesc as an SVL file, using the arguments that were passed when $MOE/bin/sddesc was launched. The problem is that Vortex runs the program in a shell that does not have access to the environment variables defined in my .bash_profile, so $MOE expands to nothing and the shell looks for /bin/moebatch, exactly as the error message shows. We can define the environment variables needed by moebatch thus:

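# Work on a copy so the environment of Vortex itself is left untouched
my_env = dict(os.environ)
# PATH entries must be separated by os.pathsep (':' on OS X)
my_env["PATH"] = '/Applications/moe2011/bin' + os.pathsep + my_env.get('PATH', '')
my_env["MOE"] = '/Applications/moe2011'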
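To see why an unset variable produces the path in the error message, the expansion can be reproduced without MOE installed. This is a minimal sketch, assuming a Unix-like system with /bin/sh; expand_in_child is a hypothetical helper of my own, and the child shell only echoes the path rather than running anything.

```python
import subprocess


def expand_in_child(env):
    """Ask a child shell to expand $MOE/bin/moebatch using only `env`."""
    p = subprocess.Popen(['/bin/sh', '-c', 'echo "$MOE/bin/moebatch"'],
                         stdout=subprocess.PIPE, env=env)
    out = p.communicate()[0]
    return out.decode().strip()


# Without MOE in the environment the variable expands to an empty string
bare_env = {'PATH': '/usr/bin:/bin'}
print(expand_in_child(bare_env))   # prints /bin/moebatch

# With MOE defined, the child shell sees the intended location
moe_env = dict(bare_env, MOE='/Applications/moe2011')
print(expand_in_child(moe_env))    # prints /Applications/moe2011/bin/moebatch
```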

The command to run sddesc then becomes

p = subprocess.Popen(['/Applications/moe2011/bin/sddesc', '-ascii', '-tab', '-calc', 'Weight,SlogP,mr,TPSA', sdfFile], stdout=subprocess.PIPE, env=my_env)
output = p.communicate()[0]

The remainder of the script parses the data, adds columns and headers, and then inserts the data. Again, the beauty of this approach is that more descriptors can be added to the list for calculation and they will automatically be added to the Vortex table.
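The parsing step can be tried outside Vortex. This is a minimal sketch, assuming the output has one header line of column names followed by one tab-delimited line per molecule, which is what the script below relies on. parse_tab_output is my own hypothetical helper, and the sample text is made up rather than real sddesc output.

```python
def parse_tab_output(output):
    """Split tab-delimited text into (column_names, rows).

    Blank lines (such as the one left by a trailing newline) are skipped,
    and carriage returns are stripped in case of Windows line endings.
    """
    lines = [ln.rstrip('\r') for ln in output.split('\n') if ln.strip()]
    if not lines:
        return [], []
    col_names = lines[0].split('\t')
    rows = [ln.split('\t') for ln in lines[1:]]
    return col_names, rows


# Made-up sample text in the assumed layout; not real sddesc output
sample = "SMILES\tWeight\nCC(=O)c1ccccc1\t120.15\n"
names, rows = parse_tab_output(sample)
print(names)
print(rows)
```

Skipping blank lines up front avoids treating the empty string after the final newline as an extra row.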

The Vortex Script

import sys

# Uncomment the following 2 lines if running in console
#vortex = console.vortex
#vtable = console.vtable

sys.path.append(vortex.getVortexFolder() + '/modules/jythonlib')

import subprocess
import os

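# Work on a copy so the environment of Vortex itself is left untouched
my_env = dict(os.environ)
# PATH entries must be separated by os.pathsep (':' on OS X)
my_env["PATH"] = '/Applications/moe2011/bin' + os.pathsep + my_env.get('PATH', '')
my_env["MOE"] = '/Applications/moe2011'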

# Get the path to the currently open sdf file
sdfFile = vortex.getFileForPropertyCalculation(vtable)

# Run sddesc on the file
#  /Applications/moe2011/bin/sddesc -ascii -tab -calc Weight /Users/swain/Desktop/ChemicalStructures/acetophenones.sdf 
p = subprocess.Popen(['/Applications/moe2011/bin/sddesc', '-ascii', '-tab', '-calc', 'Weight,SlogP,mr,TPSA', sdfFile], stdout=subprocess.PIPE, env=my_env)
output = p.communicate()[0]

# Create new columns in table if needed
lines = output.split('\n')
# The first line of the output holds the column names
colName = lines[0].split('\t')
for c in colName:
    column = vtable.findColumnWithName(c, 1)
    vtable.fireTableStructureChanged()

# Collect the first field of every two-column line (not used further below)
keys = []
for i in lines:
    words = i.split('\t')
    if len(words) == 2:
        keys.append(words[0])

# Parse the output, skipping the header line
rows = lines[1:]
# Guard against the output having fewer data lines than the table has rows
for r in range(0, min(vtable.getRealRowCount(), len(rows))):
    vals = rows[r].split('\t')
    for j in range(0, min(len(vals), len(colName))):
        column = vtable.findColumnWithName(colName[j], 0)
        column.setValueFromString(r, vals[j])

The script can be downloaded from here CCGcolumns.vpy.zip

Last Updated 7 Feb 2012

Vortex script using MOE to calculate properties
