Macs in Chemistry

Insanely Great Science

A couple of updates

The point-and-click data analysis tool Wizard Pro has been updated. In particular, this update addresses several issues:

  • Preserve filter selections when switching between tables
  • Correctly parse numbers surrounded by spaces in CSV files 
  • Fix a bug where a blank header cell in an Excel spreadsheet caused subsequent columns not to be imported 
  • Fix some issues with PDF export of model images
  • Fix a crash when stacking tables with indicator variables

The reference management package Bookends has also been updated:

  • Get DOI was updated to deal with changes made by CrossRef
  • This includes dealing with changes made in the way CrossRef encodes accented characters.
  • Updated Import From Existing Bibliography to deal with changes made by CrossRef
  • Updated Bookends browser to detect DOIs on the Google Scholar web site to deal with changes made by Google




Indexing the internet in a chemically intelligent manner

Some time ago I described a Safari extension that uses chemicalize.org to index a web page for chemical content.

For an example of a “chemicalized” page have a look at this

As you can see below, all molecules mentioned in the page become links that reveal the structure on mouse-over. The page also gains a handy ribbon of structures across the top that is useful for quick scanning and navigation.

[Screenshot: a chemicalized web page]

A recent publication by Southan and Stracz, "Extracting and connecting chemical structures from text sources using chemicalize.org", Journal of Cheminformatics 2013, 5:20, describes how this information is being used to provide better indexing of the internet in a chemically intelligent manner. They demonstrate this on a number of web pages and document sources that were indexed in this way, including PDFs from the patent office.

chemicalize.org now has 15,000 unique visitors a month, which is huge growth compared to spring 2012. These users contribute to the database every day, keeping it up to date and extending it to cover new areas of interest. The database today contains 327,000 structures that were converted from 545,000 names and identifiers found on 367,000 web pages.

These structures and links have now been uploaded to PubChem and if you are interested in what sort of molecules have been registered via chemicalize.org you can browse them on the PubChem website here




OSRA Updated

OSRA 2.0.0 is available for download.

OSRA (Optical Structure Recognition Application) is a utility designed to convert graphical representations of chemical structures and reactions, as they appear in journal articles, patent documents, textbooks, trade magazines etc., into SMILES or MOL files. Changes in this release:

  • Significantly improved recognition rates. Full details of the validation are available.
  • Added recognition of Iodine, wavy bonds, etc.
  • Completely modified confidence function (values not compatible with the earlier versions).
  • Updated table detection and removal routines.
  • Created binary package for Linux (statically linked, should work on all modern Linux systems).
  • Windows and OSX versions now support multi-threaded processing of PDF files.

With this release the distribution model has changed a bit: the source code is still available free as before, but to offset the development costs the installable packages for Windows, Mac OSX and Linux are now sold for a fee.




Scripting Vortex, using OpenBabel fastsearch

One thing I’ve needed to do a couple of times recently is get an idea of how many compounds similar to those in the set I’m currently viewing are available. For example, when designing a fragment library it is very useful to know, for a particular fragment, how many similar fragments are commercially available. Likewise, when looking at the results of a high-throughput screen, it helps to know how many analogues of a particular hit were also screened.

To do this we need a way of running a rapid similarity search against the reference database. I use OpenBabel, in particular its fast search capability with molecular fingerprints.
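To make the idea concrete, here is a minimal Python sketch of what a fingerprint-based similarity count boils down to. It is not the Vortex script and it does not call OpenBabel: the fingerprints are toy sets of "on" bit positions, and the names `tanimoto`, `count_similar` and the `ref-*` identifiers are made up for illustration. In practice the fingerprints would come from a toolkit such as OpenBabel, and an index would avoid comparing against every reference compound.

```python
from typing import Dict, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def count_similar(query: Set[int], reference: Dict[str, Set[int]],
                  threshold: float = 0.7) -> int:
    """Count reference compounds whose similarity to the query meets the threshold."""
    return sum(1 for fp in reference.values() if tanimoto(query, fp) >= threshold)


# Toy reference "database" (hypothetical fingerprints, not real molecules)
reference = {
    "ref-1": {1, 2, 3, 4},
    "ref-2": {1, 2, 3, 5},
    "ref-3": {7, 8, 9},
}
query = {1, 2, 3, 4}

n_similar = count_similar(query, reference, threshold=0.5)
print(f"{n_similar} similar compounds found")
```

Running this count once per row of the table being viewed gives a "number of similar compounds" column. The threshold of 0.5 is arbitrary and would need tuning for a real fingerprint type.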

In Scripting Vortex 13 there is a new script to do this.

There are many more scripts and hints here.




International Workshop on OpenCL (IWOCL)

This might be of interest to those involved in developing scientific applications that take advantage of the GPU.

The International Workshop on OpenCL (IWOCL) is an annual meeting of vendors, researchers and developers to promote the evolution and advancement of the OpenCL standard. The meeting is open to anyone who is interested in contributing to, and participating in, the OpenCL community. IWOCL is the premier forum for the presentation and discussion of new designs, trends, algorithms, programming models, software, tools and ideas for OpenCL. Additionally, IWOCL provides a formal channel for community feedback to OpenCL promoters and contributors.

May 13-14, 2013, Georgia Institute of Technology; more details here.


