Macs in Chemistry

Insanely Great Science

Indexing the internet in a chemically intelligent manner

Some time ago I described a Safari extension that uses chemicalize.org to index a web page for chemical content.

For an example of a “chemicalized” page have a look at this

As you can see below, all molecules mentioned in the page become links that reveal the structure on mouse-over. The page also gains a handy ribbon of structures across the top that is useful for quick scanning and navigation.

screnn1

A recent publication by Southan and Stracz, "Extracting and connecting chemical structures from text sources using chemicalize.org" (Journal of Cheminformatics 2013, 5:20), describes how this information is being used to provide better indexing of the internet in a chemically intelligent manner. They include a demonstration of a number of web pages and document sources that were indexed in this manner, including PDFs from the patent office.

chemicalize.org now has 15,000 unique visitors a month, which is huge growth compared to spring 2012. These users contribute to the database every day, making sure it is up to date and reflects new interests as well. The database today contains 327,000 structures that were converted from 545,000 names and identifiers coming from 367,000 webpages.

These structures and links have now been uploaded to PubChem, and if you are interested in what sort of molecules have been registered via chemicalize.org, you can browse them on the PubChem website here




Marvin Updated

Marvin 5.12.2 has been released with a couple of bug fixes.

Molecule Representation

  • Converting explicit hydrogens to implicit ones removed stereo centers that had no explicit hydrogen ligand.

Import/Export SMILES/SMARTS

  • Non-ring bond information was imported as query strings from SMARTS.
  • After SMARTS import, atoms that had no explicit aromatic property but had an aromatic bond were given a query aromaticity property.




Marvin 5.12.1 released

New features and improvements

Import/Export

S orbitals and oval shaped s or p orbitals are imported from CDX/CDXML.

Bugfixes

Painting

  • The charge symbol on carbon atoms was missing when the atom numbers were visible and the display of carbon atom labels was turned off.
  • When two atoms had more than one electron flow arrow between them, the electron flow arrows overlapped each other.
  • The second electron flow arrow started from the wrong position when a single electron and an electron pair flow arrow started from an atom that had both a lone pair and a radical.

Editing

  • Atom Lists and NOT Lists could not be created by typing atomic symbols separated with commas (e.g., "f,br,cl" or "!f,br,cl").

Import/Export

  • MRV and CML export incorrectly wrote out characters that are not supported by the character set.
  • SDF files with an invalid header could not be imported.
  • Deuterium and tritium isotopes were converted to simple hydrogen atoms if a molecule was exported to ChemAxon compressed MOL format (CSMOL).
  • MolExporter.exportToObject() added an extra newline to SMILES.
  • Nitrogens connecting two aromatic rings had a radical after import if the nitrogen was bracketed in the SMILES representation.
  • The absolute stereo flag was missing during InChi export/import and InChiKey export.

Molecule Representation

  • The number of added implicit hydrogen atoms was incorrect in some cases for positively charged sulfur atoms.

Calculation

  • After canonical tautomer generation, the "double cis or trans" bond type information might have been lost in certain cases.




Accessing the Chemical Identifier Resolver from Marvin

With the release of Marvin 5.12.0 users can now also access a custom web-service to extend name to structure conversion, for instance with corporate IDs or common name dictionaries. I thought it might be useful to have a look at this new feature; however, I don’t have a corporate web service that I can use. This is where the Chemical Identifier Resolver (CIR) comes into play.
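
As a rough illustration of what such a name to structure web service looks like from the client side, here is a minimal Python sketch that builds a lookup URL following the public CIR convention of base URL, identifier, then requested representation. The helper names cir_url and resolve are hypothetical and chosen only for this example. It does not show how Marvin itself is configured to call the service.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Public CIR endpoint hosted by the NCI/CADD group.
CIR_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier, representation="smiles"):
    """Build a CIR lookup URL for a name or other chemical identifier.

    Hypothetical helper for illustration only.
    """
    # safe="" percent-encodes every reserved character (spaces, "#", "/"),
    # so the identifier stays a single path segment.
    return f"{CIR_BASE}/{quote(identifier, safe='')}/{representation}"


def resolve(identifier, representation="smiles", timeout=10):
    """Fetch the requested representation as text (needs network access)."""
    with urlopen(cir_url(identifier, representation), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling resolve("aspirin") should return whatever the service currently holds for that name. An identifier the service cannot resolve is expected to raise an HTTP error, so a real script would want to catch urllib.error.HTTPError.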




Marvin 5.12.0 has been released

Marvin 5.12.0 has been released. This is an important update for Mac OS X users, in that image to structure conversion using OSRA and text OCR for scanned documents are now supported on Mac OS X.

OCR_OSR

In addition, Structure Checker configuration can be accessed via URL from MarvinSketch, the Structure Checker application, and via a Structure Checker API call. Users can now also access a custom web-service to extend name to structure conversion, for instance with corporate IDs or common name dictionaries. Typing abbreviated group names is now case sensitive. When pasting an unrecognised format onto the canvas, an "Import as" dialog appears and the user can choose the correct format. Structures can be copied in "Daylight SMARTS" and "ChemAxon SMARTS (CXSMARTS)" formats. The MMFF94 forcefield has been added to Generate3D and can also be used in the Conformer Plugin and Molecular Dynamics Plugin.

The complete release notes are available here




Marvin Update

ChemAxon have just announced an update to Marvin; the latest version is 5.11.5.

A bug that prevented structures copied from ChemDraw or Accelrys Draw from being pasted into MarvinSketch has been fixed.




Marvin Updated

Marvin from ChemAxon has been updated to version 5.11.

New features and improvements

  • Image I/O
    • Recently added rendering options are now available to be set from MolPrinter API (Absolute label visibility, Peptide display type, R-group visibility, Any bond style, Lone pair rendering style, Charge rendering style). Documentation
  • MSketch GUI
  • MSketch applet
  • Graphical object handling
    • When an MMidPoint object was set as an end point for an MPolyLine, getting the MMidPoint location caused a StackOverflowError.
  • Import/Export
    • Document to Structure (d2s)
      • Names broken over two lines with a hyphen (-) are now recognized.
      • Names followed by a superscript text, for instance, a reference or footnote number (e.g., "aspirin11") are now recognized.
    • Name to Structure (n2s)
      • In some cases, such as "4-methylthiophenylmethyl", there is an ambiguity whether "thiophenyl" refers to a compound derived from thiophene or thiophenol. Name to Structure now gives priority to the thiophenol related compound interpretation; though, "thiophenyl" by itself will still be supported as thiophene derivatives.
Bugfixes
  • Painting
    • If R-group visibility was turned off and any of the bonds had label(s) to paint, an ArrayIndexOutOfBounds exception was thrown.
  • Image I/O
    • Display parameters of charge, lone pair, peptide could not be set for molexporter. The default values were charge "in a circle", lone pair "as line", peptide "three letter format". Image copy also used these values.
  • Import/Export
    • MOL, SDF, RXN, RDF
      • Aliphatic query properties of atoms with query string were not read from MDL formats.
      • After importing Extended MOL files that contain superatom S-groups the orientation of S-groups could be changed.
      • Atoms containing both aliphatic and unsaturated query properties were exported incorrectly to MDL formats.
      • SDF import returned structure with incorrect S-group embedding.
    • SMILES/SMARTS
      • SMILES T* option did not export all SDF fields, but only those which appeared in the first molecule.
  • Molecule Representation
    • S-groups
      • Two superatom S-groups being each other's parents caused an infinite loop. In these cases a java.lang.IllegalStateException is now thrown.
    • Valence Check
  • Stereochemistry
    • Cloning of BicyclostereoDescriptor in RxnMolecules threw java.lang.ArrayIndexOutOfBoundsException.
  • Clean 2D
    • Terminal methyl-group in phosphate-ester was cleaned incorrectly.
    • Clean2D could not handle condensed adamantane derivatives. Forum topic
  • Calculations
    • Other (HBDA, Huckel Analysis, ...)
      • The --pH command line option did not work in hydrogen bond acceptor-donor calculation.
  • Structure Checker
    • If a fixer action was not defined, the default fixer was not applied in the structurechecker command line tool.


Scripting Vortex 3

ChemAxon's Calculator (cxcalc) is a really useful command line program in Marvin Beans and JChem that performs chemical calculations using calculator plugins. ChemAxon provides a lot of calculations (e.g. charge, pKa, logP, logD), and others can be added by writing custom plugins. Perhaps one of the most useful is the ability to calculate the acidic and basic pKa. Calculation of pKa is essential to get a reasonable hold on the LogD of a molecule. LogD is probably the most critical physicochemical property in drug discovery: it has a major influence on absorption, cell penetration, metabolism, CYP450 inhibition and induction, PGP transporter activity and activity at the HERG channel, and is often a critical component of any structure activity relationship.
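
To make the link between pKa and LogD concrete, here is a small Python sketch of the textbook approximation for a molecule with a single ionisable group. It assumes that only the neutral form partitions into octanol. This is a simplification for illustration, not the method cxcalc uses, and the function names are made up for this example.

```python
import math


def logd_monoprotic_acid(logp, pka, ph):
    """Approximate logD of a monoprotic acid at the given pH.

    Assumes only the neutral (protonated) form partitions into octanol.
    """
    # The ionised fraction grows as the pH rises above the pKa.
    return logp - math.log10(1.0 + 10.0 ** (ph - pka))


def logd_monoprotic_base(logp, pka, ph):
    """Approximate logD of a monoprotic base at the given pH.

    pka is the pKa of the conjugate acid; only the neutral form is
    assumed to partition into octanol.
    """
    # The ionised fraction grows as the pH drops below the pKa.
    return logp - math.log10(1.0 + 10.0 ** (pka - ph))
```

Under this approximation an acid with logP 3.0 and pKa 4.4 has a LogD close to 0 at pH 7.4, three log units below its logP, which shows why a poor pKa estimate makes the LogD estimate poor as well.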

These scripts make use of cxcalc to generate data columns in Vortex.
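
For readers who have not called cxcalc from a script before, the general pattern is to run it as a subprocess and parse the table it prints. The sketch below is plain Python and does not use the Vortex scripting API. It assumes that cxcalc is on the PATH, that the calculation accepts the --pH option, and that the output is tab-separated with a header row. Check these against cxcalc's own help for your version. The function names are hypothetical.

```python
import csv
import io
import subprocess


def cxcalc_command(calculation, input_file, ph=None):
    """Build the argument list for a cxcalc run (assumed CLI layout)."""
    cmd = ["cxcalc", calculation]
    if ph is not None:
        # Assumption: the chosen calculation supports a --pH option.
        cmd += ["--pH", str(ph)]
    cmd.append(input_file)
    return cmd


def parse_cxcalc_table(text):
    """Parse tab-separated output with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def run_cxcalc(calculation, input_file, ph=None):
    """Run cxcalc and return one dict per molecule (requires cxcalc installed)."""
    result = subprocess.run(
        cxcalc_command(calculation, input_file, ph),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if cxcalc reports a failure
    )
    return parse_cxcalc_table(result.stdout)
```

Each returned dict maps a column header to its value as a string, so a calling script can convert the values it needs to numbers and add them as new columns.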


ChemAxon US UGM

ChemAxon have announced the program and are calling for participants for ChemAxon's 4th US User Group Meeting (US-UGM) which will take place in San Diego, CA on September 27-28. Read More...

Plotting in Instant JChem

Plotting in Instant JChem Read More...

ChemAxon UGM

ChemAxon's US UGM, Sept 27-28, San Diego, CA  Read More...

Marvin Updated

Marvin Updated Read More...

Rule of 7 Applescript

An AppleScript to generate and plot physicochemical properties. It uses cxcalc from ChemAxon to generate the data and Aabel from Gigawiz to plot the results. Read More...

Marvin Update

ChemAxon continue to update their applications and I just thought I'd mention this new Marvin feature. If you select a structure you get a menu option to search either PubChem or ChemSpider. Read More...

ChemAxon User Group Meeting

ChemAxon User Group Meeting: May 19-20th. Training day: May 18th. Read More...

ChemAxon Updates

Updates to Marvin and JChem. Read More...

Applescript Chemistry Workflow

A chemistry workflow created using AppleScript, ChemAxon tools and Aabel. Read More...

LibraryMCS Review

Added LibraryMCS review. Read More...

ChemAxon User Group Meeting

ChemAxon EU and US user group meetings. Read More...

Marvin and JChem updates

ChemAxon have announced a couple of updates. Read More...

ChemAxon UGM

ChemAxon's 2009 European User Group Meeting Read More...

News Updates

A collection of software updates Read More...

Updates roundup

A list of updates that may be of interest.
Read More...

News Snippets

Just back from vacation, here are a few snippets of news. Read More...

ChemAxon User Group Meeting in Boston

I see ChemAxon have announced details of their first US User group meeting Read More...

News

A few updates and news snippets Read More...

ChemAxon's 2008 European User Group Meeting

The program is now available for ChemAxon's 2008 European User Group Meeting being held on Wednesday and Thursday, May 7-8th at the Thermal Hotel Spa in Visegrad, Hungary. Read More...

Marvin and JChem

Marvin and JChem have been updated Read More...

Marvin Update

Marvin 5.0.1 has been released. Read More...

ChemAxon's 2008 European User Group Meeting

ChemAxon's 2008 European User Group Meeting will be held on Wednesday and Thursday, May 7-8 at the Thermal Hotel, in Visegrad, Hungary Read More...

Instant JChem Review

A review of Instant JChem, an easy-to-use chemical structure database. Read More...

Marvin Review

I've written a review of Marvin from ChemAxon. Read More...