A Jupyter Notebook to access structures and data from the Idler master worksheet

There was an interesting publication from the Todd group at UCL on ChemRxiv, “Idler Compounds: A Simple Protocol for Openly Sharing Fridge Contents for Cross-Screening” https://chemrxiv.org/doi/10.26434/chemrxiv-2025-nqjb4.

Matt Todd is heavily involved in a number of open-source drug discovery projects, and this paper highlights the opportunity this brings for sharing molecules that have been made for one project so that they can be screened against other, unrelated biological targets.

Since the structures are in the public domain, anyone can access them; details are on GitHub at https://todd-lers.github.io/about/idler.html. However, whilst a Google Sheet does provide easy access, it is not chemically intelligent. This Jupyter notebook shows how to download the data, import it into a pandas data frame, and then use RDKit to convert the SMILES strings to molecular objects, which can then be used to calculate physicochemical properties.

GetIdlerCompounds

A couple of points

In the first cell we use wget, a tool for downloading files using HTTP, HTTPS, FTP and FTPS. Note that it is preceded by an exclamation mark.
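If wget is not installed (for example on some Windows setups), the same download can be sketched in pure Python. This is only an illustration, not what the notebook itself does: the function name download_file is made up here, and SHEET_URL is a placeholder rather than the real address of the worksheet, which should be taken from the Idler page.

```python
from pathlib import Path
from urllib.request import urlopen

# Placeholder only (an assumption): replace with the export link for the
# Idler master worksheet given on the project page.
SHEET_URL = "https://example.org/idler-master-worksheet.tsv"


def download_file(url, dest="example.tsv", timeout=30):
    """Fetch `url` and write the raw bytes to `dest`; return the path written."""
    dest_path = Path(dest)
    with urlopen(url, timeout=timeout) as response:
        dest_path.write_bytes(response.read())
    return dest_path


# In a notebook cell this would be called as:
# download_file(SHEET_URL)
```

The file name example.tsv matches the one used in the notebook, so the later cells can be run unchanged.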

The exclamation mark allows Jupyter to run shell commands within cells. The file is saved as example.tsv (tab-separated format) and can then be imported into a pandas data frame using RDKit tools. However, the first row of the file contains a description and the final row is a comment.

We don’t want to import these rows, so we skip the header and footer, using Python as the parser engine (the Python parser engine is more feature-complete).
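As a rough sketch of that import step, the snippet below reads a small in-memory stand-in for example.tsv. The sample rows and the column names ID and SMILES are assumptions made for illustration; the real worksheet has its own columns. The skipfooter option is not supported by pandas' default C parser, which is why the Python engine is selected.

```python
import io

import pandas as pd

# Hypothetical stand-in for example.tsv: a description row, the column
# headings, two data rows, and a trailing comment row.
sample_tsv = (
    "Idler compounds master worksheet (description row)\n"
    "ID\tSMILES\n"
    "CPD-1\tCCO\n"
    "CPD-2\tc1ccccc1\n"
    "Comment: this final row is not data"
)


def read_idler_tsv(source):
    """Read a tab-separated file, dropping the first and last rows."""
    return pd.read_csv(
        source,
        sep="\t",
        skiprows=1,       # skip the description row above the column headings
        skipfooter=1,     # skip the trailing comment row
        engine="python",  # skipfooter requires the Python parser engine
    )


df = read_idler_tsv(io.StringIO(sample_tsv))
print(df)
```

With the downloaded file, the call would be read_idler_tsv("example.tsv") instead of the in-memory sample.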

The SMILES strings are then converted to RDKit molecular objects.
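A minimal sketch of the conversion, assuming the SMILES strings sit in a column called SMILES (the column names and example rows here are made up). Chem.MolFromSmiles returns None when a string cannot be parsed, so it is worth filtering those rows out before going further.

```python
import pandas as pd
from rdkit import Chem

# Hypothetical example rows; the third SMILES has an unclosed ring and
# cannot be parsed.
df = pd.DataFrame(
    {
        "ID": ["CPD-1", "CPD-2", "CPD-3"],
        "SMILES": ["CCO", "c1ccccc1", "C1CC"],
    }
)


def smiles_to_mol(smiles):
    """Return an RDKit molecule, or None for missing or unparsable SMILES."""
    if not isinstance(smiles, str):
        return None
    return Chem.MolFromSmiles(smiles)


df["Mol"] = df["SMILES"].apply(smiles_to_mol)

# Keep only the rows where RDKit produced a molecule.
valid = df[df["Mol"].notna()].copy()
print(f"{len(valid)} of {len(df)} SMILES parsed")
```

RDKit's PandasTools module offers AddMoleculeColumnToFrame as an alternative that also enables structure rendering in the notebook.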

These can then be rendered using a couple of options. A variety of physicochemical properties are then calculated and plotted using seaborn.
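The property step might look something like the sketch below. The particular descriptors (molecular weight, LogP, TPSA, hydrogen-bond donors and acceptors) and the example molecules are choices made here for illustration, not necessarily the set used in the notebook.

```python
import pandas as pd
import seaborn as sns
from rdkit import Chem
from rdkit.Chem import Descriptors

# Hypothetical example molecules: ethanol, benzene and aspirin.
smiles_list = ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"]
mols = [Chem.MolFromSmiles(s) for s in smiles_list]


def calc_properties(mol):
    """Return a dict of a few common physicochemical descriptors."""
    return {
        "MolWt": Descriptors.MolWt(mol),
        "LogP": Descriptors.MolLogP(mol),
        "TPSA": Descriptors.TPSA(mol),
        "HBD": Descriptors.NumHDonors(mol),
        "HBA": Descriptors.NumHAcceptors(mol),
    }


props = pd.DataFrame([calc_properties(m) for m in mols])
props.insert(0, "SMILES", smiles_list)

# Scatter plot of lipophilicity against molecular weight.
ax = sns.scatterplot(data=props, x="MolWt", y="LogP")
ax.set_title("LogP vs molecular weight")
```

For rendering, one option in RDKit is Draw.MolsToGridImage(mols), which lays the structures out in a grid inside the notebook.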
