In a recent post, Pat Walters highlighted the use of molfeat in a Google Colab notebook https://colab.research.google.com/github/PatWalters/practicalcheminformaticstutorials/blob/main/mlmodels/QSARin8lines.ipynb.

I thought I’d also mention other tools available from Datamol.io https://datamol.io/#datamol.

datamol.io is an open-source toolkit that simplifies molecular processing and featurization workflows for ML scientists in drug discovery.

Cheminformatics support is all built upon the open-source toolkit RDKit https://rdkit.org. It can be installed using conda (conda install -c conda-forge datamol) or pip (pip install datamol).

The latest version (0.9) appears to need Python 3.9 and RDKit version 2022.03 or 2022.09.
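
If you want to see what is already in your environment before installing, something like the following may help. It is only a minimal sketch using the Python standard library. The distribution names "datamol" and "rdkit" are my assumption about how the packages register themselves, and a conda-installed package may not always expose this metadata, so treat "not installed" as a hint rather than proof.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def environment_report(dist_names=("datamol", "rdkit")):
    """Collect the Python major.minor version and the versions of the named distributions."""
    # The distribution names passed in are assumptions, not confirmed by the Datamol docs.
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in dist_names:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for name, found in environment_report().items():
        print(f"{name}: {found or 'not installed'}")
```

Run as a script, this prints one line per entry so you can compare it against the versions mentioned above.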

There is a comprehensive series of tutorials https://docs.datamol.io/stable/tutorials/The_Basics.html and extensive documentation.

The license is Apache version 2.0.

If you would like to contribute, details are on GitHub https://github.com/datamol-io/datamol.
