In a recent post, Pat Walters highlighted the use of molfeat in a Google Colab notebook https://colab.research.google.com/github/PatWalters/practicalcheminformaticstutorials/blob/main/mlmodels/QSARin8lines.ipynb.
I thought I’d also mention other tools available from Datamol.io https://datamol.io/#datamol.
datamol.io is an open-source toolkit that simplifies molecular processing and featurization workflows for ML scientists in drug discovery.
Cheminformatics support is built entirely on the open-source toolkit RDKit https://rdkit.org. It can be installed using conda:
conda install -c conda-forge datamol
Or with pip:
pip install datamol
The latest version (0.9) appears to need Python 3.9 and RDKit 2022.03 or 2022.09.
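If you want to check an existing environment against that note, something like the following sketch might help. It uses only the Python standard library. Reading the note as "Python 3.9 or later" is my assumption, as are the distribution names "datamol" and "rdkit" (a conda-installed RDKit may not always report its version this way).

```python
import sys
from importlib import metadata

# Assumption: the version note is read as "Python 3.9 or later".
MIN_PYTHON = (3, 9)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_ok())
    # Distribution names below are assumptions, not taken from the datamol docs.
    for name in ("datamol", "rdkit"):
        print(name, "->", installed_version(name))
```

A result of None simply means no package metadata was found under that name, so treat it as a hint rather than proof that the package is missing.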
There is a comprehensive series of tutorials https://docs.datamol.io/stable/tutorials/The_Basics.html and extensive documentation.
The license is Apache 2.0.
If you would like to contribute, details are on GitHub https://github.com/datamol-io/datamol.