A couple of updates
The point and click data analysis tool Wizard Pro has been updated. In particular, this update addresses a couple of issues:
- Preserve filter selections when switching between tables
- Correctly parse numbers surrounded by spaces in CSV files
- Fix a bug where a blank header cell in an Excel spreadsheet caused subsequent columns not to be imported
- Fix some issues with PDF export of model images
- Fix a crash when stacking tables with indicator variables
The reference management package Bookends has also been updated:
- Get DOI was updated to deal with changes made by CrossRef
- This includes dealing with changes made in the way CrossRef encodes accented characters.
- Updated Import From Existing Bibliography to deal with changes made by CrossRef
- Updated Bookends browser to detect DOIs on the Google Scholar web site to deal with changes made by Google
R reaches version 3.0.0
R, the language and environment for statistical computing and graphics, has now reached version 3.0.0.
Whilst there is a list of new features and updates, those listed as most significant are shown below.
- Packages need to be (re-)installed under this version (3.0.0) of R.
- There is a subtle change in behaviour for numeric index values 2^31 and larger. These never used to be legitimate and so were treated as NA, sometimes with a warning. They are now legal for long vectors so there is no longer a warning, and x[2^31] <- y will now extend the vector on a 64-bit platform and give an error on a 32-bit one.
- It is now possible for 64-bit builds to allocate amounts of memory limited only by the OS. It may be wise to use OS facilities (e.g. ulimit in a bash shell, limit in csh) to set limits on overall memory consumption of an R process, particularly in a multi-user environment. A number of packages need a limit of at least 4GB of virtual memory to load. 64-bit Windows builds of R are by default limited in memory usage to the amount of RAM installed: this limit can be changed by command-line option --max-mem-size or setting environment variable R_MAX_MEM_SIZE.
- Negative numbers for colours are consistently an error: previously they were sometimes taken as transparent, sometimes mapped into the current palette and sometimes an error.
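To see why 2^31 is the boundary in the indexing change above: R stores integers as signed 32-bit values, so the largest integer index is 2^31 - 1 and anything larger needs a long vector. The short Python sketch below only illustrates that arithmetic; it is not R code, and the names are chosen purely for illustration.

```python
# Illustrative arithmetic only: these names are not part of R.
INT32_MAX = 2**31 - 1        # largest signed 32-bit integer
FIRST_LONG_INDEX = 2**31     # first index beyond the signed 32-bit range


def needs_long_vector(index: int) -> bool:
    """Return True when an index is beyond the signed 32-bit range."""
    return index > INT32_MAX


print(INT32_MAX)
print(needs_long_vector(INT32_MAX))
print(needs_long_vector(FIRST_LONG_INDEX))
```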
There is a list of data analysis packages for MacOSX here.
Wizard Updated
Wizard, the point-and-click statistical analysis tool for Mac, has been updated.
The focus of this release is supporting several new import formats, including the oft-requested XLSX and Numbers document formats.
A major change in the product line is that reading and writing R files and generating R code have now "graduated" from the Pro version and are available in the Standard version. But Pro users shouldn't feel left out: this release adds support for importing binary SAS files and generating SAS code, both features only available in the Pro version.
New Features:
- Import XLSX spreadsheets
- Import Numbers documents
New Features (Pro Version):
- Import SAS binary files (.sas7bdat)
- Import plain-text data with SAS commands (.sas)
- Generate SAS model estimation commands
New Features (Standard Version):
- Import/export R files
- Generate R commands
Bug fixes
- Fix a crash when zero observations are included in the Model view
- Fix a bug when importing multiple sheets in XLS documents
- Fix a bug where Q-Q plots were not properly exported as PDF
There is a listing of data analysis tools for the Mac here.
A Review of StarDrop 5.3
I’ve just written a review of StarDrop, an application from Optibrium designed to aid decision making for scientists involved in drug discovery, which has recently been updated. Features include:
- Virtual Library Enumeration – The Nova plug-in module for StarDrop now has the added ability to quickly and easily enumerate a virtual library based on a template scaffold that you define with substitution points and variable fragments. You can sketch the groups to substitute at each point, select them from a user-defined or centrally administered library, or take them from a decomposition of another series using the R-group analysis tool in StarDrop
- Data visualisation - now allows you to apply interactive filters to your graphs and plots to quickly focus on the most interesting compounds. StarDrop now also supports the analysis of dates allowing you to explore variations of properties or scores with time
- Clustering - this new tool enables you to easily identify groups of similar compounds within a data set, based on either their structural similarity or properties
- Dataset Filtering - this helps you to remove compounds from a data set with unwanted sub-structures or property values. You can define any number of criteria with which to filter a data set
- Duplicate Removal - when combining compound data from multiple sources it’s common to end up with multiple copies of the same compound in a single data set. The duplicate removal tool makes it easy to find these and choose the entries that you want to keep.
- ADME QSAR – new model for predicting log([Brain]:[Blood]) (the old model remains available for consistency with previously calculated results)
- StarDrop now includes a FieldAlign module which, using Cresset's molecular Field technology, provides a unique, 3-dimensional (3D) insight into the biological activity, properties and interactions of your compounds.
There is a comprehensive list of software reviews here.
Chartsmith added to list of data analysis tools
I’ve just added Chartsmith to the list of data analysis tools.
Chartsmith is the premier charting and graphing application for Mac OS X. Built from the ground up on Mac OS X technologies, this application will make you and your data hum. Whether for scientific data visualization, for business presentation, or for graphics publishing, Chartsmith makes charting and graphing quick and easy.

Chartsmith supports a variety of chart types and can import from Excel or ASCII text files. There is also AppleScript support for automating workflows.
There is a comprehensive listing of data analysis tools for Mac OS X here.
ElementalDB
Way back in the distant past, when I first joined the Pharma industry, I remember working with a dumb terminal running sub-structure queries on a remote mainframe. They seemed to take forever on our relatively modest corporate database, and returning the results would then bring our network to a crawl, much to the annoyance of my colleagues. I’ve just downloaded ElementalDB from Dotmatics; this is an iPad application that does a substructure search of a 1,200,000 structure database in less than a second.
Wizard Pro updated
The data analysis tools Wizard and Wizard Pro have been updated to improve blank cell detection during Excel imports.
Wizard supports the most common statistical tests and models, including...
Univariate Tests: Shapiro-Wilk test of normality, 1-sample Kolmogorov-Smirnov (normality and uniformity), Pearson's goodness-of-fit (equal proportions)
Bivariate Tests: Pearson's goodness-of-fit (chi-square), t-test and ANOVA, Correlation (Pearson product-moment) and R², Mann-Whitney and Kruskal-Wallis, 2-sample and N-sample Kolmogorov-Smirnov
Multivariate Models: Linear regression (OLS), Weighted linear regression (WLS), Poisson and geometric regression, Logistic regression (Logit) and Probit, Multinomial Logit and Ordered Probit, Negative Binomial (NegBin-2), Cox Proportional Hazards
Regression Features: Fixed effects, Robust standard errors, Clustered standard errors, Joint significance tests, Odds ratios, Residual analysis, Interactive prediction assistant
There are many more data analysis tools for Mac OS X here.
WaveMetrics Updates
XOP Toolkit 6.30 is now shipping. This release adds support for Xcode 4.3.2 through 4.6 and for Visual C++ 2012. This release is mostly to keep up with Xcode 4 changes and to add Visual C++ 2012 sample XOPs and documentation. As described in the release notes (Appendix C of the XOP Toolkit 6 manual), a side-effect of keeping up with Xcode 4 is that XOPs compiled by XOP Toolkit 6.30 require Igor Pro 6.20 or later. The requirement of Igor Pro 6.20 is the reason for bumping the XOP Toolkit version from 6.02 to 6.30. "6.30" was chosen because that is the contemporaneous Igor version. If you are a licensed XOP Toolkit 6 user, this is a free update.
IGOR Pro 6.3 has been updated.
New Features
- Added a Batch Curve Fitting package: allows you to fit batches of data to the built-in or user-defined fitting function of your choice. A "batch" is a collection of similar data sets stored in waves to which a common fitting function, initial conditions, and weighting and masking waves have been applied. Each data set may be stored in a waveform, an XY pair, or in the columns of a 2D wave.
- Added the Scatter Dot Plot Panel. Scatter Dot Plots are one part category plot, one part scatter plot, and one part histogram. Like category plots they show total counts for multiple data sets, each labeled on the X axis. Like scatter plots they provide a sense of the data's distribution. Like histograms they sort data into bins of points in which all values fall into a range.
- The Multipeak Fitting 2 package now supports constraints on peak coefficients.
- NewImage supports direct RGBA color image plots.
- The FilterFIR notch filter length had been limited to 4001 points. Now the limit is 2147483647 points, which makes the minimum notch width 0.000107% of the sampling frequency.
There is a page of data analysis tools here.
Scripting Vortex 12
In the previous tutorial we made use of the Virtual Computational Chemistry Laboratory web service to calculate aLogP and LogS; both results were returned in a simple text format. More recently there has been increased use of the JSON format for data exchange.
JSON, or JavaScript Object Notation, is a text-based open standard designed for easy human-readable data interchange. It is derived from the JavaScript scripting language for representing simple data structures and associative arrays, called objects. Despite its relationship to JavaScript, it is language-independent, with parsers available for many languages including C, C++, C#, Java, JavaScript, Perl and Python.
Molinspiration provide a number of cheminformatics tools and also a RESTful web service that can be used to calculate a range of molecular properties and bioactivity predictions.
The output from both web services is available either as a JSON string or as plain text, and each web service can be accessed by submitting a URL.
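As a rough illustration of why JSON output is convenient to script against, the sketch below parses a JSON string with Python's standard json module. The response body, its field names and its values are hypothetical placeholders, not the actual format returned by Molinspiration; the real script linked below should be consulted for that.

```python
import json

# Hypothetical response body: the field names and values are made up for
# illustration and are NOT the actual Molinspiration output format.
response_text = '{"smiles": "c1ccccc1", "properties": {"logP": 2.1, "TPSA": 0.0}}'


def extract_properties(text: str) -> dict:
    """Parse a JSON string and return its 'properties' object as a dict."""
    data = json.loads(text)
    return data["properties"]


props = extract_properties(response_text)
for name, value in props.items():
    print(f"{name}: {value}")
```

Because the parsed result is an ordinary dictionary, each value can be written straight into a new column without any hand-written text parsing.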
Full details of the script are here.

Cheminformatics on a Mac
I gave a talk at the Cambridge Cheminformatics meeting last week and I’ve put the slides here. It was more of a demonstration than a talk but the slides give an overview and links to the various tools.
Updated
A couple of people have asked for a pdf version of the slides for download.
Scripting Vortex:- Accessing a web service
I’ve just added the latest script for Vortex.
In previous scripts we have generated data using a local Java program, C program, Perl script, and SVL program. In this tutorial, rather than have a local application generate the data, we will use a web service.

There are more scripts on the Hints and Tutorial pages.
StarDrop 5.3 is now available
Optibrium have just announced that StarDrop 5.3 is now available with many new features. The highlights include:
- Virtual Library Enumeration – The Nova plug-in module for StarDrop now has the added ability to quickly and easily enumerate a virtual library based on a template scaffold that you define with substitution points and variable fragments. You can sketch the groups to substitute at each point, select them from a user-defined or centrally administered library, or take them from a decomposition of another series using the R-group analysis tool in StarDrop
- Data visualisation - now allows you to apply interactive filters to your graphs and plots to quickly focus on the most interesting compounds. StarDrop now also supports the analysis of dates allowing you to explore variations of properties or scores with time
- Clustering - this new tool enables you to easily identify groups of similar compounds within a data set, based on either their structural similarity or properties
- Dataset Filtering - this helps you to remove compounds from a data set with unwanted sub-structures or property values. You can define any number of criteria with which to filter a data set
- Duplicate Removal - when combining compound data from multiple sources it’s common to end up with multiple copies of the same compound in a single data set. The duplicate removal tool makes it easy to find these and choose the entries that you want to keep.
- ADME QSAR – new model for predicting log([Brain]:[Blood]) (the old model remains available for consistency with previously calculated results)
RapidMiner 5.3 released
RapidMiner v5.3 has been released. RapidMiner provides data integration, ETL, data analysis, and reporting in a single application, with an intuitive, drag and drop visual environment for designing and deploying customized analytical processes, and has been downloaded by over three million users worldwide. In addition to over 100 performance, usability and stability improvements, the new version delivers more powerful data analysis operators and access to popular data sources including Microsoft Excel™ 2007 and SAS™. RapidMiner v5.3 is also fully integrated with the Rapid-I Marketplace, where users can discover and install new RapidMiner extensions published by a growing list of independent developers. Popular extensions include text and web mining, image mining and recommenders. Various Extensions for RapidMiner are available from the update server (go to the "Help" menu and select "Update RapidMiner")
RapidMiner v5.3 also features
- More than 20 new functions for analysis and data handling, including multiple new aggregation functions;
- File operators, including Move File, Rename File, Copy File, Create Directory, and Delete Files, directly from RapidMiner; and
- A macro viewer that shows macros and their values in real time during process execution, for better debugging.
There is a comprehensive list of data analysis tools for Mac OS X here.
DataGraph 3.1 released
I just noticed that DataGraph has been upgraded to version 3.1
DataGraph is a simple and powerful graphing application for Mac OS X. It is a great companion for Excel, Numbers or any of the big statistical packages. Simple because it is very easy to draw plots, bar graphs, and fit functions. Start typing in data and the graph immediately shows up. Pick from the initial template list and modify the data, change colors, resize easily and interactively.
Included on the page of data analysis tools
Data Extractor Updated
Data Extractor solves a problem that advanced users often have: the need to extract data available in text format in one or more files (often thousands and thousands of files) and move them into a table or a database in an ordered and structured form, with fields and records, for archiving and subsequent processing. Data Extractor can parse thousands and thousands of files in a few seconds and collect all the data inside these files using simple instructions on how to recognise data, how to extract them and where to put these data inside Data Extractor tables, ready to be exported.
Included on the page of data analysis tools
Maple 16.02 released
Maple 16.02, a maintenance update, is available to all users running Maple 16. This update contains enhancements to many areas, including:
- Maple 16 now works on Macintosh OS X 10.8
- Connectivity features have been extended to include: MATLAB® 2012b and Microsoft Visual Studio 2012
- Physics: enhancements were made to several areas of the Physics package, including
- Algebraic manipulations of dot products of quantum operators (possibly tensorial) and eigenstates
- Algebraic simplification taking into account commutation/anticommutation rules
- Simplification of tensor products involving the sum rule for repeated indices in the presence of symmetric and antisymmetric tensors
- Dagger, Commutator, AntiCommutator, Bracket
- Normal forms of products of quantum operators
- Saving and loading Physics setup across sessions
- Allowing selective partial clearing of setting using Setup(clear, <…>)
- Copying and pasting of textbook display (typesetting = extended)
- Enhancements were made to memory management/garbage collection.
- Improvements were made to the display of data tables and to custom palettes.
I’ve updated the page of data analysis tools
Tableau Mobile
A few days ago I mentioned Spotfire for the iPad and a couple of readers sent in details of similar applications.
Tableau Mobile is an iPad front-end to an analytics server. Create interactive reports and dashboards in Tableau Desktop then publish them to Tableau Server for secure access on your desktop, on the web or with your iPad.
Similarly SAP BusinessObjects Mobile connects to the SAP BusinessObjects Business Intelligence platform.
Spotfire for iPad
Spotfire for the iPad requires access to a TIBCO Spotfire Web Player Server. By default, the app is connected to our public demo gallery so you can start experiencing data in Spotfire immediately. After that, you can connect to your internal server, or you can connect to public servers and explore Spotfire analytical tools.

Now added to the mobile science page.
Mathematica 9 is released
Wolfram have announced the release of Mathematica 9 with a host of new features
- Optimize your workflow with the Wolfram Predictive Interface: The Wolfram Predictive Interface makes it easy to find and use the power of Mathematica 9. The Input Assistant's context-sensitive autocompletion and dynamic highlighting help you discover and enter commands, and the next-computation Suggestions Bar offers optimized suggestions for what to do next. It's the next step in our ongoing Compute-as-You-Think initiative that began with free-form linguistic input.
- Examine social networks with built-in links to social media: Mathematica 9 introduces a full suite of social network analysis features including community detection, cohesive groups, and centrality measures, plus built-in links to Facebook, LinkedIn, Twitter, and more. It also adds new capabilities for network flows and new graph distributions.
- Work with systemwide support for units: Mathematica 9 introduces a new unit system containing more than 4,500 different units, all integrated with Wolfram|Alpha's sophisticated unit interpretation system. From unit conversion to dimensional analysis, Mathematica provides you with all the tools you need to work with, and extract properties from, units and quantities.
- Use survival analysis, random processes, and other expanded capabilities in data science and visualization: Mathematica offers more statistical distributions than any other system, including specialized coverage of finance, medicine, and engineering. Mathematica 9 adds survival and reliability analysis; full support for random processes including queues, time series, and stochastic differential equations; a complete set of customizable gauges for dashboards and reports; and systemwide support for automatic legends for plots and charts.
- Integrate R code into your Mathematica workflow: Mathematica 9 offers built-in ways to integrate R code into your Mathematica workflow, allowing data exchange between Mathematica and R and execution of R code from within Mathematica. With RLink, R users can use thousands of functions from across the full Mathematica system.
- Deploy interactive documents with enhanced capabilities: Instantly create documents in the Computable Document Format (CDF) to present interactive charts of results, show dynamic models, or prototype your next application, and deploy them to the web or desktop. With Mathematica Enterprise Edition, you can deploy CDFs with live data and other enhanced features.
- Perform powerful 3D volumetric and out-of-core image processing: Mathematica 9 scales up performance to very large 2D- and 3D-volumetric images using out-of-core technology, and builds in a hardware-accelerated rendering engine for 3D images and volumes. Mathematica 9 also adds feature tracking, face detection, image enhancements, and other highly optimized algorithms to perform comprehensive image analysis.
- Use integrated analog and digital signal processing: Filter and analyze sound, images, and multidimensional data with Mathematica 9's signal processing capabilities. Instantly design and deploy interactive filters and simulate them with Wolfram SystemModeler.
- Visualize with new customizable gauges and built-in legends: Mathematica 9 adds a complete set of customizable interactive gauges for dashboards and reports, with built-in support for units. Systemwide support for automatic legends for plots and charts means legends with any style or layout can be added to arbitrary content.
I’ve updated the page of data analysis tools
Wizard Pro 1.1.1
Wizard Pro has been updated; this release fixes a critical bug that prevented saved multivariate models from opening on Mountain Lion. Wizard Pro is $199.99 for Mac OS X 10.6 through 10.8 and is only available in the Mac App Store. Wizard Pro is a multivariate statistics program for data analysis and exploration. The software keeps all work (tables, results, predictions) in a single document with an iTunes-like navigator and provides "interactive interfaces" for querying data. It includes basic statistics tests, regression models and results, stacking and joining of tables, indicator variables with custom logic, and more.
There is a page of data analysis tools here.
Added TOPCAT to data analysis tools
I’ve added TOPCAT to the list of data analysis tools.
TOPCAT is an interactive graphical viewer and editor for tabular data. Its aim is to provide most of the facilities that astronomers need for analysis and manipulation of source catalogues and other tables, though it can be used for non-astronomical data as well. It understands a number of different astronomically important formats (including FITS and VOTable) and more formats can be added. It offers a variety of ways to view and analyse tables, including a browser for the cell data themselves, viewers for information about table and column metadata, and facilities for 1-, 2-, 3- and higher-dimensional visualisation, calculating statistics and joining tables using flexible matching algorithms
I also noticed that Veusz has been updated.
Veusz is a GUI scientific plotting and graphing package written in Python. It is designed to produce publication-ready Postscript or PDF output. SVG, EMF and bitmap export formats are also supported. The program runs under Unix/Linux, Windows or Mac OS X, and binaries are provided. Data can be read from text, CSV or FITS files, and data can be manipulated or examined from within the application
PublishPlot has been updated
PublishPlot has been updated and is now available from the Mac App Store. It is a very handy tool for creating publication-quality plots from any text-based table of data.
PublishPlot is scriptable using either AppleScript or Python.
New in version 1.1 is a tool bar to display x,y location when hovering over a plot, new fitting options and the ability to apply mathematical transforms to any two curves. There are also new export functions and bug fixes.
PublishPlot is included on the page of data analysis tools
MagicPlot Viewer
Sometimes you just need to have a quick look at a data file, and MagicPlot Viewer offers the means to do this.
- Supports text files with different structure
- Auto detection of column delimiter and decimal separator
- Multiple columns for X and Y can be set
- All MagicPlot data navigation tools (zoom, hand, scrolling...)
- Equal scale for all thumbnails can be set
- Shows data point coordinates in status bar
- Quick export and printing of plots
- Additional support of image files (PNG, GIF, JPEG, BMP)
- Fullscreen mode
Added to the data analysis tools page
Viewing Docking results in Vortex using Astex Viewer
I recently wrote a review of ForgeV10 from Cresset in which I imported the results into Vortex to do the analysis. There were, however, two issues with doing this. Firstly, interpretation of the 3D structures is sometimes difficult; this can be resolved by creating a 2D rendering of the structure. The other issue is trying to interpret the docking pose whilst looking at the analysis of the results in, say, a Vortex scatter plot.
I’ve been working with Mike Hartshorn and the people at Dotmatics, who have incorporated OpenAstexViewer (a 3D molecule viewer) into the application. You can read the full article here.
Wizard Pro
I’ve just added Wizard Pro to the page of data analysis tools.
Wizard Pro can import spreadsheets and CSV, plus files from SPSS (.sav, .por, and .sps files), Stata (.dta and .dct files) and R (.RData files). It can export data as CSV and JSON, plus files for SPSS (.sav binary files), Stata (.dta binary files) and R (.RData binary files). It can generate regression commands suitable for verifying results in SPSS, Stata, and R. It is fully multi-core, so regressions run instantly, and it supports millions of rows and thousands of columns with no hard limits.
Vortex script exchange
Vortex is an advanced data analysis package that understands chemistry, and its capabilities can be extended by the use of scripts. I’ve now created a Vortex script exchange that users can use to download or share scripts.
There is also a series of scripting tutorials here to provide a starting point for creating new scripts.
Hopefully these scripts will be valuable to you.
Scripting Vortex 9
I recently wrote a review of ForgeV10 in which I imported the results into Vortex for analysis. This works fine; the only issue is that the resulting structures are 3D, which sometimes makes them difficult to interpret. This script uses OpenBabel to create SMILES which can be rendered as 2D images.
iOS and OS X Graphing Library
iOS and OS X Graphing Library Free For Development
VVI today announced the availability of its graphing library for iPhone, iPad, iPod touch and Macs. Version 10.8.3 of the graphing libraries and frameworks, aka Vvidget Code, brings the following achievements:
- Supports deployment to OS X versions 10.6 to 10.8 (Macs) and iOS versions 4.3 to 5.1 (iPhone, iPad and iPod Touch).
- Supports development on OS X versions 10.6 to 10.8 and Xcode 3.2 to 4.4.1.
- Uses native API on deployment platforms for the fastest and most robust possible implementation. That is, Cocoa Touch for the iPhone, iPad and iPod Touch and Cocoa for the Mac.
- Use for development is free.
- Eleven Vvidget-based applications available from VVI on the iTunes App Store for iPhone, iPad and iPod Touch and on the Mac App Store demonstrate Vvidget Code in actual situations.
- Applications based upon Vvidget Code are free-standing and require no additional installs. Vvidget Code itself can be installed using package installers or shared using free-standing Xcode projects.
- Download and install instructions are at: Download And Install Vvidget Code
Please email sales@vvi.com for additional information.
PublishPlot
I’ve just added PublishPlot to the page of data analysis tools.
PublishPlot Features
- Quickly convert any table of data into a plot
- Customize all features of the plot
- Easily scale the plot to any size while conserving relative sizes of plot features
- Annotate the plot with labels and arrows
- Add error bars
- Do simple data transformations including fits and spline interpolations
- Plot arbitrary functions of x
- Edit data in PublishPlot by simple plain-text editing methods
- Export a plot to a PDF file (or simply drag it to your desktop or to another application)
- Create and transform plots using AppleScripts or Python scripts

Mjograph Updated
MjoGraph is an X-Y graph editor that runs on Mac OSX or on other platforms with Java. It is well customized for researchers, especially in the field of science, whose research work includes computer simulations and visualization of their numerical results.
There are many more data analysis tools listed here
StarDrop update released
A new version of StarDrop is now available. The new features include
- FieldAlign – this new module, using Cresset's molecular Field technology, provides a unique, 3-dimensional (3D) insight into the biological activity, properties and interactions of your compounds, helping to guide the design of novel, potent compounds with a high chance of success. There is a review of FieldView and FieldAlign here.
- R-Group analysis – analyse a chemical series to interactively visualise the impact of variations to R-groups, linkers, atoms or fragments on compound properties. Explore the SAR of your chemistry, identify new optimisation strategies and automatically enumerate the missing combinations
- ADME QSAR – new models for predicting 2C9 pKi, BBB category and P-gp category (the old models remain available for consistency with previously calculated results)
- Nova – now available with the ability to select compounds using a combination of properties and chemical diversity

StarDrop 5.2 coming soon
Optibrium have just announced the imminent release of the next version of StarDrop
The highlight of this new release is the addition of a new plug-in module that provides access to Cresset's FieldAlign™ technology, which offers a unique, 3-dimensional insight into the biological activity of your compounds. This new development is the first result of the technology exchange between Optibrium and Cresset, and adds another powerful tool to StarDrop that will enable you to understand the three-dimensional (3D) structure activity relationship (SAR) of your chemistry. Version 5.2 also introduces new enhancements of StarDrop's core capabilities, in particular a flexible tool for performing automatic R-group analysis. This new feature analyses a chemical series to interactively visualise the impact of variations to R-groups, linkers, atoms or fragments on compound properties, to help chemists further understand the SAR of their chemistry and identify new optimisation strategies.
There are reviews of StarDrop and FieldAlign on the software reviews page and a listing of data analysis packages here.
A Review of CheS-Mapper
I’ve just completed a review of CheS-Mapper.
CheS-Mapper (Chemical Space Mapper) is a 3D viewer for chemical datasets of small molecules. A recent publication in the Journal of Cheminformatics describes the application (DOI: 10.1186/1758-2946-4-7), and more information is available on the wiki page. Whilst there are many applications for the visual analysis of data, very few provide the tools needed to handle chemical structures. CheS-Mapper is a Java application that runs under Mac OS X (I only tested Lion), is based on the Java libraries Jmol, CDK and WEKA, and utilizes OpenBabel and R; it provides an interesting means to explore chemical data sets.

There is a complete list of software reviews here.
Using Flot and Chemical Identifier Resolver
I recently wrote a couple of AppleScripts that use the Chemical Identifier Resolver (CIR), a web service that performs various chemical name to structure conversions, and it occurred to me that it should be possible to use this service to generate images for use as popups on a graph, in the same way that I’ve previously described using Flot and ChemSpider. The ChemSpider approach works well but relies on the structure already being in the ChemSpider database; for novel structures we need a service for generating the image from a chemical identifier. CIR provides a simple web service for doing exactly this: for example, submit a SMILES string and it can return a 2D image.
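As a minimal sketch of the idea, the Python snippet below builds the URL that asks CIR for an image of a structure. The base URL and the trailing "/image" path follow the pattern I understand CIR to use, but treat both as assumptions and check the CIR documentation before relying on them; the function name is purely illustrative.

```python
from urllib.parse import quote

# Assumed URL pattern for the CIR service; verify against the CIR
# documentation before relying on it.
CIR_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def cir_image_url(identifier: str) -> str:
    """Build the URL that asks CIR for a 2D image of a structure."""
    # Percent-encode the identifier so characters such as '#', '=' and '/'
    # in SMILES strings survive inside a URL path.
    return f"{CIR_BASE}/{quote(identifier, safe='')}/image"


print(cir_image_url("c1ccccc1"))
```

The resulting string can be used wherever an image source is expected, for example in the popup shown when hovering over a point on a plot.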
This tutorial shows how to create an interactive plot using Flot and CIR
Data Extractor
Data Extractor allows you to extract data from files and collect them, ready to be exported for later use. Data is collected in records with custom specified fields inside an internal table. Data can be exported at any time. Data Extractor can parse thousands and thousands of files in a few seconds and collect all the data inside these files using simple instructions on how to recognise data, how to extract them and where to put these data inside Data Extractor tables, ready to be exported and transferred to a database.
There is a comprehensive list of data analysis applications for the Mac here.
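To give a feel for what "instructions on how to recognise data" can look like, here is a minimal sketch using regular expressions. It is not how Data Extractor itself works; the field names, the `Name:`/`Score:` line layout and the function `extract_record` are all hypothetical.

```python
import re

# Hypothetical rules: one pattern per field, each capturing the field's value.
FIELD_PATTERNS = {
    "name": re.compile(r"^Name:\s*(.+)$", re.MULTILINE),
    "score": re.compile(r"^Score:\s*([-+]?\d+(?:\.\d+)?)$", re.MULTILINE),
}


def extract_record(text: str) -> dict:
    """Apply every field pattern to one file's text and collect the matches."""
    record = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        # Fields that are not found are stored as None rather than skipped.
        record[field] = match.group(1).strip() if match else None
    return record


sample = "Name: aspirin\nScore: 4.2\n"
print(extract_record(sample))
```

Running the same function over the text of many files would give one record per file, ready to be written out as a table.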
Stardrop Review
I’ve just posted a review of Stardrop an application from Optibrium that is designed to aid decision making for scientists involved in drug discovery.
Vortex script for MayaChemTools
I’ve just added a new Vortex script; this one uses a Perl script that is part of the excellent MayaChemTools.
Scripting Vortex Using OpenBabel
Scripting Vortex 2 Using filter-it
Scripting Vortex 3 Using cxcalc
Scripting Vortex 4 Using MOE
Scripting Vortex 5 Calculating similarities using OpenBabel
Scripting Vortex 6 Filtering compounds
Scripting Vortex 7 Using MayaChemTools
Added JTreeView to data analysis tools
Dotmatics LinkedIn Group
Those who use LinkedIn might be interested to see that Dotmatics now have a dedicated group.
http://www.linkedin.com/groups/Dotmatics-4327915?
I wrote a review of the Dotmatics tools a while back and have written a series of scripts for Vortex.
MagicPlot
I’ve just added MagicPlot to the list of data analysis tools.
MagicPlot looks like a useful plotting/fitting tool that is free for students.
- Publication-quality customizable X-Y plots with multiple axes
- Handy nonlinear fitting
- Visual multi-peak fitting
- Powerful text table import dialog with plot preview
- Data manipulation
- FFT, integration, differentiation, histograms, descriptive statistics (Pro)
- Auto recalculation on data change (Pro)
- Batch Processing without programming (Pro)
- Plot scale navigation with mouse
- Plot style templates (Pro)
- Multi-level undo/redo with history
Scripting Vortex 6
I’ve just added another Vortex script. In this script we make use of the ability of filter-it to categorise input molecules into 1) a set of molecules that fulfil all criteria defined in the filter definition file (passed molecules), and 2) a set of molecules that fail at least one of the defined filter criteria (failed molecules). The filter file defines the criteria for acceptable calculated physicochemical properties and also any substructures that should be included or excluded during the filtering. It is a simple text file that users can define for themselves; there is a detailed explanation on the silicos-it website. They also provide several example filters: “Leadlike”, “Druglike”, “CMCLike”, and “Clean”, which cleans up a file without imposing a “drug like” filter. It should be relatively straightforward for users to create their own filters. One could imagine a rule-of-3 filter for use in fragment-based screening approaches, or a toxicophore filter based on SMARTS shown to be implicated in a specific toxicity. It might also be possible to define project-specific filters if a project requires a specific profile. If you need help it might be worth contacting Silicos-it.
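The passed/failed split can be sketched in a few lines of Python. This is only an illustration of the logic, not filter-it's file format or implementation: the property names, the rule-of-3 style limits and the two example molecules (with made-up property values) are all assumptions.

```python
# Illustrative rule-of-3 style upper limits; NOT a filter-it definition file.
RULE_OF_3 = {"mol_weight": 300.0, "logp": 3.0, "h_donors": 3, "h_acceptors": 3}


def split_by_filter(molecules, limits):
    """Return (passed, failed): passed molecules satisfy every limit."""
    passed, failed = [], []
    for mol in molecules:
        # A molecule fails as soon as one property exceeds its limit.
        ok = all(mol[prop] <= limit for prop, limit in limits.items())
        (passed if ok else failed).append(mol)
    return passed, failed


# Made-up property values, for demonstration only.
molecules = [
    {"name": "fragment_a", "mol_weight": 180.2, "logp": 1.2, "h_donors": 1, "h_acceptors": 3},
    {"name": "compound_b", "mol_weight": 412.5, "logp": 4.1, "h_donors": 2, "h_acceptors": 6},
]
passed, failed = split_by_filter(molecules, RULE_OF_3)
print([m["name"] for m in passed], [m["name"] for m in failed])
```

A real filter would also include substructure (SMARTS) checks, which need a cheminformatics toolkit and are left out here.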
VVI Graph SDK
VVI® today announced the availability of Vvidget Code, its Graph SDK for iPhone, iPad and Mac, version 10.7.6, bringing the following improvements:
The features are extensive and shown by the Graph app on the iTunes and Mac App Store. All the graphs in those applications are now available in the new version. See the links: Graph for iPhone, iPad and iPod touch and Graph for Mac to install those applications and test the Vvidget Code Graph SDK.
iOS:Chart
A chart & graph library for iOS and Mac OS X developers.
- Fully native Objective-C library for direct, easy use in any iOS XCode project.
- Several samples and demo projects to make integration and getting started a snap.
- Over 50 powerful graph types, including bar, line, area, pie, scatter, bubble and waterfall.
- An easy-to-use yet powerful object oriented API gives you full control over your charts with a minimum of effort.
- Real 3D graphs with controls to zoom, pan, rotate and skew!
- Adjust and control every element on every chart. Multiple Y-axis, depth effects, reference lines, scale controls and much more.
- The full power of the PGSDK (charting library of choice for MicroStrategy, IBM/Cognos and many more) now for your mobile application!
Scripting Vortex 5
I’ve just posted the latest tutorial on scripting the chemically intelligent spreadsheet application Vortex, this tutorial shows how to use OpenBabel to provide similarity searching.
The full list of Vortex scripting tutorials is shown below.
Scripting Vortex Using OpenBabel
Scripting Vortex 2 Using Sieve
Scripting Vortex 3 Using cxcalc
Scripting Vortex 4 Using MOE
Scripting Vortex 5 Calculating similarities using OpenBabel
More hints and tutorials can be found here.
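For readers new to similarity searching, a common way to compare two molecular fingerprints is the Tanimoto coefficient. The sketch below shows the calculation on plain Python sets of "on" bit positions; it does not call OpenBabel, the bit positions are made up, and whether the tutorial uses exactly this measure is an assumption.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient: shared on-bits / total distinct on-bits."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(bits_a | bits_b)


# Hypothetical fingerprints given as the positions of the bits that are set.
query = {1, 5, 9, 12}
candidate = {1, 5, 10, 12, 20}
print(tanimoto(query, candidate))
```

A value of 1.0 means the fingerprints are identical and 0.0 means they share no bits, so a similarity search simply ranks candidates by this score against the query.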
Scripting Vortex
This is the fourth tutorial on scripting Vortex, a chemically intelligent data visualisation package. In the previous tutorials we looked at getting data from OpenBabel, sieve, and cxcalc; in this tutorial we will be using MOE as the compute engine. MOE from Chemical Computing Group is probably best known as a graphical user interface to a suite of computational chemistry tools. Whilst this is undoubtedly how many users will interact with the program, it is worth finding out about the command-line tools that are available. These tools are often accessed by pipeline tools such as Knime to allow rapid processing of large files. CCG provides four very useful command-line tools; in particular, sddesc allows the calculation of some or all of the MOE molecular descriptors for each molecular entry.
The Vortex Scripts
Scripting Vortex Using OpenBabel
Scripting Vortex 2 Using Sieve
Scripting Vortex 3 Using cxcalc
Scripting Vortex 4 Using MOE
DataWrangler
You might also want to look at Data Wrangler for an online tool for cleaning up data.
There is a comprehensive list of data analysis packages that run under Mac OSX here
Graph version 10.7.3 available
VVI® today announced the availability of Graph version 10.7.3 on the Mac App Store, bringing the following improvements:
- Copy paste is now implemented for textual (labels) table cells and columns.
- Column paste now accepts many number delimiters such as blank, comma, tab, Return, etc.
- Added Save and Open Panels to export and import data into tables.
- Added a main title to the pie chart.
- Arrow keys now move the table cell editor to the expected adjacent cell instead of moving the text cursor.
- When a sheet is present, ESC and Command-. shortcuts dismiss (cancel) the sheet.
- When the cell editor is used to enter an empty value in the last row of a column then that column length is reduced by one except when the data needs to be rectangular (as in the Z-Values table).
- ESC dismisses (cancels) the cell editor without entering the data.
- Made many small adjustments to the user interface to make it look and perform better.
- Implemented elemental table behaviour in the backend.
There is a list of data analysis applications here
A Review of Data Creator
Something I’m occasionally asked for is a test data set that can be used to evaluate an application. Whilst I keep a couple of data sets that I can use, perhaps Data Creator will provide a more comprehensive solution.
Data Creator is an application that has been designed to fill this important niche. It can be used to build very large data sets using field types defined by the user, which are then filled with random but realistic content. I’ve just added a review of Data Creator.
There is an increasing collection of software reviews here.
KNIME User Group Meeting
I just got this message:-
“Following our very successful user meeting and workshops in 2011, we will be holding a similar event in 2012. The 5th KNIME Workshop and Users Meeting will take place between January 30 and February 3, 2012 at Technopark in Zurich, Switzerland. Early bird registration closes on Jan 15th. You can register here.”
There is a KNIME tutorial here.
Data Creator
I’ve compiled a list of data analysis tools, and sometimes when I’m just trying out a new application I need a set of random data. Data Creator looks like it might be ideal for those sorts of occasions: it can create a structured data table (fields) and fill it with realistic random content (records) with a single click. The data can be saved to disk and imported into databases and applications for test and demonstration purposes. Data Creator can be used to create very large data sets (many thousands of records) for stress testing structures and scripts.
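If you only need a quick table of random records, the same idea can be sketched with the Python standard library. This is not how Data Creator works internally; the field names, value ranges and helper functions below are hypothetical.

```python
import csv
import io
import random

# Hypothetical field definitions: field name -> function producing a value.
FIELDS = {
    "id": lambda rng: rng.randint(1, 10_000),
    "name": lambda rng: rng.choice(["Alice", "Bob", "Carol", "Dave"]),
    "weight_kg": lambda rng: round(rng.uniform(45.0, 110.0), 1),
}


def make_records(n: int, seed: int = 0) -> list:
    """Build n records, each a dict with one random value per field."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    return [{name: gen(rng) for name, gen in FIELDS.items()} for _ in range(n)]


def to_csv(records: list) -> str:
    """Serialise the records as CSV text ready for import elsewhere."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELDS))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_csv(make_records(3)))
```

Increasing `n` gives arbitrarily large tables, which is handy for stress testing an import script.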
Added Solo to list of Data Analysis tools
Solo software equips users to perform PLS, PCA and many other multivariate analyses in a stand-alone, point-and-click environment.
Key Features:
- Data Exploration and Pattern Recognition (Principal Components Analysis (PCA), Parallel Factor Analysis (PARAFAC), Multiway PCA...)
- Classification (SIMCA, k-nearest neighbors, PLS Discriminant Analysis, Support Vector Machine Classification, Clustering (HCA)...)
- Linear and Non-Linear Regression (PLS, Principal Components Regression (PCR), Multiple Linear Regression (MLR), Classical Least Squares (CLS), Support Vector Machine Regression, N-way PLS, Locally Weighted Regression...)
- Self-modeling Curve Resolution, Pure Variable Methods (Multivariate Curve Resolution (MCR), Purity (compare to SIMPLISMA), CODA_DW, CompareLCMS...)
- Curve fitting and Distribution fitting and analysis tools
- Instrument Standardization (Piece-wise Direct, Windowed Piecewise, OSC, Generalized Least Squares Preprocessing...)
- Advanced Graphical Data Set Editing and Visualization Tools
- Advanced Customizable Order-Specific Preprocessing (Centering, Scaling, Smoothing, Derivatizing, Transformations, Baselining...)
- Missing Data Support (SVD and NIPALS)
- Variable Selection (Genetic algorithms, IPLS, Selectivity, VIP...)
There is a listing of data analysis tools for Mac OS X here.
Using Calculation Fields in Vortex
Vortex has tools that allow you to do some analysis, and of course you can use the scripting facility to access statistical or model building packages like R. In this tutorial, however, we will take a model from the literature and implement it within Vortex, using a calculation field to construct the algorithm.
KnowledgeMiner (yX) for Excel 2.9.1 update
Self-organizing, Parallel, High-Dimensional Modeling now for Excel 2011!
New features of KnowledgeMiner (yX) for Excel version 2.9.1:
- [New] Improved charting. Now displays actual vs. predicted data on both predicted and learning data.
- [Update] Tutorial updated.
There is a list of data analysis and plotting tools for Mac OS X here.
Aabel Updated
The Aabel v3.0.6 complimentary update is optimized for the newest Mac OS X version, Lion. In addition to optimizing Aabel v3 for Lion, this update also includes fixes for all known bugs and glitches that have been discovered to date. The performance of Aabel v3 has been enhanced on Lion; this is particularly noticeable on machines with modern graphics cards. The updater that is downloaded from this page can be used on both Snow Leopard and Lion, but the performance-related enhancements are Lion-specific.
There is a comprehensive list of data analysis packages that run under Mac OSX here
Scripting Vortex 3
ChemAxon's Calculator (cxcalc) is a really useful command-line program in Marvin Beans and JChem that performs chemical calculations using calculator plugins. ChemAxon provides a lot of calculations (e.g. charge, pKa, logP, logD), and others can be added by writing custom plugins. Perhaps one of the most useful is the ability to calculate the acidic and basic pKa, since calculation of pKa is essential to get a reasonable hold on the LogD of a molecule. LogD is probably the most critical physicochemical property in drug discovery: it has a major influence on absorption, cell penetration, metabolism, CYP450 inhibition and induction, PGP transporter activity and activity at the HERG channel, and is often a critical component of any structure-activity relationship.
These scripts make use of cxcalc to generate data columns in Vortex.
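To see why pKa matters for LogD, it helps to look at the textbook relationship for a compound with a single ionisable group, assuming only the neutral form partitions into octanol. The sketch below is that simplified approximation, not ChemAxon's algorithm (which handles multiple ionisation sites and ion partitioning); the function names and example values are illustrative.

```python
import math


def logd_acid(logp: float, pka: float, ph: float) -> float:
    """logD of a monoprotic acid: ionisation increases as pH rises above pKa."""
    return logp - math.log10(1.0 + 10.0 ** (ph - pka))


def logd_base(logp: float, pka: float, ph: float) -> float:
    """logD of a monoprotic base: ionisation increases as pH falls below pKa."""
    return logp - math.log10(1.0 + 10.0 ** (pka - ph))


# Illustrative values only: an acid with logP 3.5 and pKa 4.4.
for ph in (2.0, 7.4):
    print(ph, round(logd_acid(3.5, 4.4, ph), 2))
```

At low pH the acid is mostly neutral and LogD is close to logP; at physiological pH it is largely ionised and LogD drops by roughly the difference between pH and pKa, which is why an error in pKa feeds directly into LogD.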
DAQ Plot has been revised to v10.7.2
I just got this email:
This is a one-time email to let you know that DAQ Plot has been revised to v10.7.2 with these new features:
- Runs on Mac OS X 10.6 (Snow Leopard) and 10.7 (Lion).
- Implements up to 16 y-axis time and spectral graphs.
- Implements the default microphone as an enumerated hardware unit.
- Implements direct printing.
- Implements color legend on the main window.
- Implements pop over window for data discovery.
- Includes many adjustments to the user interfaces, general bug fixes and speed improvements.
As it turns out, many customers are delaying upgrading to Lion so this newer version supports both Lion and Snow Leopard as well as makes many significant improvements. Because of this change (the Snow Leopard version was previously EOL, but now is not) we are taking the step of this email to inform you of this current version.
For instructions on upgrading your version of DAQ Plot please email support@vvi.com.
There is a list of data analysis and plotting tools for Mac OS X here.
InfiniteGraph Added to data analysis list
mathStatica 2.5 for Mathematica 8
mathStatica 2.5 for Mathematica 8 includes a new parallel processing engine affording huge performance gains.

Timings in seconds using Mathematica 8.0.4 (latest Oct 2011 release) running on a Mac Pro computer.
For more data analysis tools look at the Data Analysis Applications page.
Vvidget Builder is now available
http://itunes.apple.com/us/app/vvidget-builder/id470597599?mt=12
You may also be interested in the movies:
Shows how to use Vvidget Builder:
http://www.vvidget.org
Shows how to program a Vvidget Code application for the iPhone using Xcode 4.2:
http://www.vvidget.org/develop
StarDrop 5.1 will be available for Mac
Vvidget Builder beta for Lion
Aabel Updated
VVI Graphing beta
KnowledgeMiner (yX) for Excel version 2.8
More data analysis tools for Mac OS X
RapidMiner, SciDAVis, LabPlot, fityk
Data Analysis tools
Mathematica 8.0
Knowledge Miner (yx) updated
DataAnalysis
Knime
KnowledgeMiner (yX) for Excel updated
Aabel Updated
The powerful data analysis and plotting tool Aabel 3 has been updated (version 3.0.4).
KNIME Desktop for Mac OS X
KnowledgeMiner (yX) for Excel
Vvidget Updated
R updated
Developers needed
StatPlus:mac Updated
Aabel Review
Maple Updated
Vortex: Cheminformatics data analysis
Solver now available
Updated Stats site
Molegro Data Modeller
Molegro Data Modeller is a cheminformatics application for Data Mining, Data Modelling, and Data Visualization.
SPSS 16 available
SPSS is one of the real heavyweights in the statistical analysis area but the Mac version lagged behind. It now looks like version 16 brings a major upgrade for Mac users.
