I was at the Dotmatics User Group Meeting recently and I thought I’d post a few thoughts and comments.
Dotmatics are a relatively small (but growing) scientific informatics company that was formed as a spinout from a major pharmaceutical company. I’ve known the founders as colleagues and friends for many years and followed their progress with interest. They now provide informatics solutions for over 50 clients around the world.
I’ve written a review of Vortex previously and I thought I’d focus on a few of their other products in this summary.
One of the most difficult tasks on any major project is keeping track of information, particularly if the data is generated by different labs, often in different geographical regions. Gateway is a web-based project/document management tool that allows scientists to store and share information. This can be the latest biological results, presentations and minutes from meetings, notes from conferences, details of competitors’ activities, links to literature etc. The real beauty is that everything is indexed and searchable. If you have ever spent days searching through shared folders, or worse someone’s hard drive, trying to find a critical report then I think you can see how valuable this can be. Whilst there are other enterprise-level project management tools, many require specific software on your machines (and often don’t support Macs). Gateway, like the majority of the Dotmatics software, requires nothing more than a web browser, and thus runs on most platforms, even your iPhone or iPad!
Talking to a few of the users, one of the attractions of Gateway is that they can securely share information with third parties working on a project: there is no need to install anything on the client machines, just send them a URL and a password. With companies increasingly collaborating on projects this seems to be very useful.
Whilst Gateway provides a document-centric view of a project, Browser is a web-based tool for querying and browsing small and large biological and chemical databases. Many people will be familiar with ISISbase-type solutions or Excel spreadsheets; however, there are significant limitations with these solutions. Excel, while the major workhorse for accountants, was never designed to support structure-based searching, and all the solutions I’ve seen seem to break after any update and become dreadfully slow as soon as any reasonably sized data set is involved. Desktop clients are usually difficult to customise or modify and often don’t support the sort of functionality that scientists brought up on the current internet expect, such as pop-ups, display of graphs, movies etc. In addition, from a Mac user’s point of view, they rarely offer good Mac compatibility. By focussing their effort on web-based tools Dotmatics can leverage the rapidly evolving web technologies, and because you don’t have to install anything on client desktops there is a huge saving in IT overhead. By sticking to common standards they can support all major browsers and multiple platforms.
All the data is stored in a SQL database (Oracle, SQL Server, MySQL etc.) and Dotmatics provide Nucleus, a web-based tool for importing, mapping and storing data from existing sources: Excel, SDF, XML, CSV, TSV, sequence data, database data etc. Alternatively Browser can be integrated with existing solutions. It is then possible to use the power of the SQL database to index the data and provide really, really fast searching. Searching supports Boolean logic, allowing the construction of complex queries.
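To give a flavour of what a Boolean query over indexed data looks like at the SQL level, here is a minimal sketch. It is not how Browser builds its queries: the table, column names and values are invented for illustration, and SQLite is used only as a stand-in for whichever SQL database actually holds the data.

```python
import sqlite3

# Hypothetical schema and values, for illustration only; SQLite stands in
# for the real database (Oracle, MySQL etc.).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assay_results ("
    "compound_id TEXT, project TEXT, ic50_nm REAL, solubility_um REAL)"
)
conn.executemany(
    "INSERT INTO assay_results VALUES (?, ?, ?, ?)",
    [
        ("CMPD-1", "kinase", 12.0, 150.0),
        ("CMPD-2", "kinase", 850.0, 300.0),
        ("CMPD-3", "gpcr", 5.0, 20.0),
        ("CMPD-4", "kinase", 40.0, 5.0),
    ],
)
# An index on the columns used in the WHERE clause lets the database
# avoid scanning every row.
conn.execute("CREATE INDEX idx_project_ic50 ON assay_results (project, ic50_nm)")


def potent_and_soluble(connection, project, max_ic50_nm, min_solubility_um):
    """Combine three conditions with AND / NOT in a parameterised query."""
    query = (
        "SELECT compound_id FROM assay_results "
        "WHERE project = ? AND ic50_nm <= ? "
        "AND NOT solubility_um < ? "
        "ORDER BY compound_id"
    )
    rows = connection.execute(query, (project, max_ic50_nm, min_solubility_um))
    return [row[0] for row in rows]


hits = potent_and_soluble(conn, "kinase", 100.0, 50.0)
print(hits)
```

The point of the sketch is simply that once the data sits in a relational database, combining conditions with AND, OR and NOT is something the database does natively, and an index keeps it fast.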
Whilst text-based searching is nothing unusual, the difficulty comes with structure-based searching; for this an Oracle chemical cartridge is used. Browser integrates with any chemical cartridge, so you can take advantage of any existing licenses or preferred methodologies (ChemAxon JChem, Accelrys Accord, ISISDirect, SymyxDirect, IDBS Chemistry, CambridgeSoft Enterprise/Bioassay), but Dotmatics also provide their own Oracle cartridge, Pinpoint. Performance is excellent: indexing more than 3 million compounds takes just 700 seconds on a single desktop CPU, and substructure searches run in sub-second time.
The search results can be displayed as a form or table. Forms can be project-based, or users can use a form-building tool to create their own customised versions. The form builder is a very simple drag-and-drop interface, which is a joy to use when compared to the struggles I’ve had building ISISbase interfaces. At any point you can have pop-up windows that display extra data such as dose-response curves or PK studies.
When viewing a spreadsheet layout of data it is very easy to pivot data on the fly to create summarised views.
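For anyone unfamiliar with pivoting, the idea is to turn a long list of individual results into a table with one row per compound and one column per assay, summarising repeats along the way. The following is a minimal sketch of that operation and not the Browser implementation; the field names, the data and the choice of the mean as the summary are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def pivot(rows, row_key, column_key, value_key, aggregate=mean):
    """Group values by (row, column) and summarise each cell."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[row_key], row[column_key])].append(row[value_key])

    table = defaultdict(dict)
    for (row_label, column_label), values in cells.items():
        table[row_label][column_label] = aggregate(values)
    return {row_label: dict(columns) for row_label, columns in table.items()}


# Hypothetical long-format results: one record per measurement.
results = [
    {"compound": "CMPD-1", "assay": "binding", "ic50_nm": 10.0},
    {"compound": "CMPD-1", "assay": "binding", "ic50_nm": 14.0},
    {"compound": "CMPD-1", "assay": "functional", "ic50_nm": 30.0},
    {"compound": "CMPD-2", "assay": "binding", "ic50_nm": 200.0},
]

summary = pivot(results, "compound", "assay", "ic50_nm")
print(summary)
```

Swapping `aggregate` for `max`, `min` or `len` gives other summarised views of the same data without touching the underlying records.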
Compound registration is achieved using Register, a web-based tool for single and batch compound registration. Additionally, Register can be used to manage and track other assets such as QC data, sample location and amounts, and notebooks, amongst others. They use InChI codes to search for duplicates during registration at “Parent”, “Batch” and “Sample” level, where “Batch” would include different salt forms and “Sample” would refer to repeat preparations. All of this can be modified to accommodate internal business rules.
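A rough sketch of how a three-level duplicate check of this kind might work is shown below. This is my own illustration rather than the Register logic: the class, the ID format and the rule that a batch is identified by parent plus salt form are assumptions, and the InChI string is assumed to have been generated upstream for the salt-stripped parent structure.

```python
class Registry:
    """Toy registry with Parent / Batch / Sample levels (hypothetical ID scheme)."""

    def __init__(self):
        self.parents = {}  # parent InChI -> parent ID
        self.batches = {}  # (parent ID, salt form) -> batch ID
        self.samples = []  # (sample ID, batch ID), one per preparation

    def register(self, parent_inchi, salt_form):
        # Parent level: the same InChI always maps to the same parent.
        parent_id = self.parents.get(parent_inchi)
        new_parent = parent_id is None
        if new_parent:
            parent_id = f"P{len(self.parents) + 1}"
            self.parents[parent_inchi] = parent_id

        # Batch level: a different salt form of a known parent is a new batch.
        batch_key = (parent_id, salt_form)
        batch_id = self.batches.get(batch_key)
        new_batch = batch_id is None
        if new_batch:
            count = sum(1 for parent, _ in self.batches if parent == parent_id)
            batch_id = f"{parent_id}-B{count + 1}"
            self.batches[batch_key] = batch_id

        # Sample level: every repeat preparation gets its own sample ID.
        count = sum(1 for _, batch in self.samples if batch == batch_id)
        sample_id = f"{batch_id}-S{count + 1}"
        self.samples.append((sample_id, batch_id))
        return sample_id, new_parent, new_batch


METHYLAMINE = "InChI=1S/CH5N/c1-2/h2H2,1H3"

registry = Registry()
first = registry.register(METHYLAMINE, "free base")
second = registry.register(METHYLAMINE, "HCl")
third = registry.register(METHYLAMINE, "HCl")
print(first, second, third)
```

Here the second registration is recognised as a known parent with a new salt form, and the third as a repeat preparation of an existing batch. Real business rules will of course be more involved, which is presumably why the rules are configurable.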
Whilst there are many desktop tools for the capture and analysis of biological data, anyone who has been on a project generating large amounts of data realises very soon that specialist tools are required for analysing and sharing information. Studies is a web-based tool for study design, management, and data analysis. Studies can accommodate a wide variety of study types, including screening (plate and non-plate-based assays) generating % inhibition, IC50/EC50, Ki and user-defined analyses. Studies can also be used to capture and store images from image analysis or in vivo experiments, by defining document templates to be completed for each study. All data is stored in an Oracle database and is instantly searchable. The beauty of a web-based tool is that it can be accessed from different locations or even from a mobile device. There are very nice plate-mapping tools and a variety of data analysis and curve-fitting protocols available, and the scientist can choose pass/fail criteria at all stages to allow counter-validation before publishing. Studies also supports in vivo and DMPK studies, and you can save regularly used protocols.
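For readers less familiar with the numbers involved, here is a minimal sketch of the two most common calculations: % inhibition from plate controls, and a rough IC50. Note that the IC50 here is estimated by simple log-linear interpolation between the two concentrations that bracket 50% inhibition, which is a deliberately crude stand-in for the proper curve fitting a tool like Studies offers. The function names and the data are invented for the example.

```python
import math


def percent_inhibition(signal, negative_control, positive_control):
    """negative_control is the uninhibited signal (0%),
    positive_control is the fully inhibited signal (100%)."""
    return 100.0 * (negative_control - signal) / (negative_control - positive_control)


def interpolate_ic50(concentrations, inhibitions):
    """Estimate the concentration giving 50% inhibition by interpolating
    on a log10 concentration scale. Returns None if 50% is not bracketed."""
    points = sorted(zip(concentrations, inhibitions))
    for (c_low, i_low), (c_high, i_high) in zip(points, points[1:]):
        if i_low <= 50.0 <= i_high and i_high != i_low:
            fraction = (50.0 - i_low) / (i_high - i_low)
            log_ic50 = math.log10(c_low) + fraction * (
                math.log10(c_high) - math.log10(c_low)
            )
            return 10 ** log_ic50
    return None


# Hypothetical raw signals at four concentrations (nM).
concentrations_nm = [1.0, 10.0, 100.0, 1000.0]
raw_signals = [900.0, 750.0, 250.0, 50.0]
inhibitions = [percent_inhibition(s, 1000.0, 0.0) for s in raw_signals]

ic50_nm = interpolate_ic50(concentrations_nm, inhibitions)
print(inhibitions, ic50_nm)
```

A pass/fail criterion could then be as simple as rejecting any curve where `interpolate_ic50` returns `None`, i.e. where the tested range never crosses 50%.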
At the moment, whilst they support structure-based searching, including the role in a reaction (reactant or product), they do not as yet support reaction searching. In general this requires the user to map every reaction and it is not clear that users would be willing to do this. Reaction set-up is very straightforward, with automatic molecular property calculations or lookups for registered reagents, and stoichiometry is calculated automatically. In conversation they also said they could accommodate internal business rules with respect to hazard assessment. The experimental can be typed in or constructed using potted fragments of standard protocols or work-ups. Spectroscopic information can be attached to the page as an image.
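The stoichiometry calculation itself is simple arithmetic, which is exactly why it is nice to have it done for you. As a sketch of what is being automated (the function names, reagent names and numbers are made up, and molecular weights are assumed to have been calculated or looked up already):

```python
def reagent_table(limiting_mass_mg, limiting_mw, reagents):
    """Scale each reagent from the limiting reagent.

    reagents is a list of (name, molecular weight, equivalents).
    Masses are in mg and molecular weights in g/mol, so amounts are in mmol.
    """
    limiting_mmol = limiting_mass_mg / limiting_mw
    table = []
    for name, mw, equivalents in reagents:
        mmol = limiting_mmol * equivalents
        table.append({"name": name, "mmol": mmol, "mass_mg": mmol * mw})
    return limiting_mmol, table


def percent_yield(product_mass_mg, product_mw, limiting_mmol):
    """Isolated yield relative to the limiting reagent."""
    return 100.0 * (product_mass_mg / product_mw) / limiting_mmol


# Hypothetical reaction: 200 mg of a limiting reagent with MW 100.
limiting_mmol, table = reagent_table(200.0, 100.0, [("reagent B", 50.0, 1.5)])
yield_percent = percent_yield(180.0, 120.0, limiting_mmol)
print(limiting_mmol, table, yield_percent)
```

With these numbers the limiting reagent is 2 mmol, 1.5 equivalents of reagent B is 3 mmol (150 mg), and 180 mg of a product with MW 120 corresponds to a 75% yield.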
IP is secured - when completing a page an encrypted PDF version is captured and stored on the server.
One nice feature of the suite is the ability to manage experimental requests, so a chemist might ask for a binding assay to be done on a compound, or the project director can use it for selecting compounds for ADME studies. These requests then appear for the relevant scientist who will do the study, who can then schedule and prioritise them.
Vortex is a chemically aware data analysis tool that can import files in a variety of formats. It provides depiction and structure-based searching, together with property calculations, tightly integrated with excellent charting and analysis tools. Whilst there are a number of statistics and data analysis applications available for Mac OS X (see this table), none have any chemical intelligence, in particular the ability to render chemical structures, the ability to calculate chemical and physicochemical properties based on the structures, and most importantly the ability to search based on chemical structure or substructure.
You can then explore the data using a variety of display types ranging from simple scatter plots, bar charts, or histograms right through to complex 3D plots. All plots are interlinked, so highlighting points on one plot automatically highlights the same points in other plots and in the spreadsheet. Scatter plots in 2D or 3D can be annotated with structural depiction, regression lines (linear, EC50 etc), background images, group lines etc. Data points can be tagged with colour, labels, size, transparency and symbol, and can have mouse-over events, such as display URLs, attached. You can also add a grid view of molecules to enable easier comparison of chemical structures. Vortex also supports scripting, and Dotmatics provide examples of linking to R for statistical analysis.
There were also presentations from Takeda, Heptares Pharmaceuticals, Arrow Therapeutics and GSK. One of the striking things from these presentations was the apparent ease with which the Dotmatics tools were integrated with other vendors’ databases and with existing in-house tools, seemingly without much support from the internal IT departments.
I’m sure there are some who would prefer a desktop client, but I’ve seen laptops lost or stolen, personal backups fail (if done at all), issues with OS and application updates, and problems sharing common data. Once you have more than a handful of people working together a web-based solution is very attractive. In addition, from a Mac user’s viewpoint, the Dotmatics software provides an enterprise-quality solution in which all platforms are on an equal standing, and support for iPhone, iPad etc is built in.