Reaction Attempts on ChemSpider
Just as we have done with the Open Notebook Science Solubility Challenge, we are adding more structure to the UsefulChem project.
This is a little bit more difficult because the UC notebook represents mainly chemical reactions, while the ONSC data are simply solubility measurements. Since most of the UC reactions are Ugi reactions, we have been keeping summary data in the CombiUgi Google Spreadsheet, which is completely specialized for this reaction and variations in our reaction conditions. This lets us search or sort by reactant, concentration, solvent, etc. However, we cannot do substructure searching directly using the CombiUgi sheet and we cannot add other types of reactions.
In order to enable substructure searching and add other reactions, Antony Williams has created 2 new data sources in ChemSpider: Attempted Reactions - Reactants and Attempted Reactions - Products. The data represented in the CombiUgi sheet has been restructured into 2 new Google Spreadsheets: RXIDs Reaction Attempts and Reaction Attempts.
Both of these sheets use a common Reaction ID to tie together an unlimited number of reactants and products (Reaction Attempts) and other pertinent reaction conditions (RXIDs Reaction Attempts), such as the concentration of the limiting reagent, the solvent, yield, notes, etc.
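To make the two-sheet layout concrete, here is a minimal sketch of how a shared Reaction ID can tie the rows together. The column names (rxid, role, compound, solvent and so on) and all of the values are made up for illustration; they are not the actual headers or contents of the Google Spreadsheets.

```python
# Hypothetical rows standing in for the "Reaction Attempts" sheet:
# one row per compound per reaction. Column names and values are assumptions.
reaction_attempts = [
    {"rxid": 1, "role": "reactant", "compound": "benzoic acid"},
    {"rxid": 1, "role": "reactant", "compound": "benzaldehyde"},
    {"rxid": 1, "role": "product", "compound": "Ugi product A"},
    {"rxid": 2, "role": "reactant", "compound": "benzoic acid"},
]

# Hypothetical rows standing in for the "RXIDs Reaction Attempts" sheet:
# one entry per reaction, keyed by the same Reaction ID.
rxid_conditions = {
    1: {"solvent": "methanol", "limiting_conc_M": 0.5, "notes": "precipitate"},
    2: {"solvent": "methanol", "limiting_conc_M": 1.0, "notes": "in progress"},
}


def assemble_reactions(attempt_rows, conditions):
    """Group compounds by Reaction ID and attach that reaction's conditions."""
    reactions = {}
    for row in attempt_rows:
        entry = reactions.setdefault(
            row["rxid"],
            {
                "reactant": [],
                "product": [],
                "conditions": conditions.get(row["rxid"], {}),
            },
        )
        # Any number of reactants or products can accumulate under one ID.
        entry[row["role"]].append(row["compound"])
    return reactions


reactions = assemble_reactions(reaction_attempts, rxid_conditions)
print(reactions[1]["reactant"])
print(reactions[2]["conditions"]["notes"])
```

Because each compound gets its own row, a reaction with no isolated product (like the second one here) simply has an empty product list rather than needing a special case.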
Currently only the data in the Reaction Attempts sheet has been imported into ChemSpider. But this alone gives us new functionality: we can perform substructure searches for either reactants or products.
For example, let's say we want to search for all reaction attempts using aromatic carboxylic acids. First we simply do a substructure search on ChemSpider, drawing benzoic acid and selecting Attempted Reactions - Reactants as the Data Source.
This pulls up 8 compounds that were used as a reactant at least once.
Clicking on one of these hits brings us to the ChemSpider entry. Selecting the Syntheses tab in the Data Sources shows links to the lab notebook pages where this compound was used.
The system is configured to accept anything from reactions with fully characterized products to reactions where products were not isolated, or even reactions in progress. I'm not using the term "failed reaction" because the term has no meaning without the context of the objective of the reaction. In our Ugi reactions we are typically looking for the product to precipitate out. By our criteria, reactions where no precipitate was observed after a few days would be classified as "failed". However, it may well be that product was formed but did not precipitate. Even when product is obtained, some might consider 30% isolated yields to be failures, while others would not. Context is everything in qualifying success.
But even with a clear definition of success, many reactions are simply neither successes nor failures. Reactions in progress fall into that category. The student may even have completed the reaction but not yet analyzed the results. But that doesn't matter so much if the raw monitoring data has been provided.
The general structure of this database means that we can add not only our reactions but those of anybody. Even in cases where someone does not have an Open Notebook, just providing a link to contact information of the researcher could be very useful to start a conversation. In that case the system would function more as a social networking platform - connecting researchers who work on similar molecules.
I don't think people are willing to do extensive write-ups for what they consider to be "failed experiments". However, if all that is requested is the list of reactants and target products, that may not be such a burden if it potentially means connecting with another researcher who can help, or even starting a new collaboration.
Currently ChemSpider does not take into account the information in the RXIDs Reaction Attempts sheet, but we hope to be able to make use of that at some point. That would let us do more sophisticated searches, such as finding any reaction attempt where an aromatic carboxylic acid was reacted with an aliphatic amine in methanol.
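As a rough sketch of what such a combined query might look like once the reaction conditions are available, the example below filters on both reactant class and solvent. Everything in it is hypothetical: the records are made up, and the class labels are hand-assigned stand-ins for what a real substructure search would determine. It is not how ChemSpider works.

```python
# Made-up reaction records. "reactant_classes" holds hand-assigned labels
# that stand in for the result of a substructure search (not performed here).
sample_reactions = [
    {
        "rxid": 1,
        "solvent": "methanol",
        "reactant_classes": {"aromatic carboxylic acid", "aliphatic amine"},
    },
    {
        "rxid": 2,
        "solvent": "ethanol",
        "reactant_classes": {"aromatic carboxylic acid", "aliphatic amine"},
    },
    {
        "rxid": 3,
        "solvent": "methanol",
        "reactant_classes": {"aromatic carboxylic acid", "aromatic amine"},
    },
]


def find_attempts(reactions, required_classes, solvent):
    """Return Reaction IDs run in `solvent` whose reactants cover every required class."""
    required = set(required_classes)
    return [
        rxn["rxid"]
        for rxn in reactions
        if rxn["solvent"] == solvent and required <= rxn["reactant_classes"]
    ]


hits = find_attempts(
    sample_reactions,
    {"aromatic carboxylic acid", "aliphatic amine"},
    "methanol",
)
print(hits)
```

Only the first record satisfies both conditions: the second uses a different solvent and the third uses an aromatic rather than an aliphatic amine.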
Andrew Lang has also provided the information of the 2 spreadsheets as XML:
[Note: if viewing in Firefox, select View Source to see all the XML]
We will likely use these live feeds for performing more sophisticated queries and we welcome others to use them for any purpose.