Friday, November 21, 2008

What is the Solubility of Vanillin in Methanol?

If we've learned anything in the past few months, we've learned that measuring solubility is really tricky.

The Open Notebook Science Challenge has generated 11 answers so far for the solubility of vanillin in methanol:
Rajarshi Guha has provided an extremely handy web query interface (must use Firefox) to generate these plots. It taps into live data from this Google Spreadsheet and links back to the specific experiments that generated the data.

Because we have access to the lab notebook pages, we can see that these measurements are not all equally reliable. Some are based on reports where conditions that later turned out to be important were not reported or controlled. As we learn more about what is important, many of these measurements will probably be removed and replaced with more reliable data.

But in the meantime, we're going to use the best possible estimate of the property that we have available. It lets Rajarshi feed his solubility models and gives us a tight iteration cycle between prediction and experiment. For this purpose, the average value of 3.5 M for the 11 measurements is probably good enough to be part of a training set to allow a rough prediction of solubility. As we get more confident over time we'll improve the model.

Right now, we're not quite ready to do predictions but we should be there soon. The main feedback we're getting now is which compounds we need to focus on to get to that minimum training set (Rajarshi says 50 compounds/solvent and we have about half that number for some solvents). It looks like we'll focus on aromatic aldehydes and aromatic carboxylic acids, mainly because many don't evaporate easily in the SpeedVac (one of the control parameters discussed earlier).

Another advantage of aromatics is that we can use UV spectroscopy to determine solubility without using evaporation. Hopefully in the coming weeks this will confirm what Jenny Hale has concluded today in ONSC-EXP011:
The results of the calculations give the solubility of vanillin in ethanol as 2.48 M and vanillin in methanol as 4.15 M. This finally gives excellent correlation with exp207, which measured the solubilities as 2.5 and 4.19 M respectively.
It appears that some compounds require significant time and agitation to reach saturation. In this last experiment Jenny carefully recorded what happens over the course of adding vanillin to methanol and periodically vortexing. Inspection of her log shows several points where someone might have assessed the solution to be saturated when it was just slow to dissolve. It also makes a case for always wearing safety goggles in the lab :)
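To put a number on the agreement quoted above, here is a minimal sketch that compares the ONSC-EXP011 and exp207 values. The helper name `percent_difference` and the choice of a symmetric percent difference (relative to the mean of the two values) are my own assumptions for illustration, not part of either notebook page.

```python
def percent_difference(a: float, b: float) -> float:
    """Symmetric percent difference: |a - b| relative to the mean of a and b."""
    return abs(a - b) / ((a + b) / 2) * 100

# Solubilities of vanillin in mol/L as quoted in the post:
# (ONSC-EXP011 value, exp207 value)
pairs = {
    "ethanol": (2.48, 2.5),
    "methanol": (4.15, 4.19),
}

# Hypothetical summary dict, keyed by solvent
differences = {
    solvent: percent_difference(exp011, exp207)
    for solvent, (exp011, exp207) in pairs.items()
}

for solvent, diff in differences.items():
    print(f"{solvent}: {diff:.2f}% difference")
```

Both pairs differ by less than 1%, which is small next to the spread implied by the 3.5 M average of all 11 measurements.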

At this point I am becoming more convinced that the solubility of vanillin in methanol is closer to 4.2 M. If that result is consistently obtained by other students and other methods (such as UV) using prolonged mixing times, then we'll remove from the SolubilitiesSum spreadsheet the measurements obtained from experiments where the mixing time was shorter or simply not reported.

The evolution of this project also demonstrates the value of the ongoing open peer review of an open lab notebook. The judges for the ONS challenge have provided feedback about future experiments, questioned assertions, pointed out omissions and suggested additional ways of thinking about the experiments. The contributions from the judges show up in bold in the notebook pages and can be tracked over time by looking at the wiki page history.


Thursday, November 20, 2008

Nature Sponsors Open Notebook Science Challenge

I'm pleased to announce that the Nature Publishing Group will provide one-year subscriptions to the journal Nature to the first three Submeta Open Notebook Science Award winners. The first award is expected to be announced December 1, 2008. The Open Notebook Science Challenge is an open call to crowdsource solubility measurements in non-aqueous solvents. Participating students from the US and the UK who meet eligibility criteria are welcome to apply for one of ten Submeta ONS Awards.


Sunday, September 28, 2008

Open Notebook Science Challenge

The wiki for the Open Notebook Science Challenge that I proposed during my UK trip is now available. We are currently looking for sponsors and participants.

Open Notebook Science Challenge

What?

The first round of this challenge calls upon groups or individuals with access to materials and equipment to measure the solubility of compounds in organic solvents and report their findings using Open Notebook Science.

Why?

Understanding exactly how an experiment was performed is essential to the efficient progress of science. There are no absolute facts in the scientific literature; every measurement reported is only meaningful within the full context of how it was generated. The purpose of a laboratory notebook is to report as much of this context as is reasonable. But to find trends, data must be abstracted to a level where they can be manipulated in tables and charts. This is not a problem as long as one can drill down from each data point in a chart to the full context found in the laboratory notebook.

For example, a Google search for "vanillin solubility in THF" pulls up a lab book page EXP207 where it is reported to be 3.89 M. This number might be used in a table of someone trying to quantify trends or test a mathematical model, in which case reliability of the number is important. By reading the lab notebook page it becomes clear that 118.5 mg of solid was measured on a scale with 0.1 mg accuracy. However, only one measurement was obtained. All kinds of other details which might be important are provided, for example how long the mixture was vortexed, at what temperature and the physical appearance after evaporation. If this number turns out to be an outlier, one can investigate if a calculation error was the cause by inspecting the linked spreadsheet.
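As an illustration of the kind of calculation check described above, here is a minimal sketch converting the weighed mass into a molar concentration. The 118.5 mg mass and the 0.1 mg balance accuracy come from the text. The 0.200 mL volume is an assumption: it is not stated in this post and was chosen only because it reproduces the reported 3.89 M, so the notebook page and its linked spreadsheet remain the authority. The function name `molarity` is hypothetical.

```python
# Molar mass of vanillin (C8H8O3) from standard atomic weights, in g/mol.
VANILLIN_G_PER_MOL = 152.15

def molarity(mass_mg: float, volume_ml: float,
             molar_mass: float = VANILLIN_G_PER_MOL) -> float:
    """Concentration in mol/L. mg / (g/mol) gives mmol, and mmol/mL equals mol/L."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return (mass_mg / molar_mass) / volume_ml

mass_mg = 118.5     # residue mass reported on the notebook page
volume_ml = 0.200   # ASSUMPTION: volume is not given in this post

conc = molarity(mass_mg, volume_ml)
print(f"{conc:.2f} M")  # prints "3.89 M"

# Relative uncertainty contributed by the 0.1 mg balance accuracy alone.
rel_balance_error = 0.1 / mass_mg
print(f"balance contribution: {rel_balance_error:.2%}")
```

Under these assumptions the balance contributes well under 0.1% relative uncertainty, so if the value turns out to be an outlier the weighing is unlikely to be the cause. Mixing time, temperature and the fact that only one measurement was taken are better places to look.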

However, if a researcher is simply looking for the feasibility of making up a 2 M solution of vanillin in THF for a reaction, the margin of acceptable error is so wide that the answer is almost certainly "yes".

The purpose of Open Notebook Science is to allow immediate communication of scientific results. The value of these results will depend upon the quality of the laboratory notebook and the linked raw data. Publication in peer-reviewed journals is still an extremely important part of this process but it is not an appropriate vehicle for the efficient communication of this type of information.

In fact, one of the motivations for participating in this project is that we will collect data from sufficiently well recorded experiments and publish them in a peer-reviewed journal with the participation of the researchers as co-authors. We aim to build a mathematical model to predict solubility using the results obtained from this project.

Who?

Organizers

Jean-Claude Bradley
Cameron Neylon
Rajarshi Guha (modeling)

How?

Simply request an account on this wiki and start recording experiments using a format similar to UC-EXP207. The organizers will provide feedback in the form of comments in bold and italics directly on the wiki. Hitting the Recent Changes link on the left navigation bar is a good way to keep track of edits.

When?

Now!


Creative Commons Attribution Share-Alike 2.5 License