Saturday, December 12, 2009

First Edition of ONS Solubility Challenge Book

Andrew Lang and I have been working on a book version of the Open Notebook Science Solubility Challenge database. The timing is good since we just awarded the last ONS Challenge Submeta award this month. All of the students, the judges and the educational partner are included as co-authors, and the book includes a biography and picture of each of them.
Jean-Claude Bradley, Associate Professor of Chemistry at Drexel University
Cameron Neylon, Senior Scientist at the ISIS Pulsed Neutron Source, Rutherford Appleton Laboratory and Lecturer in Chemical Biology at the School of Chemistry at the University of Southampton
Rajarshi Guha, Research Scientist at the NIH Chemical Genomics Center
Antony Williams, Vice President of Strategic Development, ChemSpider at the Royal Society of Chemistry
Bill Hooker, Postdoctoral Researcher in Molecular Biology
Andrew Lang, Professor of Mathematics at Oral Roberts University
Brent Friesen, Associate Professor of Chemistry at Dominican University
and
Tim Bohinski, David Bulger, Matthew Federici, Jenny Hale, Jenna Mancinelli, Khalid Mirza, Marshall Moritz, Daniel Rein, Cedric Tchakounte, and Hai Truong
We selected LuLu as a convenient mechanism to distribute copies. This 6 x 9 inch black and white softcover edition is available for $5.96, which just covers the printing and shipping charges. Other formats are possible - such as a larger hardcover in color - but these are much more expensive. We thought it would be good to start with the most affordable version and look at other options later. The electronic version of the book is available for free on LuLu.

We were inspired by the style of the solubility book published by Atherton Seidell in 1919, freely available on Google Books. The compound entries are listed in alphabetical order, with tables of compound data and solubilities. We included data that we found to be useful for practical applications, including predicted density, room temperature phase and the solubility in molarity, mole fraction and g/100g solvent. References link to lab notebook pages or literature references.

Andy found a way to create the fully formatted book in an almost completely automated way, pulling the data directly from the Solubilities Summary and other Google spreadsheets and querying ChemSpider. The preface and biographies of the students, judges and educational partner are also automatically pulled in from Google Docs. With this system in place, it will be straightforward to publish future editions with the most updated information frequently.

This was also a good opportunity to make use of the WebCite service. It enables us to link the book to a frozen version of the Solubilities Summary sheet archived as an Excel spreadsheet. This format retains all the formulas and hyperlinks in the original Google Spreadsheet.

The preface further explains the scope of the book and project:

The Open Notebook Science Solubility Challenge

Solubility is an important consideration for many chemistry applications. Synthetic chemists usually use a solvent to perform reactions and knowledge of the solubility of the starting materials or products can be very useful to pick an appropriate solvent. Analytical chemists can use solubility to design separation techniques and factor in dynamic range considerations. Physical chemists can create and evaluate their models of how molecules interact in the solubilization and precipitation processes.

Solubility data can be obtained from a variety of online and offline sources. As with all chemical data, it can be a challenge to evaluate reported measurements. Some databases offer no references while others provide citations to peer reviewed journal articles. Given the choice, more weight is generally given to the latter. This is reasonable in most cases because more information about the purity of compounds and the methods used are available in peer-reviewed articles.

However, journal articles do not generally report how a specific measurement was obtained. General methods are provided but the raw data for a specific measurement are typically not published. Peer review is not intended to validate individual measurements - its function is to ensure that the authors made appropriate conclusions based on their processed datasets and the state of knowledge in the field.

The Open Notebook Science Challenge was initiated in the fall of 2008 as the result of a discussion on a train in the UK between Jean-Claude Bradley and Cameron Neylon.[1,2] The concept was very simple: create a crowdsourcing opportunity for the chemistry community to contribute solubility measurements under Open Notebook Science conditions. This method of publication entails providing immediate public access to the chemist's laboratory notebook, as well as all raw data used to compute the measurements.[3,4]

On Sept 3, 2008 the first ONSC measurements were recorded by Bradley and Neylon at the University of Southampton in Neylon's laboratory.[5] The project was soon sponsored by Submeta, offering ten $500 awards for students in the US or the UK who best recorded how they performed their experiments.[6] Furthermore, the first 3 winners also received one year subscriptions to Nature magazine, thanks to a sponsorship from the Nature Publishing Group.[7] Sigma-Aldrich supported the contest by donating chemicals upon request.[8]

Students were evaluated by a group of judges who convened once a month to deliberate the next award. Judges also provided feedback to the students by commenting on their lab notebook pages directly on the wiki. Their expertise ranged from chemistry to mathematics, spectroscopy and molecular biology.

Techniques

Participants in the ONS Challenge were not required to use a specific method to measure solubility - although they were required to properly document their experiments and analyses. Due to its simplicity, most measurements in the past year were made using the SAMS NMR technique, which requires no volume measurements or calibration curves.[9] Two assumptions are made with this method. The first is that the volumes of solute and solvent are additive, with the error becoming negligible at low solubility values. The second is that NMR integration values are proportional to the amounts of solvent and solute. Some deviations from this have been observed for default NMR parameters, and in later experiments long relaxation delays were introduced into the protocol (D1 = 50s).[10]
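To make the two assumptions concrete, here is a minimal sketch of how a molarity could be derived from a pair of NMR integrals. It is not the SAMS code itself: the function name, its parameters and the example numbers are hypothetical, chosen only to illustrate the additive-volume and proportional-integration assumptions described above.

```python
def molarity_from_nmr(solute_integral, solute_protons,
                      solvent_integral, solvent_protons,
                      solute_mw, solute_density,
                      solvent_mw, solvent_density):
    """Estimate solute molarity (mol/L) from two NMR integrals.

    Molecular weights are in g/mol and densities in g/mL.
    Hypothetical helper for illustration only.
    """
    # Assumption 2: each integral is proportional to (moles x protons in the signal),
    # so dividing by the proton count gives a relative mole amount.
    mole_ratio = (solute_integral / solute_protons) / (solvent_integral / solvent_protons)

    # Assumption 1: volumes are additive. Work on the basis of 1 mol of solvent.
    solvent_volume_ml = solvent_mw / solvent_density
    solute_volume_ml = mole_ratio * solute_mw / solute_density
    total_volume_l = (solvent_volume_ml + solute_volume_ml) / 1000.0

    return mole_ratio / total_volume_l


# Illustrative numbers: a made-up solute (MW 150, density 1.2 g/mL) in methanol
# (MW 32.04, density 0.7914 g/mL), comparing a 1H solute peak to the 3H methyl peak.
estimate = molarity_from_nmr(1.0, 1, 100.0, 3, 150.0, 1.2, 32.04, 0.7914)
print(f"Estimated molarity: {estimate:.3f} M")
```

Note that the solute density only enters through the additive-volume term, which is why the error from that assumption shrinks as the solubility gets smaller.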

Data Curation

Since an Open Notebook approach is used in this work, those interested in the validity of the measurements can assess the methods used - both for the preparation of saturated solutions and the raw data from the measurements. Over time, values in the database are likely to improve and possibly some errors may be uncovered and corrected. However, on the whole, we feel that the values provided in this work should be of use to chemists trying to gain an appreciation of solubility for most applications. This is especially the case for values that are not obtainable from any other source.

When clearly erroneous data points are discovered, they are flagged in the database as "DONOTUSE". This way interfaces with the dataset can ignore these values while allowing anyone to investigate why the data points were flagged. This might happen when early experiments did not allow for sufficient mixing or when NMR D1 relaxation delays were not long enough to fully integrate peaks of interest. Out of 681 reported measurements, 51 are currently marked in this way. A shared Google Spreadsheet is used to collect and curate the dataset. This allows easy data entry while providing a simple way to interrogate the database for visualization applications via the Google API.[11]
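As a rough illustration of how an interface might honor the flag, the sketch below filters a small in-memory list of rows. The column names ("solute", "solvent", "molarity", "flag") and the sample values are assumptions made for the example; they are not the actual layout of the Solubilities Summary sheet.

```python
# Hypothetical rows standing in for an export of the curated spreadsheet.
measurements = [
    {"solute": "compound A", "solvent": "methanol", "molarity": 0.52, "flag": ""},
    {"solute": "compound A", "solvent": "methanol", "molarity": 3.90, "flag": "DONOTUSE"},
    {"solute": "compound B", "solvent": "ethanol", "molarity": 1.10, "flag": ""},
]


def usable_rows(rows, flag_column="flag"):
    """Return the rows that are not marked DONOTUSE.

    Flagged rows are skipped rather than deleted, so the original list
    still holds them for anyone who wants to see why they were flagged.
    """
    return [
        row for row in rows
        if str(row.get(flag_column, "")).strip().upper() != "DONOTUSE"
    ]


kept = usable_rows(measurements)
print(f"{len(kept)} of {len(measurements)} measurements kept")
```

Keeping the flagged rows in the source data, and filtering only at read time, is what lets the dataset stay fully open to inspection.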

Literature data and format conversions

An additional 400 solubility measurements from the literature are included in the database. These generally correspond to compounds that are structurally identical or similar to the compounds measured by the ONS Challenge participants. These values are averaged in with the values from the participants, with appropriate references provided. In order to compare values, conversions from mole fraction or g solute/100g solvent to molarity were made by assuming that the volumes are additive and, in most cases, by taking the density of the solute from the predicted values in ChemSpider.[12]
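The conversions described above can be sketched as follows. This is an illustration under the stated additive-volume assumption, not the formulas used in the spreadsheet; the function names and the example values are hypothetical.

```python
def molarity_from_mole_fraction(x_solute, solute_mw, solute_density,
                                solvent_mw, solvent_density):
    """Convert a solute mole fraction to molarity (mol/L), volumes assumed additive.

    Molecular weights in g/mol, densities in g/mL.
    """
    # Basis: 1 mol of solution = x mol solute + (1 - x) mol solvent.
    volume_ml = (x_solute * solute_mw / solute_density
                 + (1.0 - x_solute) * solvent_mw / solvent_density)
    return x_solute / (volume_ml / 1000.0)


def molarity_from_g_per_100g(grams_solute, solute_mw, solute_density,
                             solvent_density):
    """Convert g solute per 100 g solvent to molarity (mol/L), volumes assumed additive."""
    # Basis: 100 g of solvent plus the stated mass of solute.
    moles_solute = grams_solute / solute_mw
    volume_ml = grams_solute / solute_density + 100.0 / solvent_density
    return moles_solute / (volume_ml / 1000.0)


# Illustrative values only: a made-up solute (MW 150, density 1.2 g/mL) in
# methanol (MW 32.04, density 0.7914 g/mL).
print(molarity_from_mole_fraction(0.05, 150.0, 1.2, 32.04, 0.7914))
print(molarity_from_g_per_100g(10.0, 150.0, 1.2, 0.7914))
```

Both functions reduce to the same quantity, moles of solute divided by the summed volumes of the pure components, so a value expressed in either unit maps to the same molarity.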

For the convenience of chemists with diverse applications, all three formats are provided. For the cases where solutes are miscible with the solvent, the molarity reported is simply the solute's density expressed in mol/L, that is, the molar concentration of the pure solute. The practical interpretation of this is that solutions of any molarity below that value can be prepared.
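A short sketch of the miscible case: the upper limit is the molar concentration of the pure liquid solute, obtained by dividing its density by its molecular weight. The function names are hypothetical, and the example uses methanol's density and molecular weight purely as illustrative inputs.

```python
def pure_liquid_molarity(density_g_per_ml, mw_g_per_mol):
    """Molar concentration (mol/L) of a pure liquid: its density in molar units."""
    return 1000.0 * density_g_per_ml / mw_g_per_mol


def can_prepare(target_molarity, density_g_per_ml, mw_g_per_mol):
    """For a miscible solute, any molarity below the pure-liquid value is reachable."""
    return target_molarity < pure_liquid_molarity(density_g_per_ml, mw_g_per_mol)


# Illustrative inputs: density 0.7914 g/mL and molecular weight 32.04 g/mol.
limit = pure_liquid_molarity(0.7914, 32.04)
print(f"Upper limit: {limit:.2f} M")
print(can_prepare(10.0, 0.7914, 32.04))
```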

In the process of converting units and averaging heterogeneous data sources, no attempt has been made to track significant figures. Those interested in any information about the precision of measurements should consult each individual data source. This may not be an easy task for measurements only carried out once and where factors such as the quality of spectral peaks and baselines are not optimal.

This collection will be most valuable for those who do not require highly precise measurements for their applications. For example, synthetic chemists can easily use rough estimates of solubility to select appropriate solvents for a reaction. In any case, one would be wise to consider all measurements as provisional, regardless of the source. As more data are collected, subsequent editions of this book will adjust values accordingly.

Searching the database

The values in this database can be accessed and filtered in various ways. More information is available at the ONS Challenge wiki[13] and Chapter 16 of the book "Beautiful Data".[14]

Database version

Archived as Excel Spreadsheet by WebCite on December 11, 2009.[15]

References

[1] Bradley, JC Open Notebook Science Challenge, UsefulChem blog (2008) http://usefulchem.blogspot.com/2008/09/open-notebook-science-challenge.html
[2] Open Notebook Science Challenge Wikipedia entry http://en.wikipedia.org/wiki/Open_Notebook_Science_Challenge
[3] Bradley, JC Open Notebook Science, Drexel CoAS E-Learning Blog (2006) http://drexel-coas-elearning.blogspot.com/2006/09/open-notebook-science.html
[4] Open Notebook Science Wikipedia entry http://en.wikipedia.org/wiki/Open_Notebook_Science
[5] Bradley, JC; Neylon, C UsefulChem Experiment 207 http://usefulchem.wikispaces.com/Exp207
[6] Bradley, JC Submeta Open Notebook Science Awards, UsefulChem Blog (2008) http://usefulchem.blogspot.com/2008/11/submeta-open-notebook-science-awards.html
[7] Bradley, JC Nature Sponsors Open Notebook Science, UsefulChem Blog (2008) http://usefulchem.blogspot.com/2008/11/nature-sponsors-open-notebook-science.html
[8] Bradley, JC Sigma-Aldrich First Official Sponsor of Open Notebook Science Challenge, UsefulChem Blog (2008) http://usefulchem.blogspot.com/2008/09/sigma-aldrich-first-official-sponsor-of.html
[9] Bradley, JC Semi-Automated Measurement of Solubility, UsefulChem Blog (2009) http://usefulchem.blogspot.com/2009/03/semi-automated-measurement-of.html
[10] Bradley, JC NMR Integration Progress for Solubility Measurements, UsefulChem Blog (2009) http://usefulchem.blogspot.com/2009/06/nmr-integration-progress-for-solubility.html
[11] Bradley, JC Interactive Visualization of ONS Solubility Data, UsefulChem Blog (2009) http://usefulchem.blogspot.com/2009/01/interactive-visualization-of-ons.html
[12] ChemSpider database http://www.chemspider.com
[13] ONS Challenge List of Experiments Page http://onschallenge.wikispaces.com/list+of+experiments
[14] Bradley, J.-C.; Guha, R.; Lang, A.S.I.D.; Lindenbaum, P; Neylon, C.; Williams, A.J. & Willighagen, E. Chapter 16: Beautifying Data in the Real World from Beautiful Data. O'Reilly Media, Eds: Segaran, T. & Hammerbacher, J. (2009)
[15] Bradley, Jean-Claude; Lang Andrew. Solubilities Summary Sheet. Open Notebook Science Challenge. 2009-12-11. URL:http://spreadsheets.google.com/pub?key=plwwufp30hfq0udnEmRD1aQ&output=xls. Accessed: 2009-12-11. (Archived by WebCite® at http://www.webcitation.org/5lx5ry3BV)



Tuesday, December 01, 2009

Hai Truong is Dec09 Submeta ONS Award Winner

Hai Truong, working under the supervision of Jean-Claude Bradley at Drexel University, is the December 2009 Submeta Open Notebook Science Challenge Award winner. He wins a cash prize from Submeta.

Hai mainly collaborated with Khalid Mirza to try to understand co-solute effects for Ugi products in benzene. See his experiments here:
http://onschallenge.wikispaces.com/list+of+experiments

This was the final Submeta ONS Award for 2008-9. We would like to thank all the sponsors - Submeta, Nature Publishing Group and Sigma-Aldrich - for making this project a reality. A summary of the results from the past year will be published shortly.

For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/submetaawards08


Monday, August 03, 2009

Daniel Rein is Aug09 Submeta ONS Award Winner

Daniel Rein, working under the supervision of Jean-Claude Bradley at Drexel University, is the August 2009 Submeta Open Notebook Science Challenge Award winner. He wins a cash prize from Submeta.

Daniel used both NMR and the sequential precipitation technique to obtain solubility data. See his experiments here:
http://onschallenge.wikispaces.com/list+of+experiments

Two more Submeta ONS Awards will be made during 2009. Submissions from students in the US and the UK are still welcome.
For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/submetaawards08


Monday, June 01, 2009

Matthew Federici is June09 Submeta ONS Award Winner

Matthew Federici, a mechanical engineering student working under the supervision of Jean-Claude Bradley at Drexel University, is the June 2009 Submeta Open Notebook Science Challenge Award winner. He wins a cash prize from Submeta.

Matt has applied an NMR technique to measure solubility. See his experiments here:
http://onschallenge.wikispaces.com/list+of+experiments

Four more Submeta ONS Awards will be made during 2009. Submissions from students in the US and the UK are still welcome.
For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/submetaawards08


Sunday, March 01, 2009

Cedric Tchakounte is March09 Submeta ONS Award Winner

Cedric Tchakounte, a Biological Sciences and Biotechnology undergraduate student working under the supervision of Jean-Claude Bradley at Drexel University, is the March 2009 Submeta Open Notebook Science Challenge Award winner. He wins a cash prize from Submeta.

Cedric is focusing on NMR techniques to measure solubility. He has also done several experiments to verify the miscibility of liquid solutes in methanol. See his experiments here:
http://onschallenge.wikispaces.com/list+of+experiments

Six more Submeta ONS Awards will be made during 2009. Submissions from students in the US and the UK are still welcome.
For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/submetaawards08


Wednesday, February 25, 2009

ONS Solubility Challenge on Sigma-Aldrich tech sheet

The Open Notebook Science Solubility Challenge has made it onto the Solubility Information for Products technical sheet of Sigma-Aldrich:
"Sigma-Aldrich are currently collaborating with Nature and Submeta in an Open Notebook Science Challenge. This is a project aimed at facilitating the generation of solubility information for chemicals. Please follow this link 'Open Notebook Science Challenge - Solubility' for more information on this project."
This is a great example of how synergies between academia, industry, publishers and foundations can take place quickly and in the open.

There is some other interesting information on that page - notably a table translating written descriptions of solubility terminology - like "freely soluble" - into numbers.


Monday, February 02, 2009

David Bulger is Feb09 Submeta ONS Award Winner

David Bulger, a chemistry undergraduate student working under the supervision of Robert Stewart at Oral Roberts University, is the February 2009 Submeta Open Notebook Science Challenge Award winner.

In addition to a cash prize from Submeta, he will receive a one-year subscription to Nature magazine. David is focusing on NMR techniques to measure solubility. See his experiments here:
http://onschallenge.wikispaces.com/list+of+experiments

Seven more Submeta ONS Awards will be made during 2009. Submissions from students in the US and the UK are still welcome.
For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/submetaawards08


Sunday, January 04, 2009

Jan 2009 Submeta Open Notebook Science Award Winner Announced

Khalid Mirza, a Ph.D. student with Jean-Claude Bradley at Drexel University, is the January 2009 winner of the Submeta Open Notebook Science Challenge Award, which includes a one-year subscription to Nature magazine and a cash prize. Khalid's contributions included the measurement of non-aqueous solubilities using both evaporation and UV-vis techniques:
http://onschallenge.wikispaces.com/list+of+experiments

Eight more Submeta ONS Awards will be made during 2009. Submissions from students in the US and the UK are still welcome.
For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/submetaawards08


Wednesday, November 26, 2008

First Submeta Open Notebook Science Award Winner

Jenny Hale, a Ph.D. student with Cameron Neylon at the University of Southampton, is the first of ten recipients of the Open Notebook Science Challenge Awards for December 2008. Open to students from the US and the UK who report their solubility measurements publicly as they work, the ONS Challenge Awards consist of a cash prize from Submeta and a one-year subscription to Nature magazine. Jean-Claude Bradley, Associate Professor of Chemistry at Drexel University, manages the award.

For more information see:
http://onschallenge.wikispaces.com
http://onschallenge.wikispaces.com/submetaawards08


Thursday, November 20, 2008

Nature Sponsors Open Notebook Science Challenge

I'm pleased to announce that the Nature Publishing Group will provide one-year subscriptions to the journal Nature to the first three Submeta Open Notebook Science Award winners. The first award is expected to be announced December 1, 2008. The Open Notebook Science Challenge is an open call to crowdsource solubility measurements in non-aqueous solvents. Participating students from the US and the UK who meet eligibility criteria are welcome to apply for one of ten Submeta ONS Awards.


Tuesday, November 04, 2008

Submeta Open Notebook Science Awards!

I am proud to announce that Submeta is sponsoring TEN $500 (USD) Open Notebook Science awards as part of the ONS challenge to measure the solubility of compounds in non-aqueous solvents. Submeta follows Sigma-Aldrich as a sponsor for the project. Drexel University is managing the award distribution.

The idea was to make this available as widely as possible. However, because of legal issues, we were not able to make this open to everyone - only students from the US and UK are eligible. (see here for the complete rules).

I will be posting the list of judges shortly on the ONSchallenge wiki. They will be judging the process as much as the product and hopefully enable us to have a true ongoing peer-reviewed Open Notebook in chemistry.

This all came about from a FriendFeed conversation started by Bill Hooker a few months ago.


Creative Commons Attribution Share-Alike 2.5 License