Friday, April 30, 2010

NMR integration web service expanded

The ONS Challenge has extensively used a web service created by Andrew Lang to automatically calculate solubility from NMR spectra. One of the constraints of the service was that the JCAMP-DX file had to be deposited in a special folder on a server at Drexel.

Andy has now modified the script so that the JCAMP-DX file can be located anywhere on the internet. I have prepared a modified Google Spreadsheet to serve as a template for SAMS (Semi-Automated Measurement of Solubility) calculations. Simply enter the URL of the JCAMP-DX file in the appropriate column, then fill in the ppm ranges and corresponding hydrogen numbers for the solvent and solute, along with the molecular weight and density data. (The predicted density of solids can be found on ChemSpider.) The concentration of the solute will then be automatically calculated based on an assumption of volume additivity.
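To make the volume additivity assumption concrete, here is a minimal Python sketch of how a concentration could be derived from the two integrals. It is only an illustration: the function name, argument names and example numbers are hypothetical, and the formulas actually used in the spreadsheet template may differ in detail.

```python
def molar_concentration(solute_integral, solute_h,
                        solvent_integral, solvent_h,
                        solute_mw, solute_density,
                        solvent_mw, solvent_density):
    """Estimate solute concentration in mol/L from NMR integrals.

    Assumes the volumes of solute and solvent are additive.
    Molecular weights are in g/mol and densities in g/mL.
    """
    # Moles of solute per mole of solvent: each integral is divided by
    # the number of hydrogens contributing to that ppm range.
    mole_ratio = (solute_integral / solute_h) / (solvent_integral / solvent_h)

    # Volume in mL of 1 mol of solvent plus the matching amount of solute.
    solvent_volume_ml = solvent_mw / solvent_density
    solute_volume_ml = mole_ratio * solute_mw / solute_density
    total_volume_l = (solvent_volume_ml + solute_volume_ml) / 1000.0

    return mole_ratio / total_volume_l


# Hypothetical example: a 1 H solute peak against a 3 H solvent peak,
# with placeholder molecular weights and densities.
conc = molar_concentration(0.05, 1, 3.0, 3, 150.0, 1.2, 32.04, 0.7918)
print(f"{conc:.3f} M")
```

Working per mole of solvent keeps the arithmetic simple: the integrals only fix the mole ratio, and the densities are needed solely to turn that ratio into a volume.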

The web service (which handles baseline correction) could be used for any other purpose involving the integration of spectra. Just make a copy of the Google Spreadsheet and modify.
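For readers who want a feel for what integrating a ppm range involves, here is a small Python sketch. The post does not describe how the web service performs its baseline correction, so the straight-line baseline used here is purely an assumption for illustration, and the function name is hypothetical.

```python
def integrate_range(ppm, intensity, low, high):
    """Trapezoidal integral of a spectrum between low and high ppm.

    A straight baseline drawn between the first and last points of the
    selected region is subtracted before integrating (an assumed, very
    simple form of baseline correction).
    """
    points = sorted((x, y) for x, y in zip(ppm, intensity) if low <= x <= high)
    if len(points) < 2:
        return 0.0
    (x0, y0), (x1, y1) = points[0], points[-1]
    if x1 == x0:
        return 0.0
    slope = (y1 - y0) / (x1 - x0)

    area = 0.0
    for (xa, ya), (xb, yb) in zip(points, points[1:]):
        # Heights above the straight baseline at both ends of the segment.
        ha = ya - (y0 + slope * (xa - x0))
        hb = yb - (y0 + slope * (xb - x0))
        area += 0.5 * (ha + hb) * (xb - xa)
    return area


# Hypothetical data: a single triangular peak sitting on a flat offset.
ppm = [0.0, 1.0, 2.0, 3.0, 4.0]
intensity = [5.0, 5.0, 7.0, 5.0, 5.0]
print(integrate_range(ppm, intensity, 1.0, 3.0))
```

Sorting the selected points means the function does not care whether the ppm axis is stored in ascending or descending order.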

Note that the JCAMP-DX files must be in XY format. If your instrument saves spectra in a compressed format they must be converted to XY. The desktop version of Robert Lancashire's JSpecView can be used to carry out the conversion.

This template spreadsheet also features a service in a cell to display the NMR spectrum by simply clicking on the link inside the cell. This is very handy because it obviates the need to create an HTML file which must normally accompany the JCAMP-DX file for viewing. Being able to quickly view a spectrum from a particular row within the Google Spreadsheet makes tracking data provenance very intuitive and errors easy to spot.


Wednesday, April 28, 2010

Reaction Attempts Book Edition 1 and UsefulChem Archive

I am pleased to report that Andrew Lang and I have published the first edition of the Reaction Attempts book. It currently contains most of the Ugi reactions from the UsefulChem project and is associated with an April 27, 2010 snapshot archive of the entire UsefulChem project, including NMR spectra, spreadsheets, images and the entire lab notebook from Wikispaces.


At 582 pages the printing cost from LuLu amounts to $26.28. Not meant to replace electronic searches, it should prove to be a handy reference book for the lab to quickly browse through what was attempted for a given reactant, what the outcome was and the researcher involved.

We are hoping to include reaction attempts from other groups in future editions. More details can be found in the preface, reproduced below:

Reaction Attempts First Edition

Data Source: the UsefulChem project

Introduction

Open Notebook Science (ONS) refers to the practice of making the full contents of a laboratory notebook and all associated raw data files available in near real time.[1] This represents an opportunity for everyone to benefit from work in progress in an open research group. However, in order to make use of the information, it must be easily discoverable. A simple strategy to increase discoverability is redundancy over multiple communication platforms.

In another project - the Open Notebook Science Solubility Challenge[2] - we published non-aqueous solubility data in the form of physical and downloadable (PDF) books.[3] Although it is possible to search the solubility database using web query interfaces, exploration of a Google Spreadsheet, an XML feed, etc.[4], having a physical copy in the laboratory has proved to be very convenient in several instances. A similar format for reactions will also be useful.

The UsefulChem Project

UsefulChem started in 2005 as an organic chemistry Open Notebook Science project with a main goal of discovering new anti-malarial agents that can be prepared by simple and cheap syntheses.[5] Most of the reactions on UsefulChem are Ugi reactions, which involve the mixing of an amine, aldehyde, carboxylic acid and isonitrile in a solvent at room temperature, generally for a few hours to days.[6] The multicomponent design of the Ugi reaction and the simple reaction conditions make it ideal for exploring large virtual libraries and selecting compounds of interest to make.[7]

Isolation of the Ugi products can be immensely simpler, cheaper and readily scalable if they precipitate in pure form from the reaction mixture. To this end, much of the research in the UsefulChem project focuses on reaction conditions that lead to this outcome.[8] This is in fact the origin of the ONS Solubility Challenge discussed above.[9]

The Reaction Attempts Database

In order to look for patterns in the reaction conditions which led to Ugi product precipitation, the CombiUgiResults Google Spreadsheet was set up.[10] Reactions indexed there can be sorted by precipitation outcome, solvent, reactant, concentration, etc. and links to the laboratory notebook pages can be followed for full details. However, this sheet is designed specifically for Ugi reactions and contains columns specifically for the aldehyde, amine, carboxylic acid and isonitrile.

In order to enable the tracking of other types of reactions, the information in the CombiUgiResults sheet was reformatted into two other sheets: ReactionAttempts[11] (containing reagents and reactants) and RXIDsReactionAttempts[12] (containing reaction conditions and results, such as solvent, concentration of limiting reactant, appearance of a precipitate, yield, etc.). The two sheets are connected via the use of a common ReactionID. This format permits the representation of any type of reaction, with an unlimited number of reactants and products.[13]
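The two-sheet layout can be pictured with a short Python sketch that joins rows on the shared ReactionID. Only the ReactionID key comes from the description above; the other column names and the sample rows are hypothetical placeholders, not the actual spreadsheet contents.

```python
from collections import defaultdict

# Hypothetical rows standing in for the ReactionAttempts sheet
# (one row per compound taking part in a reaction).
reaction_attempts = [
    {"ReactionID": 1, "Role": "reactant", "Compound": "compound A"},
    {"ReactionID": 1, "Role": "reactant", "Compound": "compound B"},
    {"ReactionID": 1, "Role": "product", "Compound": "compound C"},
    {"ReactionID": 2, "Role": "reactant", "Compound": "compound A"},
]

# Hypothetical rows standing in for the RXIDsReactionAttempts sheet
# (one row per reaction, holding conditions and results).
rxids = [
    {"ReactionID": 1, "Solvent": "methanol", "Precipitate": True},
    {"ReactionID": 2, "Solvent": "ethanol", "Precipitate": False},
]


def join_by_reaction_id(participants, conditions):
    """Attach every participating compound to its reaction record."""
    grouped = defaultdict(list)
    for row in participants:
        grouped[row["ReactionID"]].append(row)
    return {
        cond["ReactionID"]: {
            "conditions": cond,
            "participants": grouped.get(cond["ReactionID"], []),
        }
        for cond in conditions
    }


reactions = join_by_reaction_id(reaction_attempts, rxids)
for rxid, record in reactions.items():
    names = [p["Compound"] for p in record["participants"]]
    print(rxid, record["conditions"]["Solvent"], names)
```

Because the compounds live in their own rows rather than in fixed columns, a reaction can carry any number of reactants and products without changing the layout.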

By definition, any Open Notebook Science project is a work in progress. The listing of a reaction in this database only means that the researcher attempted or is in the process of attempting it. Whatever the situation, a link to the laboratory notebook page is provided, where the most recent information is available. The philosophy used here is that partial information is always better than no information at all. Thus a researcher investigating the prior use of a particular reactant in a Ugi reaction might find the report that a precipitate was obtained in methanol helpful for designing their own reactions, even if the characterization of the precipitate is still pending. At the very least, knowing that a certain researcher has attempted a similar reaction is enough information for initiating a discussion, which may lead to valuable insights.

Reaction Attempts on ChemSpider

Although SMILES[14] are provided in the spreadsheets, the primary key to identify compounds is the ChemSpider ID (CSID)[15]. This allows us to render molecule images in the book automatically. In the case of the ONS Solubility Challenge book[3], use of the CSID enables a convenient way to calculate various descriptors for displaying values in the book.

In addition, the compounds in the Reaction Attempts database are indexed on ChemSpider as two Data Sources: ReactantsAttemptedReactions and ProductsAttemptedReactions[13]. In this way a substructure search for either reactants or products will identify indexed molecules. Clicking on the Syntheses tab in the ChemSpider record for a selected molecule will then reveal a list of hyperlinks to the relevant laboratory notebook pages.

Organization of the Book

In keeping with the layout of the ONS Solubility Challenge Book, the reactants are listed in alphabetical order. Each entry displays the list of reactions where the reactant was used. This includes a scheme with all reactants and product as well as key metadata: the researcher, reaction type, solvent, limiting reactant concentration, observation of a precipitate, comments and a reference (links to the laboratory notebook page).

In this edition, only Ugi reactions are included. The reaction schemes are laid out in the following order: carboxylic acid, amine, aldehyde and isonitrile. This should allow for easy comparison between schemes within a given record. Reactions where the Ugi product was isolated and characterized are marked with a green check and the percent yield is noted. Since the Ugi products do not have simple common names, they are not included as separate entries. However, all reactions where the synthesis of a specific Ugi product was attempted can be found by looking up the entries for any of the four reactants.

Although this compilation is not exhaustive, it does cover the vast majority of reactions in the UsefulChem project to date. Future editions will include other reactions from UsefulChem and other sources.

Archive

This edition is linked to snapshots taken on 2010-04-27: the UsefulChem data archive, available as a ZIP file[16], on DVD[17] and in an interactive hosted format[18], and the ReactionAttempts (XLS)[19] and RXIDsReactionAttempts (XLS)[20] spreadsheets.

References

1. Open Notebook Science Wikipedia Entry http://en.wikipedia.org/wiki/Open_Notebook_Science
2. Open Notebook Science Solubility Challenge Wiki http://onschallenge.wikispaces.com
3. Bradley, J.-C. First Edition of ONS Solubility Challenge Book UsefulChem Blog (2009)
http://usefulchem.blogspot.com/2009/12/first-edition-of-ons-solubility.html
4. Open Notebook Science Solubility Challenge List of Experiments page http://onschallenge.wikispaces.com/list+of+experiments
5. UsefulChem Wiki http://usefulchem.wikispaces.com
6. Ugi Reaction Wikipedia Entry http://en.wikipedia.org/wiki/Ugi_reaction
7. Dömling, A., & Ugi, I. (2000). Multicomponent Reactions with Isocyanides. Angewandte Chemie International English Edition, 39(18), 3168-3210. http://www3.interscience.wiley.com/journal/73500473/abstract.
8. UsefulChem List of Experiments http://usefulchem.wikispaces.com/All+Reactions
9. Bradley, J.-C. Open Notebook Science Challenge UsefulChem Blog (2008)
http://usefulchem.blogspot.com/2008/09/open-notebook-science-challenge.html
10. CombiUgiResults Google Spreadsheet http://spreadsheets.google.com/ccc?key=plwwufp30hfpUERhse9y5Kw
11. ReactionAttempts Google Spreadsheet
http://spreadsheets.google.com/ccc?key=0Ak1R8T6wt4YQdG9NejNLcDNUMkVBVURGM01TR0NxdXc
12. RXIDsReactionAttempts Google Spreadsheet
http://spreadsheets.google.com/ccc?key=0Ak1R8T6wt4YQdGVENVFMWjdzaGd2REJTTnA4RG5vblE
13. Bradley, J.-C. Reaction Attempts on ChemSpider UsefulChem Blog (2010)
http://usefulchem.blogspot.com/2010/03/reaction-attempts-on-chemspider.html
14. SMILES Wikipedia Entry http://en.wikipedia.org/wiki/Simplified_molecular_input_line_entry_specification
15. ChemSpider Web Site http://www.chemspider.com/
16. UC archive Drexel server (ZIP) http://showme.physics.drexel.edu/usefulchem/archives/usefulchem2010-04-27.zip
17. UC archive on lulu.com (DVD) http://www.lulu.com/product/dvd/usefulchem-archive/10791847
18. UC interactive hosted format http://showme.physics.drexel.edu/usefulchem/archives/usefulchem2010-04-27/All%20Reactions.html
19. Bradley, J.-C.; Lang, A. Reaction Attempts Reactants and Products. UsefulChem. 2010-04-27.
(Archived by WebCite® at http://www.webcitation.org/5pIsFEbT9)
20. Bradley, J.-C.; Lang, A. Reaction Attempts RXIDs. UsefulChem. 2010-04-27.
(Archived by WebCite® at http://www.webcitation.org/5pIs2eh62)


Tuesday, April 20, 2010

ONS Books Wiki

I recently reported on our use of Nature Precedings to archive different editions of the ONS Solubility Challenge book. One of the advantages is that Precedings automatically alerts visitors if more recent editions exist.

However, today I learned that there is a glitch to this system: it is not possible to link individual versions on Precedings to a corresponding book edition on LuLu. That means that if you find yourself on the Nature Precedings entry and want to order the book from LuLu it isn't obvious at all how to do so.

To resolve this issue once and for all I just created a wiki page (ONSbooks.wikispaces.com) to track every edition of the book. This is actually better because I can also provide links to all the available data archives and blog posts corresponding to each edition.

This is also the page where we will keep track of every edition of other Open Notebook Science books. The next one to be published shortly is for the UsefulChem project.


Thursday, April 08, 2010

Scientists Embrace Openness Article in Science Careers

Chelsea Wald just published an article in Science Careers: Scientists Embrace Openness (April 9, 2010). She interviewed several people in the Open Science movement including Jonathan Eisen, Steve Koch, Anthony Salvagno, Carl Boettiger and myself.

The article covers Open Notebook Science, Open Data and associated themes. I think it presents a view of the most commonly discussed advantages and disadvantages very well.

One section was particularly relevant to an issue I recently posted about - (and discussed on FriendFeed):
Open Notebook Science advocates claim that being open may protect a scientist's ideas rather than exposing them to theft. Newton's decision to conceal his findings within an anagram made it harder for him to prove priority over rival Gottfried Leibniz. Open Notebook scientists say all they need to do is point to their open notebooks to show that they had an idea or found a result first.


Tuesday, April 06, 2010

ONS t-shirts from Zazzle

Inspired by Graham Steel, I just received my t-shirt with an Open Notebook Science Logo and a picture of our crystal on the cover of our ONS Solubility Challenge book.

I was going to set up an ONS store but Zazzle does not permit zero royalties (don't see the logic there). But making up t-shirts on Zazzle is super simple - just grab a logo of your choice from the ONSclaims wiki.

Any other pic is your choice - this is the crystal from UCEXP150C


You can also order all kinds of other personalized things, including coffee cups.


Friday, April 02, 2010

Bipolar Electrodeposition of CdS: Scientific Results in Limbo?

There has been a lot of discussion about the fear of getting "scooped" as a reason to be wary of using new scientific publication vehicles.

These conversations can be somewhat frustrating since people don't necessarily use the same definition of that term. Even dictionaries don't use the same language. For example, Dictionary.com has:
to get the better of (other publications, newscasters, etc.) by obtaining and publishing or broadcasting a news item, report, or story first: They scooped all the other dailies with the story of the election fraud.
Wiktionary has:
To learn something, especially something worthy of a news article, before (someone else). The paper across town scooped them on the City Hall scandal.
Depending on the definition used, one could argue that in the story I'm going to tell I got scooped or I did the scooping. Some people use the term to imply that a malicious act has taken place. The classic scenario is that one would blog about their research and a nefarious individual would appropriate their results and submit as their own for publication in a peer reviewed journal.

That isn't scooping - it is fraud - and I want to be clear that this is not what I am suggesting happened here.

Two months ago I was asked to review an article for the ACS journal Langmuir. Before 2005 one of my main research areas was bipolar electrodeposition and so I still get asked periodically to review papers in that field.

Not only was this paper in that area but it reported on exactly the same experimental design we had previously reported: the bipolar electrodeposition of cadmium sulfide. The solvent, reagents and substrate were different but it was the same material made by the same process.

Although 2 of our papers were listed in the references the key report was not. I noted this in my review but I was surprised to find that the paper was published without that correction. I contacted the editor of Langmuir to find out what happened. I thought perhaps the authors disagreed with some technical issue in our report.

But what actually happened is that the authors requested that the reference be included in the supplementary data section instead of the regular reference section of the article because it was not a peer-reviewed article. The editor thought that was a reasonable request and complied.

I was quite surprised by this because Langmuir - or ACS journals in general - do not have a formal policy on requiring references to be peer-reviewed. In fact, a quick search for "unpublished results" on Langmuir reveals many articles which use that as an acceptable reference.

I could understand not wanting to cite a blog post with unsubstantiated claims but the document in question is very thorough - it includes a systematic review of prior art, detailed experimental description and characterization data.

This is actually an example of a "SMIRP Knowledge Product". It is a publication device that I used to make public single experimental results from work that was recorded by my research group in the SMIRP Knowledge Management System we used at the time as a laboratory notebook.

The system was built on interlinked modules designed to produce "Knowledge Products" based on a combination of manual and automated workflows. For example, the module generating reviews of prior art was based on "Knowledge Filters" uncovering the novelty of the experiment in question by filtering precedents for relevant aspects.

In the case of "Bipolar Electrodeposition of Cadmium Sulfide onto a Tip of a Carbon Nanotube", the relevant knowledge filters were "Bipolar Electrodeposition", "Electrodeposition onto Carbon Nanotubes" and "Electrodeposition Approaches to Synthesize Cadmium Sulfide". Other modules generated the experimental description, results, discussion, conclusion and reference sections. In this way, not only could the experiment be fully documented but its context within the field could be extremely well defined in a systematic way.

With a workflow to create these knowledge products we still needed a way to communicate them. Back in 2003 options were far more limited than they are now. But luckily (or so we thought) at this time Elsevier was running the Chemistry Preprint Server. They offered a place to upload documents such as these and provided a way of citing them. We used the recommended citation format aggressively, including in peer-reviewed articles such as this one from Springer.

However, attempting to access these documents today using the official links gives this as a result:
In what is probably one of the worst scientific publisher PR moves in recent memory, Elsevier broke all the hyperlinks they told their authors to use for citations. If you do some research you will find that the documents are still available from http://www.sciencedirect.com/preprintarchive but you have to register to even perform a search to find them! This requirement removes them from indexing by Google. Coupled with the broken citation links these documents are now very far removed from likely discovery.

The story would have ended there were it not for redundancy. I also uploaded copies to Drexel's institutional repository (DSpace), which are happily very well indexed by Google - and perhaps more critically - by Google Scholar. I had not fully appreciated the value of institutional repositories until I noticed that they are treated by some important databases as collections of scholarly works.

So what are the lessons for all the stakeholders?

For those who have scientific results that can be published as articles and MUST be published in ACS journals - send your manuscripts in. If you post them on your institutional repository first they may end up in limbo - they DO qualify as publications, preventing you from submitting them as manuscripts to ACS journals - and they may NOT qualify as publications when you try to cite them in ACS journals.

But what about scientific results that cannot be published as manuscripts? The Knowledge Products are unlikely to be accepted by regular journals for several reasons. First, they communicate only a single experimental result. Articles generally require narratives. Second, if some of them do get published in traditional journals, there will be copyright conflicts. The Knowledge Filters for the review of prior art will be identical for similar experiments. For example, the Knowledge Product for the "Bipolar Electrodeposition of CdS on one Tip of a Carbon Nanotube" will have identical prior art to the "Bipolar Electrodeposition of Cd on one Tip of a Carbon Nanotube" except for the section on the electrodeposition of Cd or CdS. And no - I don't think it is a good use of my time to move words around for every document to get around copyright issues.

Some of the Knowledge Products were incorporated into full articles when it made sense. But many, including the one under discussion here, were not. So publishing this work as part of a full article was never even an option. There are so many scientific results like this that fall into that kind of limbo. Even today there are no really good publication vehicles for these types of results - besides institutional repositories. PLoS ONE might come to mind as an option but I don't think it fits their mandate to publish single experiments like this. And if they did it would be extremely expensive if they did not waive author fees every time. ChemSpider Synthetic Pages and similar initiatives might work for organic chemistry but this is materials science.

These difficulties, accumulated over the years, are really the main motivation behind our migration away from a login-based system like SMIRP and our adoption of Open Notebook Science based on wikis and blogs, which are very efficiently indexed in real time by Google and thus easily discoverable without additional formatting work.

For publishers and authors, do you really think it is in your best interest to have a statement in the introduction about prior art say "To our knowledge, reports of bipolar electrodeposition of compounds have not been previously published." when a simple Google search shows that is not the case for the compound you are electrodepositing? I suppose the argument is that the term "published" is used with the technical interpretation of being "published under peer-review". It would have been better to at least make that explicit to avoid confusion. But the bottom line is that someone wanting to perform bipolar electrodeposition of cadmium sulfide will quickly find both reports and will learn two ways of doing it.

Thursday, April 01, 2010

Beer Chemistry Quiz on ChemTiles

During the Chemical Information Retrieval course I taught in the Fall of 2009, Alex Bilinski did a project on the chemistry of beer. He created a set of images with information that is either true or false in any context.

I just added these images of beer chemistry to the ChemTiles game, which I am actively using in my current Organic Chemistry I (CHEM241) course. Andrew Lang has made it very easy to add content to the game by simply uploading to a Flickr group. The category is determined by the Flickr tag.

I won't be testing my organic students on beer but they might find it fun to play that category. Alex wrote a fascinating report "Beer Flavor Compounds and Detection Methods" that can be used as a study guide for the quiz.

My students are currently competing on the topics of Lewis structures, hybridization, Newman projections and nomenclature. They simply need to sign in by entering "contest1" for the group name. The student in my class with the highest score for the contest1 group at the end of class (10:50 AM) on April 9, 2010 will win an organic chemistry textbook. As I have done in previous classes I'll run a few of these contests over the term with increasing amounts of course content.

It turns out that the ChemTiles game is very convenient to play on a Droid phone (and presumably on an iPhone although I have not checked that yet). For many students this might be a preferred way to review material before tests when on the move. I'll find out at the end of the term.


Creative Commons Attribution Share-Alike 2.5 License