Thursday, June 28, 2007

CombiUgi Web Service

Rajarshi's CombiUgi Web service is now available to generate up to 2 million SMILES of Ugi products.

Just dump the lists of the starting materials in SMILES format into the appropriate boxes, select clean SMILES and hit Generate Products.

It turns out that there was a little glitch in the results from the previous library of 68,000 compounds. All of the compounds required 2-naphthyl isocyanide. Although this compound is indeed in the Sigma-Aldrich catalogue, it is not in stock and will take 4-6 weeks to get here.

In trying to shorten the iterations of the science loop, we've removed that compound and only included those that were available for next-day shipping. We then added some more starting materials to give a 500,000 compound library.
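As a rough illustration of how the library size follows from the reagent lists, here is a minimal Python sketch that enumerates every amine/aldehyde/acid/isocyanide combination and drops a reagent that is not in stock. It does not build the Ugi product SMILES (that is what Rajarshi's service does). The function names and the tiny reagent lists are assumptions made up for this example, not the lists we actually used.

```python
from itertools import product

def ugi_combinations(amines, aldehydes, acids, isocyanides, excluded=frozenset()):
    """Return every (amine, aldehyde, acid, isocyanide) tuple,
    skipping any reagent whose SMILES appears in `excluded`."""
    pools = [
        [smiles for smiles in pool if smiles not in excluded]
        for pool in (amines, aldehydes, acids, isocyanides)
    ]
    return list(product(*pools))

# Illustrative reagent lists only (hypothetical, two per component).
amines = ["NCc1ccco1", "NCc1ccccc1"]
aldehydes = ["O=Cc1ccccc1", "CC=O"]
acids = ["CC(=O)O", "OC(=O)c1ccccc1"]
naphthyl_isocyanide = "[C-]#[N+]c1ccc2ccccc2c1"
isocyanides = ["[C-]#[N+]C(C)(C)C", naphthyl_isocyanide]

full_library = ugi_combinations(amines, aldehydes, acids, isocyanides)
in_stock_library = ugi_combinations(
    amines, aldehydes, acids, isocyanides, excluded={naphthyl_isocyanide}
)
print(len(full_library), len(in_stock_library))
```

Because the library size is the product of the four list lengths, removing a single isocyanide removes every compound that depended on it, and adding a few reagents to each list grows the library quickly.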

Rajarshi will run this larger set through his algorithm predicting anti-tumor activity. If it turns out that none of the new compounds are predicted to be more active than the naphthyl derivatives, we'll have to place the order and wait. Otherwise, we will just move on.

Now things are getting interesting. This 500,000 compound library is open and available to anyone to run through their own filtering and ranking algorithms. For example, docking against malarial enoyl reductase would be useful to us, but so would docking against any of the other proteins related to disease processes (e.g. as listed by Find-A-Drug).

Also, because Rajarshi's web service is open, anybody can create their own Ugi product libraries.

As I've said before, we would be willing to collaborate with bioinformatics groups and synthesize some of these products for any reasonably justifiable end, as long as everyone is willing to work openly.

Saturday, June 23, 2007

C&E News Second Life Article

Sarah Everts from Chemical and Engineering News has just published an article about chemistry activities in Second Life. Drexel Island got a mention:

My avatar was then deposited at a place in Second Life called Orientation Island. As I walked my avatar into a geodesic information dome, I happened to notice the "Fly" button. Intrigued, I wasted no time pressing it—and I shot up into the air, hitting the ceiling of the information dome like a clumsy goth-bird. It was around this time that Horace Moody, the avatar of a real-life chemist at Drexel University named Jean-Claude Bradley, came to the rescue and offered to teleport me to Drexel Island. Horace has been experimenting with Second Life as a way to teach undergraduate organic chemistry, a topic he says can definitely benefit from 3-D visualization. Several of his students have met on Drexel Island to challenge each other's organic know-how by touching an obelisk, which then flashes a sequence of quiz questions on Newman projections and Lewis dot structures.

I think that there are some terrific opportunities in Second Life for people with an interest in chemistry at all levels to explore and contribute. (see here for a recent example on molecular docking)

It is certainly a good way to meet curious and smart people.

InChIMatic, ChemSpider and UsefulChem

Rich Apodaca wrote about using his InChIMatic service to track molecules in UsefulChem.

Because we use InChIs in blog posts and HTML pages generated automatically from the molecules blog, doing an InChI search in Google is a pretty good way to find molecules of interest to UsefulChem. However, Rich makes the valid point that these pages do not always point to the experiments where they are used.
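For anyone who wants to script that kind of lookup, here is a small Python sketch that turns an InChI into a Google search link. The function name is hypothetical, and the furan InChI is just an example string. Wrapping the InChI in double quotes asks the search engine to match it as an exact phrase.

```python
from urllib.parse import urlencode

def inchi_search_url(inchi, base="https://www.google.com/search"):
    """Build a search URL that looks for the InChI as an exact phrase."""
    # urlencode percent-escapes the quotes, '=' and '/' characters.
    return base + "?" + urlencode({"q": f'"{inchi}"'})

furan_inchi = "InChI=1/C4H4O/c1-2-4-5-3-1/h1-4H"  # example molecule only
url = inchi_search_url(furan_inchi)
print(url)
```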

I was aware of the limitations of using a blog to track molecules when I set it up. Because we were limiting ourselves to a few hundred molecules, the blog served its purpose much as I expected it would.

But now, as we move to the manipulation of tens of thousands (and soon millions) of molecules, we need to transition to a true database.

I've been working with Tony Williams to use ChemSpider for this purpose. UsefulChem has been a supplier in ChemSpider for several weeks and most of our molecules from the molecules blog have been indexed. In the next few days the first 68,000 molecules from the CombiUgi project should be incorporated as well.

This effectively moves the indexing and searching burden to a free hosted service that is designed to handle it. This is the same logic that I used when choosing Wikispaces to act as our group laboratory notebook.

Let's take a look at an example of how this can work.

Click on the Search button of ChemSpider then hit "Advanced". Under "Search by Data Source" select UsefulChem. Scroll to the top of the page and select "Search by Structure" then "Draw". Select "Substructure" then draw a furan ring.

You should get about 10 hits.

Click on the 5-methylfurfurylamine to see its record in ChemSpider.

This record can be curated or annotated. I'm hoping we can use this interface to annotate with links to spectra, references, etc. But for now just click on its InChI and you'll get a Google search finding that molecule on UsefulChem blogs, Chemical Blogspace and an experiment page (EXP086) where it was used.

In order for that to work well, we need the InChIs to be generated for every molecule in every experiment. We've been putting the InChIs in the Tags section of each experiment page, and making sure that gets done quickly is now the highest priority on our Experimental Format page.

Note that these InChIs could be scraped fairly easily from every UsefulChem experiment because of the standard format for specifying the experiment page.
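To give an idea of what such scraping might look like, here is a minimal Python sketch that pulls InChI strings out of the text of a page with a regular expression. The sample text is made up for the example and is not a real experiment page, and the pattern is an assumption that each InChI is written as one unbroken token starting with "InChI=1".

```python
import re

# An InChI starts with "InChI=1" (optionally "1S") and runs until
# whitespace, a quote or an angle bracket.
INCHI_PATTERN = re.compile(r'InChI=1S?/[^\s"<>]+')

def extract_inchis(page_text):
    """Return the distinct InChI strings found in page_text, in order of appearance."""
    # Strip trailing sentence punctuation picked up from surrounding prose.
    matches = [m.rstrip(".,;") for m in INCHI_PATTERN.findall(page_text)]
    return list(dict.fromkeys(matches))

# Hypothetical Tags section, for illustration only.
sample_page = (
    "Tags: InChI=1/C4H4O/c1-2-4-5-3-1/h1-4H "
    "InChI=1/CH4/h1H4, and again InChI=1/CH4/h1H4."
)
inchis = extract_inchis(sample_page)
print(inchis)
```

Keeping only the first occurrence of each InChI makes the output usable as a list of the molecules in an experiment, which could then be matched against ChemSpider records.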

The only issue left to really complete the process is an automated way to add new molecules to ChemSpider. Tony says that will be done soon.


Wednesday, June 20, 2007

Nature Precedings Rocks

Following up on my initial comments, my first two posts in Nature Precedings have appeared.

Most people have been posting PowerPoint presentations so I started there with a recent presentation at the American Chemical Society about Open Notebook Science.

Open Notebook Science Using Blogs and Wikis (doi:10.1038/npre.2007.39.1)

Next, I posted an update on the CombiUgi project by basically combining two blog posts (one and two).

It took a lot longer to do this than I expected, experimenting with the format and trying to make it fairly self-contained. I ended up using PowerPoint, which I like for its modular nature and flexibility with image-rich materials. For example, it is easy to spin off as a SlideShare document (which I just noticed supports hyperlinks while embedded - nice!).

There are a few reasons I think Precedings will be one of the key breakthrough apps for Open Science.

1) Nature Publishing Group brings a serious amount of credibility to the table. That is going to make it much easier to convince people in mainstream scientific circles to contribute and read.

2) Flexibility of format: although files must currently be submitted as PowerPoint, Word or PDF files, the organization of the information within these files is fairly open. The "article" format is not currently required. Although there is no peer review requirement, there is definitely editorial control (which I experienced when I was asked to rewrite my first abstract). They want to make sure that submissions are genuine scientific communications.

3) Referenceability: each accepted submission gets a DOI and clear citation instructions.

4) A convenient system for acknowledging collaborators as co-authors, including affiliation info.

5) Web 2.0 bells and whistles: tags, comments, RSS feeds, etc.

6) The price is right - free read/write.

7) Creative Commons License - Non-Commercial Use with Attribution.

What they do not yet accept are large data files but it looks like that is coming down the road.

Molecule Docking in Second Life

Update: Hiro posted a YouTube video of the docking - see end of post

A while back, I posted about how we have been experimenting with representing our research work on UsefulChem in Second Life. With the help of Andrew Lang (Hiro Sheridan on SL), we put up one of the molecules that we had been trying to make as an anti-malarial compound.

Hiro has now taken this to the next level and has the molecule actually moving into the binding pocket of the targeted enzyme (enoyl reductase) upon clicking on it. There are 4 hydrogen-bonding interactions between the molecules and the atoms involved are tagged in green.

I'm grateful that Hiro took the time to show the self-docking animation because it is really hard to manually connect these two 3D puzzle pieces in Second Life (give it a try! - slurl).

Getting to this point required a considerable amount of collaboration and I would like to thank everyone involved: Geoff Hutchison, Keith Davies, Sean Gardner, Tsu-Soo Tan and Eloise Pasteur.

Wednesday, June 13, 2007

CombiUgi Says Order 2-Naphthyl Isocyanide

Two weeks ago I posted about the CombiUgi project, where I proposed that we make compounds from a combinatorial library predicted to have some potentially useful biological activity. The scientific blogosphere worked its magic and we now have a short list of compounds to make.

Rajarshi really worked hard on getting an algorithm to create the Ugi product SMILES codes and passed them through his tumor cell inhibition program. Out of about 68,000 he identified a shortlist of 21 that showed the most activity (see wiki for details). An example is shown below:

I find it very interesting that all the top hits involve 2-naphthyl isocyanide and over half involve boc-methionine. Is this real or even meaningful? We've been discussing these issues privately and I hope that Dan, Rajarshi and others continue the discussion openly.

In the meantime, we're ordering the chemicals and hopefully will have a few Ugi products to send to NCI for testing against their tumor cell lines.

The point of this exercise is not so much to prove that this model is correct or that we have found a new anti-tumor lead (though that would be nice) but to show that we can close the scientific loop of hypothesis-synthesis-assay in a completely open and collaborative scientific environment.

I welcome suggestions of other compounds from our virtual library that might be worth making (for any disease-related target), as long as we have assays that someone can run.

We are also working with Tony Williams to see if ChemSpider can serve as a database to store and manage the virtual library, the predicted properties and the assay results. Hopefully then we could increase the library to several million molecules.

Chemical Blogspace Tags

2-naphthyl isocyanide


Saturday, June 09, 2007

Nature Precedings

Egon has just posted about Nature Precedings, which looks like a no-brainer as an additional publication outlet for UsefulChem. I've requested an account and we'll see how it works.

In my view, producing knowledge in a Science 2.0 world is about communicating through redundancy, making it easy to prove who-knew-what-when. That is difficult to do with the traditional scientific publication system of giving away copyright. (Not impossible, because concepts and results can be rewritten using different words, but still difficult).

This should be interesting. Here is a description of Nature Precedings:

[Nature Precedings] will enable researchers to share, discuss and cite their early findings. It provides a lightly moderated and relatively informal channel for scientists to disseminate information, especially recent experimental results and emerging conclusions. In this sense, it is designed to complement traditional peer-reviewed journals, allowing researchers to make informal communications such as conference papers or presentations more widely available and enabling them to be formally cited. This, in turn, allows them to solicit community feedback and establish priority over their results or ideas.

Tuesday, June 05, 2007

Thesis on Wiki Interest

There was quite a spike in our traffic to UsefulChem today.

The fact that Alicia's master's thesis "Synthesis of Diketopiperazines, Possible Malaria Enoyl Reductase Inhibitors Using Open Source Science" is being written on a wiki was noted by Pharyngula, A Blog around the Clock and Pimm - Partial Immortalization.

I am particularly happy that Attila from Pimm has obtained permission from his supervisor to write at least part of his thesis on his blog. Outside of the sciences, I recall Mark Wagner doing something similar for his thesis on educational gaming. Also see Laura Blankenship's thesis on blogging in the classroom.

But there are some advantages to the wiki over the blog. Since every version of every page is archived, it is possible to view the evolution of the thesis over time. Also, as Alicia's supervisor, my comments are tracked - as well as her responses. One member of her thesis committee also agreed to provide feedback via the wiki. Not everything was captured here because we did have face to face meetings and used email but some important parts of the communication were tracked.

Another important feature of Alicia's thesis is that she linked back to experiment pages on UsefulChem, where all the raw data is housed. She was thus able to provide the ultimate citation to her own work as well as selected experiments from her co-workers to make a point, while being extremely clear about everyone's contribution.

For this reason, I think that wikis will be used increasingly as part of a scientist's portfolio to demonstrate special skills or writing ability. In our current system, all we know is that an individual was a co-author on a certain number of articles and we have to glean their contribution mainly from letters of recommendation. Wouldn't it have more impact for a potential employer to be shown the number and type of experiments done by an individual, the equipment they used and the observations they made?

Friday, June 01, 2007

Per Contra Article on Open Notebook Science

Bill Turner has just posted an article he wrote about Open Notebook Science in Per Contra. Here is an excerpt:

The Pursuit of Automation: Open Notebook Science. The Per Contra Interview with Jean-Claude Bradley

What do you get when you combine transparency and raw data? Jean-Claude Bradley says you should get automation of the scientific process, and his Useful Chemistry project is acting as a laboratory for his hypothesis. For instance, when he attempted to expose a particular product missing a methyl group to fifty percent TFA/CDCl3, it should have caused the furfuryl group to cleave. It didn’t.

You probably have no idea what that means. Neither do I. But the result is published for the world to see in the Useful Chemistry blog, available for other scientists to scrutinize and to help them avoid the same dead end. “We are attempting to do science in as transparent a manner as possible,” Bradley says. And that means publishing results— failures and all—online as the research unfolds.

Isn't it great that the word "furfuryl" would show up in a publication described as "The International Journal of the Arts, Literature and Ideas"? That's another sign that the Open Science movement is gaining importance in the mainstream.


Creative Commons Attribution Share-Alike 2.5 License