Wednesday, April 29, 2009

Chemistry in Second Life April 09 Talks

Andrew Lang and I just did two presentations on applications of chemistry in Second Life.

The first was on April 24, 2009 at the "Virtual Worlds: Libraries, Education and Museums" (VW LEM) conference on Infotainment Island. The second was on April 29, 2009 at "Education Days" on Orange Island.

This was basically an updated version of the talk we gave at the ACS last summer. We showed how the ChemTiles and Spectral Games evolved from Second Life. That is interesting because usually Second Life applications are adaptations of projects initially conceived elsewhere.

As I mentioned previously, giving talks on Second Life or using some other form of tele-presence certainly has its advantages. It does not replace face to face interaction but I think people get a good idea of what we do and they can follow up later for more discussion or possibly even collaborations.

I enjoy presenting with Andy - we go back and forth depending on the content on the slide and when it is relevant Andy does a demo of how to rez a molecule right on stage. So far we have not had any technical problems with that and I think it drives home the message of how easy it is to use the orac rezzer.

The talk on Orange Island was recorded and I'll add a link to it here when available.

Tuesday, April 28, 2009

Is it becoming dangerous to NOT blog?

It wasn't so long ago that the big discussion about scientists blogging was whether or not it would hurt your career. Granted, some of the examples used involved personal content that would have been problematic on any platform. Still, many scientists chose to blog anonymously, even for the most uncontroversial scientific musings.

Recently I have noticed a change in the tone. The question doesn't seem to be "Is blogging bad?" anymore but rather "Is blogging a waste of time?". Often this involves the rather ironic situation of naysayers using a blog to express their opinion that blogging is a waste of time. There are many examples of this but a particularly controversial discussion took place on Nature Network recently.

And then yesterday I came across a particularly good example of why blogging is not a waste of effort. I was checking my Sitemeter referring links and found a few from Nature Chemistry. Unfortunately the article is toll access but I was able to get my hands on a copy. It was Michelle Francl's article about the history of the periodic table and all the creative ways that people have used to demonstrate order in the elements.

Michelle used my blog post about Andrew Lang's 3D representation of the periodic table in Second Life as a reference for this type of table. This is a very short (4 sentence) post but it has the key elements of a good reference - answers to who? what? and where?. That is enough information to visit the exhibit and contact the creator for a copy.

Now Andy and I are writing this up as part of a larger article on chemistry in Second Life (see draft here). If that article had been completed and published, it is likely that Michelle would have used that as a primary reference. But it can take a really long time for the journal publication process to reach completion. If I had not blogged this I am sure Michelle could have adapted her article and found another similar reference.

The point is that mainstream scholarship (Nature Chemistry is certainly an example of that) is able and willing to use Web2.0 references when these are the most appropriate.

There are very few examples of mindblowingly original ideas. People working in related areas tend to come up with similar ideas. In a world where any of your competitors can blog their ideas as soon as they think of them, hoarding ideas might be the more dangerous choice.

It doesn't matter what you think about the professional status of blogs. It doesn't matter that most scientists don't blog. The only thing that matters is that at least one of your competitors is willing to blog their research and that the traditional journals in your field are willing to accept blog posts (and other Web2.0 publication formats) as valid references.


Thursday, April 16, 2009

NASA Open Notebook Science Talk April 09

On April 15, 2009 I had an opportunity to give a talk at the NASA Goddard Space Flight Center. I talked about Open Notebook Science and all of the Web 2.0 tools that we use to operate. There were no chemists in the audience but hopefully the overall patterns of how all the components interconnected made enough sense to be useful.

I had a full hour so this talk is a pretty comprehensive summary of our projects, including the most recent work on the Spectral and ChemTiles games and the automated backing up of Google Spreadsheet documents and semi-automated solubility calculations using web services called from within Google Spreadsheets. All of this work was only possible because of Andy Lang's rapid development efforts. Tony Williams also assisted greatly with the Spectral Game.


We had a very nice conversation over lunch with a few NASA people. I found it interesting that many apparently very different user environments (librarians, educators, molecular biologists, cosmologists, etc.) share very similar needs for Web2.0 technologies. For example delicious was lauded as a very convenient alternative to email for sharing content. The distribution of personalities seems to be similar everywhere: a few early adopters within a larger more skeptical population.

After lunch Emma Antunes gave me a tour of the facilities. Despite the annoying rain to get between buildings, it was well worth it. Here are some of the cool things that I saw:

An enormous room housing very large speakers for testing the effect of vibrations on spacecraft and equipment. Emma stands next to one of the several speakers.

A huge centrifuge for testing the robustness of instruments. Emma said that they were able to put an SUV on there to see how much force was required to tip it over.

I saw one of the satellites for the Solar Dynamics Observatory under construction. The idea of this project is to use the different perspectives from satellites at different positions in orbit around the sun to calculate the direction of solar flares and other potentially detrimental activity on the sun.

Next to the largest clean room in the world, there is a display of the guts of the Hubble telescope. Apparently the astronauts had to fix some components in there that were not designed to be accessible so they had to do a lot of practice on a duplicate before attempting the task in space.


Sunday, April 12, 2009

Automatic Back-Up of ONS files: Google Spreadsheets, JCAMP-DX, Flickr

As many of you know, we have been heavily dependent upon publicly editable Google Spreadsheets for storing results and calculations relating to our Open Notebook Science projects. We have recently integrated automated processing of NMR files in JCAMP-DX format to calculate solubility data by using web services called directly from within the Spreadsheets.

That represents a lot of distributed technology that is susceptible to network or server problems. Andy Lang, who wrote the web services that currently calculate the solubility, has enabled the recall of previously calculated values via a quick database look-up. While this substantially reduces server load by avoiding lengthy calculations, it does mean that the final numbers do not exist in the Spreadsheets themselves.

In addition to these concerns, every time I give a talk to a group of librarians the issues of archiving and curation of new forms of scholarship are raised. These are valid concerns and I've been trying to work with several groups to deal with the problem in as automatic a way as possible.

We had initially considered a spidering service that would automatically follow every file linked to the ONS wikis and download the documents on a daily basis. This has turned out to be problematic because many of the links don't terminate directly on files, but rather user interfaces. For example, a typical link to a Google Spreadsheet does not lead to a simple HTML page that can be copied but rather to an interface to add data and set up calculations.

It turns out we can take a semi-automated solution that gets us to where we want to be but requires a bit more manual work. Google Spreadsheets can be exported as Excel spreadsheets, which store the results of web service calculations as simple values and include the link to the web service as a cell comment. All calculations within the Spreadsheet are also retained in this way. The trick is to "publish" the spreadsheet using the advanced option of exporting as an Excel file. This then becomes a simple URL.

Now, the only manual step left in the process is to copy these URLs to another BackUp Google Spreadsheet. Andy has created a little executable that steps through a list of these URLs and creates a backup on any Windows computer under a C:\ONS directory. It is simply then a question of setting up a Windows Scheduler service to run once a day and call the executable. All the files are named with the date as a first part of the name for easy sorting.
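The back-up step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Andy's executable: the function names, the fetch parameter and the numbered fallback name are hypothetical, and only the date-first file naming and the C:\ONS target directory come from the description above. Like V0.1, this sketch stops at the first file that fails.

```python
from datetime import date
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def backup_name(url, day, index):
    """Build a file name that starts with the date so files sort by day."""
    # Use the last segment of the URL path, or a numbered fallback if it is empty.
    tail = PurePosixPath(urlparse(url).path).name or f"file{index}"
    return f"{day:%Y-%m-%d}_{tail}"


def backup_all(urls, target_dir, day=None, fetch=None):
    """Download each URL into target_dir and return the paths written."""
    day = day or date.today()
    # fetch is replaceable (hypothetical hook); by default it reads the URL over HTTP.
    fetch = fetch or (lambda u: urlopen(u).read())
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for i, url in enumerate(urls):
        path = target / backup_name(url, day, i)
        path.write_bytes(fetch(url))
        written.append(path)
    return written
```

Calling backup_all(urls, r"C:\ONS") once a day from a scheduled task would leave one dated copy of every listed URL in that directory.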

Besides Google Spreadsheets backed-up as Excel files, spectral JCAMP-DX files and Flickr images can be processed in the same way. In both these cases the user must specify the JDX or DX or JPG file directly. In Flickr you have to go through a few clicks to the download page for a given image but once you have that it works fine.

Andy has versioned this as V0.1 for good reason. It does do exactly what we want but there are a few caveats:

1) Any errors in specifying a file will abort the rest of the back-up. In future versions there would be tolerance for errors, with appropriate reporting of problems, perhaps by email.

2) Files don't necessarily have the correct extensions. For example, backed up Wikispaces pages have to be renamed with an HTML extension to be viewed in a browser. Note that Wikispaces has its own sophisticated back-up system that will put the entire wiki with all files directly uploaded onto the wiki into a single ZIP file - in either HTML or WIKITEXT format. Of course this will not include files residing outside - like Google Spreadsheets. Still I think there is no harm in including the wiki pages in the list of files to be backed-up by Andy's system.
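The error tolerance mentioned in the first caveat could look something like the following Python sketch. It is a guess at one possible design, not a description of any planned version: run_tolerant, report and the save callback are hypothetical names, and actually sending the report by email is left out.

```python
def run_tolerant(urls, save):
    """Call save(url) for every URL and collect failures instead of aborting."""
    saved, failed = [], []
    for url in urls:
        try:
            save(url)
            saved.append(url)
        except Exception as exc:
            # One bad entry should not stop the remaining files from being saved.
            failed.append((url, str(exc)))
    return saved, failed


def report(failed):
    """Format the failures as plain text, for example as the body of an email."""
    if not failed:
        return "All files backed up."
    lines = [f"{len(failed)} file(s) could not be backed up:"]
    lines += [f"  {url}: {reason}" for url, reason in failed]
    return "\n".join(lines)
```

The point of the design is simply that a mistyped URL costs one file rather than the whole day's back-up.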

Going forward there are two types of collaborations that could help a lot:

1) Librarians who would be willing to archive UsefulChem and ONSChallenge files. Right now these are just a few Megs a day but this will increase as we continue to add to the list. To be reasonable about space I could see a protocol of keeping only one back-up per week or month for dates more than 30 days in the past. This is about what the Internet Archive does I think. It would certainly be unambiguous to know for certain what was known at what time with multiple libraries maintaining archives.

2) Someone who knows how Google creates URLs for downloadable XLS exports would be mighty helpful. Similarly for Flickr and JPG exports. Even just writing a script to spider all HTML pages linked to the wikis and blogs would save a lot of manual labor. The nice thing is that the results of the spidering code would just have to be dumped into the Back-Up Google Spreadsheet - which already backs itself up conveniently.
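The space-saving protocol suggested in the first item (keep everything recent, thin out older back-ups to one per week) can be expressed as a short Python sketch. The function name, the choice of ISO calendar weeks and the rule of keeping the earliest back-up in each week are assumptions made for illustration; a monthly variant would group by (year, month) instead.

```python
from datetime import date, timedelta


def dates_to_keep(backup_dates, today, recent_days=30):
    """Keep every back-up from the last recent_days days, plus the
    earliest back-up of each older ISO week."""
    cutoff = today - timedelta(days=recent_days)
    keep = set()
    seen_weeks = set()
    for day in sorted(backup_dates):
        if day >= cutoff:
            # Recent back-ups are all kept.
            keep.add(day)
            continue
        year, week, _ = day.isocalendar()
        if (year, week) not in seen_weeks:
            # First (earliest) back-up seen for this older week.
            seen_weeks.add((year, week))
            keep.add(day)
    return sorted(keep)
```

Since all file names start with the date, the dates returned here map directly onto the files to retain.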


Tuesday, April 07, 2009

Is the Human Ego good for Science?

I have just finished reading the fascinating book "The Emperor of Scent" by Chandler Burr. It starts off describing the world of perfume with a focus on Luca Turin, a man with the unusual talent of being able to review perfumes with great eloquence, conjuring up beautiful metaphors of experiencing their scent.

The book then takes an unexpected turn into the description of Turin's theory about the mechanism of olfaction. There is some truly interesting science there, such as Turin's discovery of a binding site for NADPH and another for zinc on a protein thought to mediate smell. This supports his hypothesis that the protein functions as an electron tunneling spectroscope detecting differences in vibrational modes. Further evidence is provided by comparing the different smells of deuterated molecules like acetophenone and the similarity of the stench of boranes with thiols, which share similar IR spectral bands. This idea is at odds with the conventional view that molecular shape is responsible for the activity of odorants. (For a summary of Turin's theory I would suggest watching his recent TED talk "The Science of Scent" and his Wikipedia entry)

This is all very interesting stuff and would have made for a good read but what makes this a truly fascinating story is that Burr spends the rest of the book detailing the way the scientific community responded to his findings. As Turin waits a year to finally get rejected by Nature, the reviews, rebuttals and other communications with the editor are examined to expose the intense emotional components that can arise from the peer review process. The author even follows Turin to conferences and reports in detail how various members of the audience react and comment during his talk and at informal meetings over lunch.

People who are not in science may find this disturbing. All too often science operates like the judicial system, where winning can take on more importance than finding the truth.

The fundamental problem is conflict of interest. If you have patents or run a company it may not be in your financial best interest to look under every rock, except as required by law. If you have built your career on a certain theory it may not be in your professional best interest to open every can of worms. Burr actually wrote a chapter explaining why the book appeared to be so one sided: it was hard to get detailed comments from Turin's detractors because, although they disagreed with his theory, they had not read his paper and did not have time to do so.

But Turin was really not that different in his conflict of interest related to ego. There are descriptions of him reading articles in a state of dread and delaying experiments for fear that he might be proved wrong. Still, I like to root for the underdog, so the book did have me hoping that he would be vindicated.

If most scientists are motivated by ego, is it possible to do egoless science - and what would that look like?

For starters I think that keeping a true Open Notebook (All Content shared Immediately) does a lot to keep your ego in check. If you report on what you find, when you find it, you don't have time to succumb to the temptation to cherry pick results and embellish the story of what happened.

Another trend that I think will emerge in the next few years and will change the way science gets done is machine-driven science. It will probably prove too much trouble to take into account a researcher's ego and career objectives when coding for AI to plan and analyze experiments to solve problems. Just like Turin, a lot of researchers (including myself, especially early in my career) procrastinate doing certain experiments for fear of not liking the outcome. The key again here is making the experimental logs of those machines public in real time.

When I refer to egoless science, I am speaking at the level of experimentation. I am driven by ego, like everyone else. But I have found it more useful to place its focus at the meta level. Instead of taking pride in appearing to run a perfect operation - and of getting high yields for our reactions - I am most pleased when the members of my group do their best to record exactly what happened, as they do science.

And being a strong proponent of Open Science, my ego is linked to those activities. Even though it is somewhat ironic, I do enjoy competing at being as openly collaborative as I can.


Thursday, April 02, 2009

The ChemTiles Game

In another example of code and content re-mixing for educational purposes, Andrew Lang and I have adapted many of the elements of the Spectral Game and tiles from the Second Life quizzes I traditionally gave for my introductory organic chemistry course. Many of these tiles were originally created for use in Unreal Tournament.

The concept is simple: tiles represent images or statements that are either true or false in any context. By marking them as true or false and ensuring that one true tile is present in a mix of a number of false ones, games can be designed that vary from rooms within a maze to obelisks offering a selection of floating images in Second Life.

In the current implementation, the tiles appear in a web browser. Clicking on the correct tile produces a new random selection. Clicking on a false tile stops the game and records the player's score. Following the same structure as the most recent implementation of the Spectral Game, the first 10 queries present only two tiles. As the game progresses the difficulty level increases and more tiles are included.
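The game loop described above can be sketched in Python as follows. This is not the code behind the ChemTiles site; the function names are hypothetical, and the rate at which tiles are added after the first 10 queries (one more every 5 rounds, up to 6) is an assumption, since only the two-tile start and the increasing difficulty are stated.

```python
import random


def tiles_for_round(round_number, start=2, easy_rounds=10, step=5, cap=6):
    """Two tiles for the first easy_rounds rounds, then one more every step rounds (assumed schedule)."""
    if round_number <= easy_rounds:
        return start
    extra = 1 + (round_number - easy_rounds - 1) // step
    return min(start + extra, cap)


def deal(true_tiles, false_tiles, round_number, rng=random):
    """Return a shuffled selection with exactly one true tile among false ones.
    Needs at least cap - 1 false tiles to be available."""
    n = tiles_for_round(round_number)
    chosen = [rng.choice(true_tiles)] + rng.sample(false_tiles, n - 1)
    rng.shuffle(chosen)
    return chosen


def play(true_tiles, false_tiles, pick, rng=random):
    """Deal rounds until pick() returns a false tile; the score is the number of correct clicks."""
    true_set = set(true_tiles)
    score = 0
    while True:
        selection = deal(true_tiles, false_tiles, score + 1, rng)
        if pick(selection) not in true_set:
            return score
        score += 1
```

Here pick stands in for the player's click: it receives the tiles on screen and returns the one chosen.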

Whereas the Spectral Game obtained spectra and molecules from ChemSpider, this game taps into a set of 256 x 256 pixel images in a Flickr group. Using Flickr lets us leverage the ability to easily tag images, which can then be used by players to select different topics to practice.

I'll be giving a prize to the student in my current CHEM241 class who scores highest by 10:50 ET April 10, 2009. The student must play under the "all tags" option, which covers all the material before test 1, given the following week. Students can also practice different modules by selecting tags like Lewis Structures, Hybridization, Nomenclature, Newman Projections, etc.

I think that this approach of rapid remixing of code and content on free hosted platforms (like Flickr or ChemSpider) is really the future of technology in education. It will be difficult for heavy top-down - and expensive - systems to compete against the incredible flexibility of these lightweight and loosely connected initiatives led by educators with the simple motivation of just experimenting with teaching in a better way.


My Open Notebook Science talk at NASA in April 09

I will be speaking at the Goddard Space Flight Center at 11:00 on April 15, 2009. NASA has a long history of sharing data and it will be interesting to see how their experience compares with what we have been doing in chemistry. Certainly the much larger amount of telescope generated data presents different problems and opportunities.

Crowdsourcing solubility measurements using Open Notebook Science

Jean-Claude Bradley

The use of Open Notebook Science to collect and make publicly available non-aqueous solubility measurements will be described. This involves the real time sharing of all experiments and associated raw data by a community of collaborators who are geographically distributed and may have never communicated using channels other than this project. Monthly cash prizes are awarded to participating students by means of the ONS Challenge Submeta Awards. The laboratory notebook pages are recorded on a public wiki and the solubility measurements, including relevant calculations, are stored in public Google Spreadsheets. A combination of ChemSpider, the GoogleDoc visualization API and web services is used to enable flexible searching and display of desired subsets of the data.


Tim Bohinski is the April 09 Submeta ONS Award winner

Tim Bohinski, a Chemistry undergraduate student working under the supervision of Jean-Claude Bradley at Drexel University, is the April 2009 Submeta Open Notebook Science Challenge Award winner. He wins a cash prize from Submeta.

Tim mainly focused on evaporation techniques to measure solubility. He also wrote a review paper on non aqueous solubility. See his experiments here:

Five more Submeta ONS Awards will be made during 2009. Submissions from students in the US and the UK are still welcome.
For more information see:


Creative Commons Attribution Share-Alike 2.5 License