Tuesday, October 26, 2010

Elizabeth Brown's guest lecture for ChemInfo Retrieval

Elizabeth Brown from the Binghamton University Libraries presented on "Web 0.0/1.0/2.0/3.0 and Chemical Information" on October 21, 2010, as a guest lecturer for my fifth class on Chemical Information Retrieval this term.

Beth made an interesting analogy with art to illustrate the differences between these communication platforms. What struck me during her presentation was the similarity between the current state of the semantic web (Web 3.0) and the state of computerized searching in chemistry when I was a graduate student in the early 90s. I had the chance to do just one substructure search at the time, and it had to be done through an expert librarian. The search had to be carefully planned because of the expense.

From what I recall, the perception at the time was that computerized searching was impractical and perhaps even unnecessary. After all, "the way" to search for chemical information was to spend a weekend in the library systematically going through Chemical Abstracts books. This had "worked" for a long time for the chemistry community, and doing things differently was considered superfluous and even wasteful.

Today, "the way" to search for chemical information is to use expensive databases to perform a targeted search and extract the information from mainly toll access peer reviewed journals. If you are off the academic grid you have to rely on free information on the web which is rarely associated with a chain of provenance. As my students will attest, even with access to the best tools, it still takes a lot of time to find the information and compare it for consistency.

The unfamiliarity with computerized searching in the early 90s is mirrored in the common attitude towards the semantic web today. This is understandable because its availability is limited and people don't understand what to do with the tools that are available. The web services we provide (derived from other services from ChemSpider and other sources) are usable by anyone who can open up a Google Spreadsheet and copy and paste a URL, but it will take time for people to understand how to incorporate these into their workflows.
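To make the copy-and-paste idea concrete, here is a minimal sketch of the kind of call involved. The service URL and its parameters are hypothetical placeholders, not the actual endpoints we or ChemSpider expose; the point is only that a single URL returning plain data is all a user needs to handle.

```python
# Minimal sketch: pulling a property for one compound from a web service.
# The URL and its query parameters below are hypothetical stand-ins;
# substitute the actual service URL you are given.
import urllib.request

url = ("http://example.org/solubility-service"
       "?compound=benzoic%20acid&solvent=methanol&format=csv")

with urllib.request.urlopen(url) as response:
    # The service is assumed to return plain CSV text.
    print(response.read().decode("utf-8"))

# In a Google Spreadsheet the same URL can be pasted into a formula such as
# =importData("http://example.org/...") so the values land directly in cells.
```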

I think that in 10 years the semantic web will simply be part of the infrastructure. Currently, when you type a word into a browser or word processor, the system knows enough to alert you to a possible misspelling by underlining the word. Similarly, in the near future, typing or otherwise specifying a chemical compound will automatically pull in all relevant measured or calculated properties and provide suggestions in the context of the chemical reaction under consideration. Access to information will be free and unencumbered, and the chain of provenance will be clear.

Instead of taking hours to find and process property data for single compounds, I think students in the future will be handling large libraries of compounds, looking for the best synthetic targets for their applications.
