Using NCSA computing resources, scientists help construct the National Virtual Observatory, a digital framework that will make astronomical data accessible for research on a scale never before achieved.

Imagine you're a researcher attempting to gather all the information ever published on a particular subject. You begin with the largest, most central information clearinghouse you know of: the Library of Congress and its catalogue of 120 million volumes. Wading through thousands of citations, abstracts, and book indices, most of which initially look promising but eventually turn out to be largely irrelevant, quickly becomes exhausting and overwhelming.

By the time you've finished, the Library of Congress has doubled in size. Suddenly there are five or six libraries of comparable magnitude through which you'll also have to comb for information, as well as countless smaller, more obscure libraries and archives that you'll have to visit individually, because they, too, might have information important for your research.


This is the problem that confronts many astronomers working with archival data. Generally, once observing astronomers are finished looking for something specific in the data they have collected, they make them available publicly. But astronomical data have a long life. Data that have already served their original research purpose are still a rich source of new information for researchers asking other questions.

“We are just beginning to open up the multiwavelength universe,” says Robert Brunner, an assistant professor in the astronomy department at the University of Illinois at Urbana-Champaign and a research scientist at NCSA. “Data are increasing at an exponential rate.” In addition to ever-increasing optical and radio astronomical data repositories, he says, vast amounts of high-energy and infrared data are also accumulating. Searching archival material thoroughly and efficiently--whether you are looking for all existing information about a single object or an entire region of the sky--is fast becoming as daunting as attempting to search the whole known universe.

One way to access archival data is simply to search known repositories, such as the Sloan Digital Sky Survey, or warehouses such as the NASA/IPAC Extragalactic Database. These archives, accessed via Web browsers, are likely to be frequently bookmarked by researchers. However, “certain kinds of questions are not easily answered by bookmarks,” says Ray Plante, a radio astronomer and research programmer at NCSA.

Often, says Plante, valuable information may be found in more obscure but equally useful catalogues or observatories, but the search process can be time-consuming. “Once you discover these sources, where do you find information about your question, and how do you get at that information efficiently? You may find a thousand different text links, but should you visit them all?”

Plante is helping to build the infrastructure for the National Virtual Observatory, a project that will incorporate the Sloan Digital Sky Survey and many other repositories, both famous and less well known. It will make astronomy research based on archival data a whole lot easier and more efficient by making use of Grid services and NCSA computing resources, particularly the new Linux clusters being installed as part of the TeraGrid.


Access Online | Posted 5-13-2003