|
This summer, researchers from the
Institute for Biomedical
Computing at Washington University in St. Louis selected
Biology Workbench to integrate the dozens of genome-related
tools and massive databases emanating from the
Human Genome Project. The
institute, which is developing tools that
automatically annotate the genome sequence and test
its accuracy, is teamed with Shankar Subramaniam's
Computational Biology Group at NCSA to expand and
customize the Biology Workbench's capabilities for genome
research. Although the institute is only one of the 12
U.S. research institutions working on the Human Genome
Project, the institute's director, David States,
believes the Biology Workbench(TM) will prove so useful
that all the other institutions will adopt it.
"Biology Workbench gives us a way to access these tools and data in a uniform environment and to disseminate them to others," says States, who is also associate professor of Biomedical Computing at Washington University in St. Louis. "It provides the biology community with an essential tool for beginning to look at the genome data and for understanding it." The Human Genome Project is a massive multiagency undertaking begun in 1987 by the Department of Energy and the National Institutes of Health (NIH) to map the 50,000 to 100,000 genes that constitute the human genome. The genome is the complete set of instructions for making and maintaining an organism. In humans, this master blueprint is organized into 24 chromosomes located in the nucleus of each of the body's trillions of cells. The initial goal of the Human Genome Project was to map the location of genes along these chromosomes. Now scientists are beginning to sequence all 3 billion or so base pairs of nucleotides that constitute human genes. Knowing this sequence is expected to make it easier for scientists to understand genetic causes of diseases and to develop treatments. But first scientists need to decode the sequence. The information spilling out of DNA sequencers amounts to an unannotated stream of data. Like a message written in an unknown language, the sequence offers few clues about which proteins a particular gene codes for, and whether or not a gene actually codes for a protein. Some genes act like punctuation in a sentence to mark the beginning and end of a sequence. Others are duplicates or broken and unusable genes. Still others have functions that are unknown. The Biology Workbench will help tease out the 10 percent of genes used by a cell to build proteins; then it will help scientists predict protein shapes and functions. Scientists also need an easy means of accessing these data, something which Biology Workbench's uniform computing environment can provide. 
Some experts predict that by the project's end in 2002 there will be enough data to fill 200 volumes, each the size of the Manhattan phone book.
A less direct benefit from the workbench's inclusion in the Human Genome Project is its potential to expose more scientists to this kind of Web-based technology. "If we can make a dent in this enormous amount of data, we can show the technology's value to every science," says Subramaniam. "It can be a paradigm for many disciplines." |