Research




DLI Breaks the Semantic Barrier



Since its creation 30 years ago, information retrieval has been stuck at the level of word matching, unable to provide semantic retrieval across subject areas.

Researchers on the NSF/DARPA/NASA Digital Library Initiative (DLI) project recently made the first crack in the semantic barrier, using large-scale simulations on NCSA's HP/Convex Exemplar SPP 1200.

DLI researchers Hsinchun Chen and Bruce Schatz ran a large-scale simulation of vocabulary switching at NCSA as part of the Illinois DLI project. (Chen is on the faculty of the University of Arizona's Department of Management Information Systems; Schatz is a research scientist at NCSA and is on the faculty of UIUC's Graduate School of Library and Information Science.)

Using a week of dedicated computer time (and 10 days of CPU time overall), the researchers generated concept spaces for 10,000,000 journal abstracts across 1,000 subject areas in engineering and science. This is one of the largest computations carried out on NCSA's Exemplar system, and it is the first step toward generic protocols for semantic retrieval and information analysis for the next wave of the Net [see access, Spring 1995].






NCSA: The National Center for Supercomputing Applications
access / Summer 1996 issue


Last Modified: July 17, 1996