The GRID
Geographical data are rapidly becoming digital and are being made available in
large on-line archives on secured networks.
One of the purposes of this study
is to use large geographical datasets that are physically distributed across
remote resources within a grid infrastructure and to run distributed
applications on these data.
It also aims at providing participating institutions with complete
control over the selection of data they wish to process.
Our initial application
is a regional data grid for environmental research that will eventually be
extended to support grid-computing applications on a larger scale.
The data grid is expected to supply an abstraction layer between the classical
archives (end users are usually interested in requesting semantically
classified objects for their research), the data dissemination mechanism and
the computing tasks performed for geo-processing.
The SDSC Storage Resource
Broker (SRB) technology has been used to design the data-grid
infrastructure.
SRB is client-server middleware that provides a uniform interface for
connecting to heterogeneous data resources over a network and accessing
replicated data sets.
It allows data from heterogeneous systems to be organized into easily
accessible logical collections and, in combination with the Metadata Catalog
(MCAT), it supports location transparency: data objects are accessed through
queries on their attributes rather than through their physical
locations.
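
The idea of location transparency can be illustrated with a minimal sketch. This is a toy in-memory catalog, not the SRB or MCAT API: the class names, attribute keys, logical names and host names below are all hypothetical and serve only to show how a client can find data by attribute and leave the choice of physical replica to the catalog.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A logical data object with descriptive attributes and physical replicas."""
    logical_name: str
    attributes: dict
    replicas: list = field(default_factory=list)  # hypothetical physical locations


class ToyCatalog:
    """In-memory stand-in for a metadata catalog (assumption: not the real SRB API)."""

    def __init__(self):
        self._objects = {}

    def register(self, obj):
        # Index the object by its logical name only.
        self._objects[obj.logical_name] = obj

    def query(self, **conditions):
        """Return the logical names of objects whose attributes match all conditions."""
        return sorted(
            name
            for name, obj in self._objects.items()
            if all(obj.attributes.get(k) == v for k, v in conditions.items())
        )

    def resolve(self, logical_name):
        """Map a logical name to the list of its physical replicas."""
        return list(self._objects[logical_name].replicas)


catalog = ToyCatalog()
catalog.register(DataObject(
    "/regional/landuse_2005",
    {"theme": "land use", "format": "GeoTIFF"},
    ["srv-a.example.org:/vault/0001", "srv-b.example.org:/vault/0417"],
))
catalog.register(DataObject(
    "/regional/dem_10m",
    {"theme": "elevation", "format": "GeoTIFF"},
    ["srv-b.example.org:/vault/0033"],
))

# The client asks by attribute and never names a server or a path.
matches = catalog.query(theme="land use")
replicas = catalog.resolve(matches[0])
```

In this sketch the client code depends only on attribute names and logical names, so a dataset can be moved or replicated on another server by updating the catalog entry alone.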
Computing resources can also be used, and jobs submitted, from the Web portal.
This capability is granted only to authorized users. Our GIS system consists of
geographically distributed datasets, a visualization tool and numerical
applications. So far a Linux cluster has been used to run the computing and the
geo-processing phases required by the GIS system: the CODESA3D hydrologic
solver and the environmental applications (described in the Application section)
process the geographic data, and produce new GIS information layers. Outputs are
then stored on the SRB servers. The data flow, the data storage and the way the
applications work have been designed in a non-conventional fashion to hide the
complexity of the infrastructure from the user. Taken as a whole, this system
is more than the mere sum of its modules: it is a GIS Enterprise that consumes
and exposes Web services for data mapping, querying, sharing, processing and
distribution. We are currently working to use the CyberSar Grid computing
resources, which will enable our application to run in near real time.