Grid Project Provides Robust Access Protocol for RDF(S) Data Resources
Researchers have created an access protocol for data resources, or information sources, that are defined in RDF(S). RDF(S) is a set of web of data recommendations promoted by the W3C for defining metadata (data about data) on the Web. One of its promoters is Tim Berners-Lee, the inventor of the World Wide Web. RDF(S) is actually a combination of the RDF language, used to define metadata, and RDF Schema, an extension of RDF used to define vocabularies that structure the metadata defined using RDF.
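To make the split between RDF and RDF Schema concrete, here is a minimal sketch in plain Python. It assumes triples are represented as (subject, predicate, object) tuples of prefixed names; the ex: resources and the helper functions superclasses and types_of are invented for illustration and are not part of the protocol described in this article.

```python
# Illustrative only: every "ex:" name below is hypothetical sample data.
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

# RDF Schema part: a small vocabulary that structures the metadata.
schema = {
    ("ex:Document", RDF_TYPE, RDFS_CLASS),
    ("ex:Report", RDF_TYPE, RDFS_CLASS),
    ("ex:Report", RDFS_SUBCLASS_OF, "ex:Document"),
}

# RDF part: metadata about one resource, expressed with that vocabulary.
data = {
    ("ex:report42", RDF_TYPE, "ex:Report"),
    ("ex:report42", "ex:title", "Quarterly figures"),
}

graph = schema | data


def superclasses(cls, triples):
    """Return cls plus every class reachable through rdfs:subClassOf."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == RDFS_SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(resource, triples):
    """Return the declared types of a resource plus their superclasses."""
    direct = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    result = set()
    for cls in direct:
        result |= superclasses(cls, triples)
    return result
```

Because the vocabulary states that ex:Report is a subclass of ex:Document, an application can conclude that ex:report42 is also a document even though the data never says so directly. This is the kind of automatic, machine-side processing that RDF(S) is meant to enable.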
RDF(S) has now spread beyond the Web into other contexts. It is being used to build ontologies for representing genetic information, to publish some of the United States and United Kingdom governments’ economic recovery program databases, and even to represent information from Wikipedia or the New York Times. The key difference between RDF(S) and other computer languages is that it was created specifically for the Internet environment and for machine consumption. This way, data on the Internet that is defined in RDF(S) can be processed automatically by computer applications according to user needs, ultimately improving the use of such a vast amount of information.
The access protocol for data resources defined using RDF(S) was developed at the Universidad Politécnica de Madrid’s (UPM) School of Computing, together with teams from Japan (National Institute of Advanced Industrial Science and Technology, AIST) and the United Kingdom (National e-Science Centre). It combines Grid technology, which enables the shared and coordinated use of geographically distributed computational resources, with the web of data, which gains the ability to exploit RDF(S) data resources through Grid specifications and technologies.
The key advantage of this project is that it offers more robust access for systems making intensive use of this type of resource: it shields the end user from the technical details of the resource implementation while providing a mechanism for fine-grained use of the resource contents. The result of this research work is a much more powerful and more secure tool for exploiting RDF(S) data resources. Systems developers using this type of resource will find this tool useful, as it provides mechanisms not only for querying, but also for modifying the actual RDF(S) data resources.
The Open Grid Forum’s Database Access and Integration Services Working Group started work on creating this protocol in 2006. Since then, the working group has produced a motivational document and two RDF(S) Grid access specifications: one using a declarative mechanism through the SPARQL query language, and another using a conceptual programming mechanism. The first specification is led by AIST and the second by the UPM’s School of Computing.
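The contrast between the two access styles can be sketched roughly as follows. This is not the interface defined by either specification: the TripleStore class, its method names, and the ex: sample data are all hypothetical, and the match method only imitates the flavour of a SPARQL basic graph pattern rather than implementing SPARQL.

```python
def _unify(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


class TripleStore:
    """Hypothetical in-memory RDF resource, for illustration only."""

    def __init__(self, triples=()):
        self.triples = set(triples)

    # Programmatic style: explicit calls that read or modify the resource.
    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def remove(self, s, p, o):
        self.triples.discard((s, p, o))

    def objects(self, subject, predicate):
        return sorted(
            o for s, p, o in self.triples if s == subject and p == predicate
        )

    # Declarative style: describe the wanted pattern, get bindings back.
    def match(self, patterns):
        solutions = [{}]
        for pattern in patterns:
            extended = []
            for binding in solutions:
                for triple in self.triples:
                    candidate = _unify(pattern, triple, binding)
                    if candidate is not None:
                        extended.append(candidate)
            solutions = extended
        return solutions


store = TripleStore([
    ("ex:dataset1", "ex:publishedBy", "ex:agencyA"),
    ("ex:dataset2", "ex:publishedBy", "ex:agencyB"),
    ("ex:agencyA", "ex:basedIn", "ex:cityX"),
])

# Declarative query: which datasets come from an agency with a known city?
query = [
    ("?dataset", "ex:publishedBy", "?agency"),
    ("?agency", "ex:basedIn", "?city"),
]
results = store.match(query)
```

In the declarative style the caller states what should match and leaves the search to the resource, while in the programmatic style the caller navigates and updates the resource step by step through calls such as add, remove and objects.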
The protocol is now considered complete, although it remains under review by several teams of experts. It is already being used in several projects in which the UPM is a partner, such as UPGrid and ADMIRE, where middleware technologies are being created to improve data mining and integration tasks on data from scientific and commercial sources.