Processing SPARQL Queries Over Linked Data - A Distributed Graph-based Approach

Authors

  • Peng Peng
  • Lei Zou
  • M. Tamer Özsu
  • Lei Chen
  • Dongyan Zhao
Abstract

We propose techniques for processing SPARQL queries over a large RDF graph in a distributed environment. We adopt a “partial evaluation and assembly” framework. Answering a SPARQL query Q is equivalent to finding subgraph matches of the query graph Q over the RDF graph G. Based on properties of subgraph matching over a distributed graph, we introduce local partial matches as partial answers in each fragment of the RDF graph G. For assembly, we propose two methods: centralized and distributed assembly. We analyze our algorithms both theoretically and experimentally. Extensive experiments over both real and benchmark RDF repositories of billions of triples confirm that our method is superior to the state-of-the-art methods in both performance and scalability.
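
To make the equivalence between query answering and subgraph matching concrete, here is a minimal single-machine sketch in Python. It is not the paper's distributed algorithm: it only shows how a query graph (a list of triple patterns) can be matched against an RDF graph (a list of triples) by backtracking. The names `unify` and `match`, the `?`-prefix convention for variables, and the sample data are all assumptions made for illustration. The sketch uses homomorphism-style matching, so two different variables may bind to the same node.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Binding = Dict[str, str]        # variable name -> bound RDF term


def is_var(term: str) -> bool:
    """Illustrative convention: variables start with '?'."""
    return term.startswith("?")


def unify(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check consistency with an earlier binding.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            # Constants must match exactly.
            return None
    return extended


def match(query: List[Triple], graph: List[Triple],
          binding: Optional[Binding] = None) -> Iterator[Binding]:
    """Yield every binding under which all query patterns occur in the graph."""
    binding = {} if binding is None else binding
    if not query:
        yield binding
        return
    first, rest = query[0], query[1:]
    for triple in graph:
        extended = unify(first, triple, binding)
        if extended is not None:
            yield from match(rest, graph, extended)


# Hypothetical sample data, not taken from the paper.
GRAPH: List[Triple] = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "livesIn", "paris"),
]
QUERY: List[Triple] = [("?x", "knows", "?y"), ("?y", "livesIn", "?c")]

results = list(match(QUERY, GRAPH))
print(results)  # [{'?x': 'alice', '?y': 'bob', '?c': 'paris'}]
```

In a distributed setting, the triples of G would be spread over several fragments, so a match may cross fragment boundaries. That is the situation the abstract's local partial matches and assembly step are meant to handle, and this sketch does not attempt it.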

Similar resources

LHD: Optimising Linked Data Query Processing Using Parallelisation

In the past few years, a large volume of Linked Data has been published, and processing distributed SPARQL queries over the Linked Data cloud is becoming increasingly challenging. The high data traffic cost and response time significantly affect the performance of distributed SPARQL queries as the number of SPARQL endpoints and the volume of data at each endpoint increase. In this context, para...

A Hybrid Approach to Linked Data Query Processing with Time Constraints

In addition to RDF data within documents published according to the Linked Data principles, SPARQL endpoints are also a potential source of a great deal of Linked Data. The execution of queries using languages such as SPARQL can utilise both of these types of data sources. In this paper we present a hybrid approach to answering SPARQL queries that makes use of both link traversal-based and ...

Optimizing SPARQL queries over the Web of Linked Data

The web of linked data represents a globally distributed dataspace. It can be queried with SPARQL whose execution takes place by asynchronously traversing the RDF links to discover data sources at run-time. However, the optimization of SPARQL queries over the web of data remains a challenge and in this paper we present an approach addressing this problem. The proposed approach works in two-phas...

Distributed Processing of Generalized Graph-Pattern Queries in SPARQL 1.1

We propose an efficient and scalable architecture for processing generalized graph-pattern queries as they are specified by the current W3C recommendation of the SPARQL 1.1 “Query Language” component. Specifically, the class of queries we consider consists of sets of SPARQL triple patterns with labeled property paths. From a relational perspective, this class resolves to conjunctive queries of ...

Best-effort Linked Data Query Processing with Time Constraints using ADERIS-Hybrid

Answering SPARQL queries over the Web of Linked Data is a challenging problem. Approaches based on distributed query processing provide up-to-date results but can suffer from delayed response times; indexing-based approaches provide fast response times, but results can be out-of-date and the costs of indexing the growing Web of Linked Data are potentially huge. Hybrid approaches try to offer the...

Journal title:
  • CoRR

Volume: abs/1411.6763

Publication date: 2014