UPSP: Unique Predicate-based Source Selection for SPARQL Endpoint Federation

Authors

  • Ethem Cem Ozkan
  • Muhammad Saleem
  • Erdogan Dogdu
  • Axel-Cyrille Ngonga Ngomo
Abstract

Efficient source selection is one of the most important optimization steps in federated SPARQL query processing, as it leads to more efficient query execution plans. Over-estimating the set of relevant data sources generates extra network traffic by retrieving irrelevant intermediate results, which are discarded only after the joins between triple patterns are performed. Consequently, over-estimation of sources may increase query execution time. Join-aware source selection approaches, which take the joins between triple patterns into account, have been shown to offer great potential for improvement. In this work, we present UPSP, a new source selection approach for SPARQL query federation over multiple SPARQL endpoints. UPSP uses the subject-subject, subject-object, object-subject, and object-object join information stored in an index structure to perform efficient join-aware source selection. Our evaluation results on FedBench show that UPSP outperforms state-of-the-art source selection approaches by selecting a smaller number of sources (without losing recall) and reducing the query...
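
The difference between triple pattern-wise and join-aware source selection can be illustrated with a minimal Python sketch. This is not UPSP's actual index or algorithm: the index layout (per source and predicate, the URI authorities seen in subject and object position), the source names, and the function names are all assumptions made for illustration only.

```python
from itertools import permutations

# Hypothetical index (an assumption, not UPSP's real structure):
# source -> predicate -> URI authorities seen in subject ("s") and object ("o") position.
INDEX = {
    "DBpedia": {"owl:sameAs": {"s": {"dbpedia.org"}, "o": {"drugbank.org"}}},
    "GeoNames": {"owl:sameAs": {"s": {"geonames.org"}, "o": {"nytimes.com"}}},
    "DrugBank": {"drugbank:name": {"s": {"drugbank.org"}, "o": set()}},
}

# Two triple patterns joined on ?x (object-subject join).
QUERY = [("?drug", "owl:sameAs", "?x"), ("?x", "drugbank:name", "?name")]


def pattern_wise(patterns, index):
    """Baseline: select every source whose index contains the pattern's predicate."""
    return {i: {src for src, preds in index.items() if p in preds}
            for i, (_, p, _) in enumerate(patterns)}


def var_positions(pattern, var):
    """Return the positions ("s" and/or "o") where var occurs in the pattern."""
    return [pos for pos, term in (("s", pattern[0]), ("o", pattern[2])) if term == var]


def shared_vars(a, b):
    """Variables that occur in subject or object position of both patterns."""
    terms = lambda t: {x for x in (t[0], t[2]) if x.startswith("?")}
    return terms(a) & terms(b)


def join_aware(patterns, index):
    """Prune sources that cannot contribute to any join with a neighbouring pattern."""
    selected = pattern_wise(patterns, index)
    changed = True
    while changed:  # repeat until no more sources can be pruned
        changed = False
        for i, j in permutations(range(len(patterns)), 2):
            pi, pj = patterns[i], patterns[j]
            for var in shared_vars(pi, pj):
                for src in list(selected[i]):
                    # Keep src only if its authorities for var overlap with
                    # those of at least one source selected for pattern j.
                    joinable = any(
                        index[src][pi[1]][a] & index[other][pj[1]][b]
                        for other in selected[j]
                        for a in var_positions(pi, var)
                        for b in var_positions(pj, var)
                    )
                    if not joinable:
                        selected[i].discard(src)
                        changed = True
    return selected
```

In this toy setup, `pattern_wise(QUERY, INDEX)` selects both DBpedia and GeoNames for the first pattern, while `join_aware(QUERY, INDEX)` drops GeoNames because none of its `owl:sameAs` objects can join with a DrugBank subject. The answer set is unaffected, which is the "without losing recall" property the abstract refers to.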

Similar resources

HiBISCuS: Hypergraph-Based Source Selection for SPARQL Endpoint Federation

Efficient federated query processing is of significant importance to tame the large amount of data available on the Web of Data. Previous works have focused on generating optimized query execution plans for fast result retrieval. However, devising source selection approaches beyond triple pattern-wise source selection has not received much attention. This work presents HiBISCuS, a novel hypergr...

How Good Is Your SPARQL Endpoint? - A QoS-Aware SPARQL Endpoint Monitoring and Data Source Selection Mechanism for Federated SPARQL Queries

Due to the decentralised and autonomous architecture of the Web of Data, data replication and local deployment of SPARQL endpoints are inevitable. Nowadays, it is common to have multiple copies of the same dataset accessible through various SPARQL endpoints, which leads to the problem of selecting the optimal data source for a user query based on data properties and requirements of the user or the appli...

A fine-grained evaluation of SPARQL endpoint federation systems

The Web of Data has grown enormously over the last years. Currently, it comprises a large compendium of interlinked and distributed datasets from multiple domains. The abundance of datasets has motivated considerable work for developing SPARQL query federation systems, the dedicated means to access data distributed over the Web of Data. However, the granularity of previous evaluations of such s...

On Metrics for Measuring Fragmentation of Federation over SPARQL Endpoints

Processing a federated query in Linked Data is challenging because it must account for the number of sources, the source locations, and heterogeneity in hardware, software, data structure, and distribution. In this work, we investigate the relationship between data distribution and communication cost in a federated SPARQL query framework. We introduce the spreading ...

Federated SPARQL Query Processing Via CostFed

Efficient source selection and optimized query plan generation are among the most important optimization steps in federated query processing. This paper presents a demo of CostFed, an index-assisted federation engine for federated SPARQL query processing. CostFed’s source selection and query planning are based on an index generated from the SPARQL endpoints. The key innovation behind CostFed is...

Publication date: 2016