Estimating the Cardinality of Conjunctive Queries over RDF Data Using Graph Summarisation
Authors: Giorgio Stefanoni, Boris Motik, Egor V. Kostylev
Abstract
Estimating the cardinality (i.e., the number of answers) of conjunctive queries is particularly difficult in RDF systems: queries over RDF data are navigational and thus tend to involve many joins. We present a new, principled cardinality estimation technique based on graph summarisation. We interpret a summary of an RDF graph using a possible world semantics and formalise the estimation problem as computing the expected cardinality over all RDF graphs represented by the summary, and we present a closed-form formula for computing the expectation of arbitrary queries. We also discuss approaches to RDF graph summarisation. Finally, we show empirically that our cardinality technique is more accurate and more consistent, often by orders of magnitude, than the state of the art.
ACM Reference Format: Giorgio Stefanoni, Boris Motik, and Egor V. Kostylev. 2018. Estimating the Cardinality of Conjunctive Queries over RDF Data Using Graph Summarisation. In Proceedings of The 2018 Web Conference (WWW 2018). ACM, New York, NY, USA, 17 pages. https://doi.org/10.1145/3178876.3186003
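The possible-world reading described in the abstract can be illustrated with a brute-force sketch. This is not the paper's closed-form formula, which the abstract does not reproduce. It rests on labelled assumptions: a summary merges resources into buckets, each summary edge records how many triples it stands for, and every graph consistent with those counts is treated as equally likely. The names `buckets`, `summary_edges`, `possible_worlds`, `count_answers`, and `expected_cardinality` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical toy summary (assumption): bucket name -> resources merged into it.
buckets = {"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1"]}
# Summary edge (subject bucket, predicate, object bucket) -> number of triples it represents.
summary_edges = {("A", "p", "B"): 2, ("B", "q", "C"): 1}


def possible_worlds(buckets, summary_edges):
    """Yield every graph (a frozenset of triples) consistent with the summary."""
    choices = []
    for (sb, pred, ob), weight in summary_edges.items():
        candidates = [(s, pred, o) for s in buckets[sb] for o in buckets[ob]]
        # Each summary edge expands to exactly `weight` distinct candidate triples.
        choices.append(list(combinations(candidates, weight)))
    for picked in product(*choices):
        yield frozenset(t for group in picked for t in group)


def extend(binding, pattern, triple):
    """Return the binding extended so that pattern matches triple, or None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):          # variable: bind it or check consistency
            if result.setdefault(term, value) != value:
                return None
        elif term != value:               # constant: must match exactly
            return None
    return result


def count_answers(graph, patterns):
    """Count the variable bindings that satisfy all triple patterns (a conjunctive query)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            b for old in bindings for t in graph
            if (b := extend(old, pattern, t)) is not None
        ]
    return len(bindings)


def expected_cardinality(buckets, summary_edges, patterns):
    """Average the query cardinality over all possible worlds (uniform assumption)."""
    counts = [count_answers(w, patterns) for w in possible_worlds(buckets, summary_edges)]
    return Fraction(sum(counts), len(counts))


query = [("?x", "p", "?y"), ("?y", "q", "?z")]
print(expected_cardinality(buckets, summary_edges, query))
```

For this toy summary there are 6 × 2 = 12 possible worlds, and the two-pattern join has an expected cardinality of 1. Enumeration grows exponentially with the summary size, which is why a closed-form expression of the kind the abstract mentions is needed in practice.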
Related resources
Distributed Processing of Generalized Graph-Pattern Queries in SPARQL 1.1
We propose an efficient and scalable architecture for processing generalized graph-pattern queries as they are specified by the current W3C recommendation of the SPARQL 1.1 “Query Language” component. Specifically, the class of queries we consider consists of sets of SPARQL triple patterns with labeled property paths. From a relational perspective, this class resolves to conjunctive queries of ...
Graph summaries for optimizing graph pattern queries on RDF databases
The adoption of the Resource Description Framework (RDF) as a metadata and semantic data representation standard is spurring the development of high-level mechanisms for storing and querying RDF data. A common approach for managing and querying RDF data is to build on Relational/Object Relational Database systems and translate queries in an RDF query language into queries in the native language...
Concept Lattices of RDF Graphs
The concept lattice of an RDF graph is defined. The intents are described by graph patterns rather than sets of attributes, a view that is supported by the fact that RDF data is essentially a graph. A simple formalization by triple graphs defines pattern closures as connected components of graph products. The patterns correspond to conjunctive queries, generalization of properties is supported....
Publish/Subscribe with RDF Data over Large Structured Overlay Networks
We study the problem of evaluating RDF queries over structured overlay networks. We consider the publish/subscribe scenario where nodes subscribe with long-standing queries and receive notifications whenever triples matching their queries are inserted in the network. In this paper we focus on conjunctive multi-predicate queries. We demonstrate that these queries are useful in various modern app...
Query Driven Hypothesis Generation for Answering Queries over NLP Graphs
It has become common to use RDF to store the results of Natural Language Processing (NLP) as a graph of the entities mentioned in the text with the relationships mentioned in the text as links between them. These NLP graphs can be measured with Precision and Recall against a ground truth graph representing what the documents actually say. When asking conjunctive queries on NLP graphs, the Recal...
Journal: CoRR
Volume: abs/1801.09619
Pages: -
Publication date: 2017