Guided exploration of data summaries
Authors
Abstract
Data summarization is the process of producing interpretable and representative subsets of an input dataset. It is usually performed following a one-shot process with the purpose of finding the best summary. A useful summary contains k individually uniform sets that are collectively diverse to be representative. Uniformity addresses interpretability and diversity addresses representativity. Finding such a summary is a difficult task when data is highly diverse and large. We examine the applicability of Exploratory Data Analysis (EDA) to data summarization and formalize Eda4Sum, the problem of guided exploration of data summaries that seeks to sequentially produce connected summaries with the goal of maximizing their cumulative utility. Eda4Sum generalizes one-shot summarization. We propose to solve it with one of two approaches: (i) Top1Sum, which chooses the most useful summary at each step; (ii) RLSum, which trains a policy with Deep Reinforcement Learning that rewards an agent for finding a diverse and new collection of uniform sets at each step. We compare these approaches with one-shot summarization and top-performing EDA solutions. We run extensive experiments on three large datasets. Our results demonstrate the superiority of our approaches for summarizing very large data, and the need to provide guidance to domain experts.
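The abstract does not define uniformity, diversity, or utility formally, so the scoring functions in the following sketch are illustrative assumptions rather than the paper's measures. It only illustrates the greedy idea the abstract attributes to Top1Sum: at each step, keep the candidate summary with the highest utility and add its score to a running total. All names (uniformity, diversity, utility, greedy_exploration) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def uniformity(group):
    """Toy uniformity (assumption): for each attribute, take the share of
    rows holding the most common value, then average over attributes."""
    if not group:
        return 0.0
    n_attrs = len(group[0])
    shares = []
    for a in range(n_attrs):
        counts = Counter(row[a] for row in group)
        shares.append(counts.most_common(1)[0][1] / len(group))
    return sum(shares) / n_attrs


def diversity(summary):
    """Toy diversity (assumption): mean pairwise Jaccard distance between
    the row sets of the groups in a summary."""
    pairs = list(combinations(summary, 2))
    if not pairs:
        return 0.0

    def jaccard_distance(g1, g2):
        s1, s2 = set(g1), set(g2)
        union = s1 | s2
        return 1.0 - len(s1 & s2) / len(union) if union else 0.0

    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def utility(summary):
    """Toy utility (assumption): mean uniformity of the groups plus diversity."""
    mean_uniformity = sum(uniformity(g) for g in summary) / len(summary)
    return mean_uniformity + diversity(summary)


def greedy_exploration(candidates_per_step):
    """At each step, keep the candidate summary with the highest utility
    and accumulate the utilities of the chosen summaries."""
    chosen, total = [], 0.0
    for candidates in candidates_per_step:
        best = max(candidates, key=utility)
        chosen.append(best)
        total += utility(best)
    return chosen, total


# Rows are tuples of categorical attribute values; a summary is a list of groups.
g1 = [("a", "x"), ("a", "y")]
g2 = [("b", "x"), ("b", "y")]
steps = [[[g1, g1], [g1, g2]]]  # one step with two candidate summaries
chosen, total = greedy_exploration(steps)
```

In this toy run the second candidate is selected, because its two groups are disjoint and therefore more diverse than a summary that repeats the same group.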
Similar resources
Photo-Guided Exploration of Volume Data Features
In this work, we pose the question of whether, by considering qualitative information such as a sample target image as input, one can produce a rendered image of scientific data that is similar to the target. The algorithm resulting from our research allows one to ask whether features like those in the target image exist in a given dataset. In that way, our method is one of ima...
Distortion-Guided Structure-Driven Interactive Exploration of High-Dimensional Data
Dimension reduction techniques are essential for feature selection and feature extraction of complex high-dimensional data. These techniques, which construct low-dimensional representations of data, are typically geometrically motivated, computationally efficient and approximately preserve certain structural properties of the data. However, they are often used as black box solutions in data expl...
RDF Digest: Ontology Exploration using Summaries
Ontology summarization aspires to produce an abridged version of the original ontology that highlights its most representative concepts. In this paper, we present RDF Digest, a novel platform that automatically produces and visualizes summaries of RDF/S Knowledge Bases (KBs). A summary is a valid RDFS document/graph that includes the most representative concepts of the schema, adapted to the co...
AutoSummENG and MeMoG in Evaluating Guided Summaries
Within this article, we present the application of the AutoSummENG and MeMoG methods within the TAC 2011 AESOP challenge. Both evaluation methods are based on n-gram graphs. The experiments indicate that both methods offer very high performance in different aspects of evaluation, without the need of deep analysis or preprocessing. The results also imply some interesting open problems and point ...
Image guided interactive volume visualization for confocal microscopy data exploration
3D microscopy visualization has the potential of playing a significant role in the study of 3D cellular structures in biomedical research. Such potential, however, has not been fully realized due to the difficulties of current visualization methods in coping with the unique nature of microscopy image volumes, such as low image contrast, noise and unknown transfer functions. In this paper, we pr...
Journal
Journal title: Proceedings of the VLDB Endowment
Year: 2022
ISSN: 2150-8097
DOI: https://doi.org/10.14778/3538598.3538603