Discovering the topics of a data source: a statistical approach
Abstract
In this paper, we present a preliminary approach for automatically discovering the topics of a structured data source with respect to a reference ontology. Our technique relies on a signature, i.e., a weighted graph that summarizes the content of a source. Graph-based approaches have already been used in the literature for similar purposes. In these proposals, the weights are typically assigned using traditional information-theoretic quantities such as entropy and mutual information. Here, we propose a novel data-driven technique based on composite likelihood to estimate the weights and other main features of the graphs, making the resulting approach less sensitive to overfitting. By comparing signatures, we can discover the topics of a target data source with respect to a reference ontology. This task is performed by a matching algorithm that retrieves the elements common to both graphs. To illustrate our approach, we discuss a preliminary evaluation in the form of a running example.
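As a rough illustration of the signature comparison described in the abstract, the following Python sketch represents each signature as a weighted undirected graph and retrieves the nodes and edges common to two signatures. It is not the authors' matching algorithm: the names `edge`, `common_elements` and `overlap_score`, the attribute labels, and the weights are all hypothetical, and the weights are hand-picked placeholders rather than composite likelihood estimates.

```python
from typing import Dict, FrozenSet, Set, Tuple

# A signature is modelled here as an undirected weighted graph:
# each edge is a frozenset of two node labels mapped to a weight.
Edge = FrozenSet[str]
Signature = Dict[Edge, float]


def edge(a: str, b: str) -> Edge:
    """Build an undirected edge key from two node labels."""
    return frozenset((a, b))


def common_elements(source: Signature, reference: Signature) -> Tuple[Set[str], Signature]:
    """Return the nodes and edges present in both signatures.

    Assumption: a shared edge keeps the smaller of its two weights.
    """
    shared_edges = {
        e: min(source[e], reference[e])
        for e in source.keys() & reference.keys()
    }
    source_nodes = {n for e in source for n in e}
    reference_nodes = {n for e in reference for n in e}
    return source_nodes & reference_nodes, shared_edges


def overlap_score(source: Signature, reference: Signature) -> float:
    """Fraction of the source's total edge weight covered by shared edges."""
    _, shared_edges = common_elements(source, reference)
    total = sum(source.values())
    return sum(shared_edges.values()) / total if total else 0.0


# Placeholder signatures; the weights are illustrative only.
source_sig = {
    edge("author", "title"): 0.8,
    edge("title", "year"): 0.5,
    edge("year", "price"): 0.2,
}
reference_sig = {
    edge("author", "title"): 0.6,
    edge("title", "year"): 0.7,
    edge("title", "venue"): 0.4,
}

shared_nodes, shared_edges = common_elements(source_sig, reference_sig)
score = overlap_score(source_sig, reference_sig)
```

A higher `overlap_score` would suggest that the source is closer to the topic described by the reference signature. Taking the minimum weight on shared edges is only one possible design choice for this sketch.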
Similar resources
A review of text mining approaches and their function in discovering and extracting a topic
Background and aim: This study examines four text mining methods, focusing on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature of text mining and topic modeling. Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...
Classification and Comparison of Methods for Discovering Coverage Loss Areas in Wireless Sensor Networks
In recent years, wireless sensor network data has been considered an ideal source, in terms of speed, accuracy and cost, for studying the Earth's surface. One of the most important challenges in this area is signaling network coverage and finding holes...
A New Approach to Introducing Minimum Learning Requirements in Internal and Surgical Emergencies during General Medical Education
Introduction: In order to align medical students’ education with their professional needs, the educational managers at Isfahan Medical University decided to design a specific course for teaching Emergency Medicine. This study was done to determine the viewpoints of experts concerning minimum educational needs in emergency departments during general medical education. Methods: This cross-secti...
Designing Syllabus for Islamic Education Courses with a Health-Oriented Approach in Isfahan University of Medical Sciences
Introduction: Providing Islamic education courses with a health-oriented approach for medical students can be a source of dynamism and efficiency of these courses. The purpose of this study was to investigate the feasibility of presenting Islamic education courses with a health-oriented approach and to compile new topics for these courses. Methods: This research was conducted with a development...
Discovering Emerging Topics from WWW
Discovering emerging topics from the WWW has been attracting the attention of business professionals, especially marketing researchers. For this purpose, the WWW can be a valuable source of information because it reflects the dynamics of human society. In this paper we aim at revealing the structure of the WWW by using KeyGraph, a visualization method of hidden structure behind data, for understanding emerging...
Publication date: 2014