Collectively Representing Semi-Structured Data from the Web

Authors

  • Bhavana Dalvi
  • William W. Cohen
  • James P. Callan
Abstract

In this paper, we propose a single low-dimensional representation of a large collection of table and hyponym data, and show that, with a small number of primitive operations, this representation can be used effectively for many purposes. Specifically, we consider queries such as set expansion and class prediction. We evaluate our methods on publicly available semi-structured datasets from the Web.

Similar resources

Georeferencing Semi-Structured Place-Based Web Resources Using Machine Learning

In recent years, the content shared on the web has grown significantly. A large part of this information is publicly available in the form of semi-structured data. Moreover, a significant amount of this information is related to place. Such information refers to a location on the earth but does not contain any explicit coordinates. In this research, we tried to georefer...

Mining Association Rules from Semi-Structured Data

Despite the growing popularity of semi-structured data such as Web documents, most knowledge discovery research has focused on databases containing well-structured data. In this paper, we try to find useful information from semi-structured data. We begin by representing semi-structured data with a prototype-based approach. We then detect the most typical common structure of semist...

Semi-Structured Data Extraction from Heterogeneous Sources

This paper concerns the extraction of semi-structured data from Web pages generated from multiple on-line services. This task is addressed by representing the schemas for semi-structured data and crafting generic wrappers based on the schemas. We introduce a hybrid representation method for schemas of semi-structured data, consisting of a concept hierarchy and a set of knowledge unit frames. A ...

Enhanced Database Migration Technique Using XML

XML has become a de facto standard for representing and exchanging data over the Web. It is designed to structure and carry data in a sensible way, thus helping programmers and web developers manipulate the data easily and efficiently. With the tremendous growth of XML data on the Internet, an efficient database system becomes necessary to maintain it. There are many Internet applications that pro...

FLEXIS – A FleXible Information System based on XML Data Model

Semi-structured data has gained a lot of prominence in the recent past, especially after the realization of the inadequacy of HTML for information representation on the Web. Semi-structured data is characterized by the lack of a rigid structure (schema) and by a structure that evolves over time. XML has been adopted as a practical model for representing semi-structured data and also as the standard for data exchange on...

Publication date: 2012