Search results for: web wrapper generation
Number of results: 567401
Digital libraries are increasingly available on the web. However, retrieving information from these libraries is not easy because the sources are heterogeneous and distributed. Thus, we propose the use of virtual integration with a mediator-wrapper architecture. This architecture allows access to relational sources, XML documents and text files. The mediator schema is based on XML and XQuery as query ...
The deep Web presents a pressing need for integrating large numbers of dynamically evolving data sources. To be more automatic yet accurate in building an integration system, we observe two problems: First, across sequential tasks in integration, how can a wrapper (as an extraction task) consider the peer sources to facilitate the subsequent matching task? Second, across parallel sources, how c...
The process of extracting comparative, heterogeneous web content data, both derived and historical, from related web pages is still in its infancy and remains underdeveloped. Discovering potentially useful and previously unknown information or knowledge from web contents such as “list all articles on ‘Sequential Pattern Mining’ written between 2007 and 2011 including title, authors, volume, abstract, ...
A crucial challenge for information extraction from the WWW is to generate wrappers, i.e., information extraction patterns or rules that apply to numerous Web sites with great diversity in both format and content. Generating wrappers manually is tedious, time-consuming and error-prone. Recent research has successfully adapted machine learning technology to generate wrappers for semi-struct...
Several techniques have been recently proposed to automatically generate web wrappers, i.e., programs that extract data from HTML pages, and transform them into a more structured format, typically in XML. These techniques automatically induce a wrapper from a set of sample pages that share a common HTML template. An open issue, however, is how to collect suitable classes of sample pages to feed...
This paper proposes the use of ontologies representing domain and linguistic knowledge for guiding natural language (NL) communication about Web contents. This proposal deals with the problem of accessing and processing the Web data required to answer user queries. Concepts and communication acts are represented in the conceptual ontology (CO). Domain-restricted grammars and lexicons are obta...
The Web has become a major conduit to information repositories of all kinds. Today, more than 80% of information published on the Web is generated by underlying databases and this proportion keeps increasing. In some cases, database access is only granted through a Web gateway using forms as a query language and HTML as a display vehicle. In order to permit inter-operation (between Web sources ...
The rapid growth of the Internet and support for interoperability protocols has increased the number of Web accessible sources, WebSources. Current wrapper mediator architectures need to be extended with a Wrapper Cost Model (WCM) for WebSources that can estimate the response time (delays) to access sources as well as other relevant statistics. In this paper, we present a Web Prediction Tool (W...
In this paper, we discuss an architecture for integrating WWW applications that offer information and services in the same domain. At the center of this architecture exists a mediator, whose responsibilities are to interact with the user and to effectively exchange information with the underlying applications in order to accomplish the user’s task. The integration and interoperation of the exis...
The Eighth International Workshop on Knowledge Representation Meets Databases (KRDB) was held at the Pontificia Università Urbaniana, in Rome, right after VLDB 2001. KRDB was initiated in 1994 to provide an opportunity for researchers and practitioners from the two areas to exchange ideas and results. This year's focus was on Modeling, Querying and Managing Semistructured Data. The one-day progr...