Search results for: web wrapper generation

Number of results: 567401

1998
Chun-Nan Hsu

Integrating a large number of Web information sources may significantly increase the utility of the World Wide Web. A promising solution to the integration is through the use of a Web Information mediator that provides seamless, transparent access for the clients. Information mediators need wrappers to access a Web source as a structured database, but building wrappers by hand is impractical. P...

1999
Arnaud Sahuguet Fabien Azavant

In this paper, we present the W4F toolkit for the generation of wrappers for Web sources. W4F consists of a retrieval language to identify Web sources, a declarative extraction language (the HTML Extraction Language) to express robust extraction rules and a mapping interface to export the extracted information into some user-defined data structures. To assist the user and make the creation of wra...

2004
Sven Meyer Benno Stein

The automatic processing of search results that stem from Web-based search interfaces has come into focus, and it will remain important (as long as XML is not a universally applied technology). The reasons for this are twofold: (1) The need for value-added services such as filtering or graphical preparation of search results will increase. (2) The manual creation of tailored parsers for the inf...

2002
Michael Christoffel Bethina Schmitt Jürgen Schneider

The success of the Internet as a medium for the supply and commerce of various kinds of goods and services leads to a fast growing number of autonomous and heterogeneous providers that offer and sell goods and services electronically. The new market structures have already entered all kinds of markets. Approaches for market infrastructures usually try to cope with the heterogeneity of the provi...

2015
Ákos Hajnal Tamás Kifor Gergely Lukácsy

More and more systems provide data through web service interfaces and these data have to be integrated with the legacy relational databases of the enterprise. The integration is usually done with enterprise information integration systems which provide a uniform query language to all information sources, therefore the XML data sources of Web services having a procedural access interface have to...

2006
Christian Schindler Pranjal Arya Andreas Rath Wolfgang Slany

The htmlButler project aims at enhancing the usability of visual wrapper technology while preserving versatility. htmlButler will allow an untrained user who has only the most basic web knowledge to visually specify simple but useful wrappers, and a more tech-savvy user to specify more complex wrappers, visually or otherwise. htmlButler was started 2005/2 and is based on visual wrappi...

2008
David Camacho Maria D. R-Moreno David F. Barrero Rajendra Akerkar

In this paper, we propose an approach to extract information from HTML pages and to add semantic (XML) tags to them. Wrapping is an essential technique used to automatically extract information from Web sources. This paper describes both a general approach based on rules, which can be used to automatically generate wrappers, and an assistant generator wrapper called WebMantic. We also provide ...

2013
Oliver Jundt Maurice van Keulen

Web information extraction typically relies on a wrapper, i.e., program code or a configuration that specifies how to extract some information from web pages at a specific website. Manually creating and maintaining wrappers is a cumbersome and error-prone task. It may even be prohibitive as some applications require information extraction from previously unseen websites. This paper targets auto...

2005

In this paper, we describe techniques for learning wrappers efficiently using very few user-supplied labels (typically, 1 or 2 labels, all within a single page). This is an improvement over previous work, which requires multiple labeled examples on multiple pages. In effect, it brings the power of the wrapper down to the level of the end-user, who can teach, by only a few demonstrations, the lab...

2006
Juan Raposo Manuel Álvarez José Losada Alberto Pan

A substantial subset of the web data follows some kind of underlying structure. In order to let software programs gain full benefit from these “semistructured” web sources, wrapper programs are built to provide a “machine-readable” view over them. A significant problem with wrappers is that, since web sources are autonomous, they may experience changes that invalidate the current wrapper, so aut...
