Search results for: web wrapper generation
Number of results: 567401
We present solutions based on crowdsourcing platforms to support large-scale production of accurate wrappers around data-intensive websites. Our approach is based on supervised wrapper induction algorithms which delegate the burden of generating the training data to the workers of a crowdsourcing platform. Workers are paid for answering simple membership queries chosen by the system. We present t...
A system-on-chip is an integrated circuit comprising numerous functional cores, which can be of various types. Testing such a diverse circuit is a very complex problem. Test access to digital cores is ensured by core wrapper architectures. The paper presents two novel contributions to core test wrappers: (1) the set of optimization techniques for parallel interface to provide faster test applica...
Information available on the Internet is made to be read by humans, not to be processed by machines. To automatically access this information, there is a need for intelligent services that convert HTML documents into more suitable formats like XML. This can be achieved through generation of Web wrappers, programs designed to process pages of a given Web site. To generate such Web wrappers, an e...
Web wrappers access databases hidden in the deep web by first interacting with web sites (e.g., filling in forms or clicking buttons) and then extracting the relevant data from the result pages thus unearthed. Though the (semi-)automatic induction and maintenance of such wrappers have been extensively studied, the efficient execution and optimization of wrappers have received far less attention. We demonstrat...
A wealth of data on the World Wide Web is hidden behind web form query interfaces and cannot be found through regular search engines. Querying across multiple such sources is a tedious and error-prone process; it involves manually filling in many related, but different, web forms. SemaForm automates this process by correlating web form labels to entries in a domain ontology through the use of a...
The web is a rich resource of structured data. There has been an increasing interest in using web structured data for many applications such as data integration, web search and question answering. In this paper, we present DEXTER, a system to find product sites on the web, and detect and extract product specifications from them. Since product specifications exist in multiple product sites, our ...
Vulnerabilities in distributed applications are being uncovered and exploited faster than software engineers can patch the security holes. All too often these weaknesses result from implicit assumptions made by an application about its inputs. One approach to defending against their exploitation is to interpose a filter between the input source and the application that verifies that the applica...
Web index recommendation systems are designed to help internet users with suggestions for finding relevant information. One way to develop such systems is using the multi-instance learning (MIL) approach: a generalization of the traditional supervised learning where each example is a labeled bag that is composed of unlabeled instances, and the task is to predict the labels of unseen bags. This ...
The goal of information extraction from the Web is to provide an integrated view on data from autonomous, heterogeneous information sources. The main problem with current wrapper/mediator approaches is that they rely on very different formalisms and tools for wrappers and mediators, thus leading to an impedance mismatch between the wrapper and mediator level. Additionally, most approaches nowadays a...
Literature search and delivery in the World Wide Web is a rapidly expanding market. Up to now the search is mostly cost-free. But in the future we expect the appearance of more and more providers charging for their services. The main problems are finding the right provider and extracting the information. In this paper we present a system for intelligent information search and extraction from mu...