Hash Kernels for Structured Data

Authors

  • Qinfeng Shi
  • James Petterson
  • Gideon Dror
  • John Langford
  • Alexander J. Smola
  • S. V. N. Vishwanathan
Abstract

We propose hashing to facilitate efficient kernels. This generalizes previous work using sampling, and we show a principled way to compute the kernel matrix for data streams and sparse feature spaces. Moreover, we give deviation bounds from the exact kernel matrix. This has applications to estimation on strings and graphs.
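The idea in the abstract can be illustrated with a minimal sketch: each feature index is hashed into one of a fixed number of buckets, the feature values that land in the same bucket are summed, and the kernel is the inner product of the resulting short vectors. The sketch below is not the authors' implementation. It assumes count features over character n-grams, uses MD5 only as a convenient stable hash, and all function names (`bucket`, `hashed_features`, `hash_kernel`, `exact_kernel`, `ngrams`) are hypothetical.

```python
import hashlib
from collections import Counter


def ngrams(s, n=3):
    """Character n-grams of a string (an assumed feature choice)."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def bucket(token, num_buckets):
    """Map a token to a bucket index with a stable hash.

    MD5 is an arbitrary stand-in; any well-mixing hash would do.
    """
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def hashed_features(tokens, num_buckets):
    """Sparse hashed vector: bucket index -> sum of counts hashed there.

    Tokens are consumed one at a time, so this also works on a stream.
    """
    vec = Counter()
    for tok in tokens:
        vec[bucket(tok, num_buckets)] += 1
    return vec


def hash_kernel(tokens_x, tokens_y, num_buckets=2**18):
    """Inner product of the two hashed feature vectors."""
    fx = hashed_features(tokens_x, num_buckets)
    fy = hashed_features(tokens_y, num_buckets)
    return sum(v * fy.get(j, 0) for j, v in fx.items())


def exact_kernel(tokens_x, tokens_y):
    """Inner product of the unhashed count vectors, for comparison."""
    cx, cy = Counter(tokens_x), Counter(tokens_y)
    return sum(v * cy.get(t, 0) for t, v in cx.items())


if __name__ == "__main__":
    a = ngrams("hash kernels for structured data")
    b = ngrams("kernels for semi-structured data")
    print(exact_kernel(a, b), hash_kernel(a, b), hash_kernel(a, b, num_buckets=16))
```

With nonnegative count features, collisions can only add nonnegative cross terms, so the hashed value in this sketch never falls below the exact one, and the gap tends to shrink as `num_buckets` grows. Memory is bounded by the number of buckets rather than by the size of the feature space.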

Related resources

Hash Kernels for Structured Data (submission for the special topic on Large Scale Learning)

Kernels for Structured Data

Learning from structured data is becoming increasingly important. However, most prior work on kernel methods has focused on learning from attribute-value data. Only recently have researchers started investigating kernels for structured data. This paper describes how kernel definitions can be simplified by identifying the structure of the data and how kernels can be defined on this structure. We...

Kernels for Semi-Structured Data

Semi-structured data such as XML and HTML are attracting considerable attention. It is important to develop data mining techniques that can handle semi-structured data. In this paper, we discuss applications of kernel methods to semi-structured data. We model semi-structured data as labeled ordered trees, and present kernels for classifying labeled ordered trees based on their ta...

Min-Hash Fingerprints for Graph Kernels: A Trade-off among Accuracy, Efficiency, and Compression

Graph databases that emerge from several relevant scenarios (e.g., social networks, the Web) require powerful data management algorithms and techniques. A fundamental operation in graph data management is computing the similarity between two graphs. However, due to the large scale and high dimensionality of real graph databases, computing graph similarity becomes a challenging problem in real s...

An Improved Hash Function Based on the Tillich-Zémor Hash Function

Using the idea behind the Tillich-Zémor hash function, we propose a new hash function. Our hash function is parallelizable, and its collision resistance is implied by a hardness assumption on a mathematical problem. It is also secure against the known attacks. It is the most secure variant of the Tillich-Zémor hash function to date.

Journal:
  • Journal of Machine Learning Research

Volume 10

Publication date: 2009