Maxdiff kd-trees for data condensation

Authors

  • B. Lakshmi Narayan
  • C. A. Murthy
  • Sankar K. Pal
Abstract

Prototype selection based on conventional clustering algorithms gives a good representation but is extremely time-consuming on large data sets. kd-trees, on the other hand, are exceptionally efficient in time and space for large data sets, but fail to produce a reasonable representation in certain situations. We propose a new algorithm, with speed comparable to existing kd-tree based algorithms, that overcomes the representation problems at high condensation ratios. It uses the Maxdiff criterion to separate distant clusters in the initial stages, before splitting them any further, thus improving the representation. Because the splits are axis-parallel, more nodes are required to represent a data set that has no regions where the points are well separated. © 2005 Elsevier B.V. All rights reserved.
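The abstract does not define the Maxdiff criterion, so the following is only a minimal sketch under one assumed reading: each split is placed at the widest gap between consecutive sorted coordinate values along any axis, and the leaf with the widest gap is split first, so that well-separated clusters are pulled apart early. The function names `maxdiff_split` and `condense`, the greedy leaf ordering, and the use of leaf means as prototypes are all illustrative assumptions, not the authors' published procedure.

```python
import numpy as np


def maxdiff_split(pts):
    """Find the widest gap between consecutive sorted values on any axis.

    Returns (gap, axis, split_value); axis is None if no positive gap exists.
    Assumed interpretation of "Maxdiff", not taken from the paper itself.
    """
    best = (0.0, None, None)
    for axis in range(pts.shape[1]):
        vals = np.sort(pts[:, axis])
        gaps = np.diff(vals)
        if gaps.size == 0:
            continue  # a single point cannot be split
        i = int(np.argmax(gaps))
        if gaps[i] > best[0]:
            # Use the left edge of the gap as the split value:
            # points with coordinate <= split_value go to the left child.
            best = (float(gaps[i]), axis, float(vals[i]))
    return best


def condense(points, n_prototypes):
    """Greedily split leaves by widest gap, then return one mean per leaf."""
    leaves = [np.asarray(points, dtype=float)]
    while len(leaves) < n_prototypes:
        splits = [maxdiff_split(leaf) for leaf in leaves]
        k = max(range(len(leaves)), key=lambda j: splits[j][0])
        gap, axis, split_value = splits[k]
        if axis is None:
            break  # no leaf has two distinct points left
        leaf = leaves.pop(k)
        mask = leaf[:, axis] <= split_value
        leaves.append(leaf[mask])   # axis-parallel split: left child
        leaves.append(leaf[~mask])  # right child
    return np.array([leaf.mean(axis=0) for leaf in leaves])


if __name__ == "__main__":
    data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    print(condense(data, 2))
```

With two well-separated groups, as in the usage example, the first split falls in the empty region between them, which is the behaviour the abstract attributes to separating distant clusters before refining them.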

Similar resources

Randomly Projected KD-Trees with Distance Metric Learning for Image Retrieval

Efficient nearest neighbor (NN) search techniques for high-dimensional data are crucial to content-based image retrieval (CBIR). Traditional data structures (e.g., kd-tree) are usually efficient only for low-dimensional data, and often perform no better than a simple exhaustive linear search when the number of dimensions is large enough. Recently, approximate NN search techniques have been propo...


An improvement in the build algorithm for Kd-trees using mathematical mean

Querying forms a central part of dealing with data. Hence, it becomes imperative to have efficient data structures and query-search algorithms for data retrieval. Among the different types of data structures, in this paper, we have focused on 4 tree data structures. These are B, Kd, Range and Quad trees. We have made a comparative study of their build processes. Traditionally, the median of dat...


Accelerated and Extended Building of Implicit kd-Trees for Volume Ray Tracing

Implicit kd-trees have proven to greatly accelerate iso-surface rendering by ray tracing, such that interactive performance can already be achieved on a dual processor machine. However, the kd-tree could not be used for semi-transparent rendering and building the kd-tree was still a slow process that did not allow for rendering time-varying data sets. In this paper we extend the kd-tree to prov...


High-dimensional Proximity Joins

Many emerging data mining applications require a proximity (similarity) join between points in a high-dimensional domain. We present a new algorithm that utilizes a new data structure, called the $\epsilon$-kd tree, for fast spatial proximity joins on high-dimensional points. This data structure reduces the number of neighboring leaf nodes that are considered for the join test, as well as the traversal c...


Kinetic Medians and kd-Trees

We propose algorithms for maintaining two variants of kd-trees of a set of moving points in the plane. A pseudo kd-tree allows the number of points stored in the two children to differ. An overlapping kd-tree allows the bounding boxes of two children to overlap. We show that both of them support range search operations in $O(n^{1/2+\epsilon})$ time, where $\epsilon$ only depends on the approximation precision. When the ...



Journal:
  • Pattern Recognition Letters

Volume 27

Publication date: 2006