Accounting for Boundary Effects in Nearest Neighbor Searching

Authors

  • Sunil Arya
  • David M. Mount
  • Onuttom Narayan
Abstract

Given n data points in d-dimensional space, nearest neighbor searching involves determining the nearest of these data points to a given query point. Most average-case analyses of nearest neighbor searching algorithms are made under the simplifying assumption that d is fixed and that n is so large relative to d that boundary effects can be ignored. This means that for any query point the statistical distribution of the data points surrounding it is independent of the location of the query point. However, in many applications of nearest neighbor searching (such as data compression by vector quantization) this assumption is not met, since the number of data points n grows roughly as 2^d. Largely for this reason, the actual performances of many nearest neighbor algorithms tend to be much better than their theoretical analyses would suggest. We present evidence of why this is the case. We provide an accurate analysis of the number of cells visited in nearest neighbor searching by the bucketing and k-d tree algorithms. We assume m^d points uniformly distributed in dimension d, where m is a fixed integer ≥ 2. Further, we assume that distances are measured in the L∞ metric. Our analysis is tight in the limit as d approaches infinity. Empirical evidence is presented showing that the analysis applies even in low dimensions.
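The setting in the abstract can be explored with a small simulation. The sketch below is not the authors' analysis or code: it assumes the unit hypercube as the domain, a bucketing grid with m cells per side (so m^d cells for m^d points), and it takes "cells visited" to mean the grid cells that overlap the L∞ ball around the query whose radius is the nearest neighbor distance. The function names `count_cells` and `simulate` are hypothetical. Comparing the count with and without clipping to the hypercube gives a rough sense of how much the boundary reduces the work.

```python
import math
import random


def linf(p, q):
    """L-infinity (Chebyshev) distance between two points."""
    return max(abs(a - b) for a, b in zip(p, q))


def count_cells(query, radius, m):
    """Count grid cells of side 1/m that meet the L-infinity ball.

    Returns (clipped, unclipped). The ball is an axis-aligned cube, so
    each count is a product of per-dimension interval lengths. The
    clipped count keeps only cells inside [0, 1]^d (assumed domain);
    the unclipped count pretends the grid extends forever.
    """
    clipped = 1
    unclipped = 1
    for x in query:
        lo = math.floor((x - radius) * m)
        hi = math.floor((x + radius) * m)
        unclipped *= hi - lo + 1
        clipped *= min(hi, m - 1) - max(lo, 0) + 1
    return clipped, unclipped


def simulate(d, m, trials, seed=0):
    """Average clipped and unclipped cell counts over random queries.

    Uses m**d uniform data points and brute-force nearest neighbor
    search, so it is only practical for small d and m.
    """
    rng = random.Random(seed)
    points = [tuple(rng.random() for _ in range(d)) for _ in range(m ** d)]
    clipped_sum = 0
    unclipped_sum = 0
    for _ in range(trials):
        q = tuple(rng.random() for _ in range(d))
        r = min(linf(q, p) for p in points)  # nearest neighbor distance
        c, u = count_cells(q, r, m)
        clipped_sum += c
        unclipped_sum += u
    return clipped_sum / trials, unclipped_sum / trials


if __name__ == "__main__":
    for d in (2, 4, 6, 8):
        c, u = simulate(d=d, m=2, trials=200)
        print(f"d={d}: clipped={c:.2f} unclipped={u:.2f}")
```

With m held fixed, the gap between the two averages is the quantity of interest: the unclipped count corresponds to an analysis that ignores the boundary, while the clipped count reflects the cells that actually exist.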

Similar resources

Dynamic Effects of Information Disclosure on Investment Efficiency

This paper studies how information disclosure affects investment efficiency and investor welfare in a dynamic setting in which a firm makes sequential investments to adjust its capital stock over time. We show that the effects of accounting disclosures on investment efficiency and investor welfare crucially depend on whether such disclosures convey information about the firm's future capital stock or abo...

The design and effects of control systems: tests of direct- and indirect-effects models

Two models are developed on the effects of a control system that include participative standard setting, standard-based incentives, and standard tightness. The direct model proposes that the control system directly affects performance, whereas the indirect model proposes that the effects of the control system on performance are indirect through the mediating influence of job-related stress. Hypothes...

Econometric Accounting of the Australian Corporate Tax Rates: a Firm Panel Example

The paper presents an econometric accounting of the effective corporate tax rate in Australia for the years 1993 to 1996. The estimation is a panel of Australian firms that uses a specially gathered financial data base. Using fixed and random effects, the model specifies that the statutory tax rate is estimated as the constant term of the model. An ability to find an estimated statutory tax rate that ...

Edge Detection Based On Nearest Neighbor Linear Cellular Automata Rules and Fuzzy Rule Based System

Edge detection is an important task for sharpening the boundary of images to detect the region of interest. This paper applies linear cellular automata rules and a Mamdani fuzzy inference model for edge detection in both monochromatic and RGB images. In the uniform cellular automata a transition matrix has been developed for edge detection. The results have been compared to the ...


Publication date: 1995