Unbiased Estimators for Entropy and Class Number

Authors

  • STEPHEN MONTGOMERY-SMITH
  • THOMAS SCHÜRMANN
Abstract

We introduce unbiased estimators for the Shannon entropy and the class number, in the situation that we are able to take sequences of independent samples of arbitrary length.

Introduction. This paper supposes that we may pick a sequence of arbitrary length of independent samples $w_1, w_2, \ldots$ from an infinite population. Each sample belongs to one of $M$ classes $C_1, C_2, \ldots, C_M$, and the probability that a sample belongs to class $C_i$ is $p_i$. So these probabilities satisfy the constraints $0 \le p_i \le 1$ and $\sum_{i=1}^{M} p_i = 1$. The goal of this paper is to present methods to estimate the Shannon entropy $H = -\sum_{i=1}^{M} p_i \log(p_i)$ (see [8]). An obvious method is to take a sample of size $n$ and compute the estimators $\hat{p}_i = k_i/n$, where $k_i$ is the number of samples from the class $C_i$. However, this is known to systematically underestimate the entropy, and it can be significantly biased [2][4][5][6]. Recently there have been more advanced estimators for the entropy which have smaller bias [2][3][4][5][6][7]. In this paper we introduce new entropy estimators that have bias identically zero. We also introduce an unbiased estimator for the class number $M$, a problem of interest to ecologists (see for example the review article [1]). The disadvantage of all our methods is that there is no a priori estimate of the sample size. For this reason, we postpone rigorous analysis of variance and other measures of confidence until it becomes clear that these estimators are of more than theoretical value.

We will use the following power series. Define the harmonic number by $h_n = \sum_{k=1}^{n} 1/k$, $h_0 = 0$. Then for $|x| < 1$, $\frac{1}{(1-x)^2} = \sum^{\infty}$ ...
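The naive plug-in approach mentioned in the introduction, and its tendency to underestimate the entropy, can be illustrated with a small simulation. This is only a sketch of the biased baseline, not of the paper's unbiased estimators, which are not given in this excerpt. The function names, the uniform four-class distribution, and the sample size are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def harmonic(n):
    """Harmonic number h_n = sum_{k=1}^{n} 1/k, with h_0 = 0."""
    return sum(1.0 / k for k in range(1, n + 1))


def plugin_entropy(samples):
    """Naive plug-in estimate -sum p_hat_i * log(p_hat_i), with p_hat_i = k_i / n."""
    n = len(samples)
    counts = Counter(samples)  # k_i for each class observed in the sample
    return -sum((k / n) * math.log(k / n) for k in counts.values())


def mean_plugin_entropy(probs, n, trials, seed=0):
    """Average the plug-in estimate over many independent samples of size n."""
    rng = random.Random(seed)
    classes = range(len(probs))
    total = 0.0
    for _ in range(trials):
        samples = rng.choices(classes, weights=probs, k=n)
        total += plugin_entropy(samples)
    return total / trials


# Hypothetical example: M = 4 equally likely classes, samples of size n = 10.
probs = [0.25, 0.25, 0.25, 0.25]
true_h = -sum(p * math.log(p) for p in probs)
avg = mean_plugin_entropy(probs, n=10, trials=2000)
print(f"true entropy H        = {true_h:.4f}")
print(f"mean plug-in estimate = {avg:.4f}")
```

For this uniform example every plug-in estimate is at most $\log 4$, and a sample of size 10 can never be perfectly balanced across four classes, so the average necessarily falls below the true entropy.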

Similar resources

A Study on the Class of Chain Ratio–type Estimators

This paper considers the problem of estimating the population mean $\bar{Y}$ of the study variate $Y$ using information on different parameters such as population mean $(\bar{X})$, coefficient of variation $(C_x)$, kurtosis $\beta_{2(x)}$, standard deviation $(S_x)$ of the auxiliary variate $x$ and on the correlation coefficient, $\rho$, between the study variate $Y$ and the auxiliary variate $...

Application of adaptive sampling in fishery part 2: Truncated adaptive cluster sampling designs

Researchers quite often encounter very large networks in the initial samples, so that they cannot finish the sampling procedure. Two solutions have been proposed and used by marine biologists, which we discuss in this article: i) Adaptive cluster sampling based on order statistics with a stopping rule, ii) Restricted adaptive cluster sampling. Until...

Admissibility Estimation of Burr Type XI Distribution Under Entropy Loss Function Based on Record Values

The aim of this paper is to study the estimation of the parameter of the Burr Type XI distribution on the basis of lower record values. First, the minimum variance unbiased estimator and maximum likelihood estimator are obtained. Then the Bayes and empirical Bayes estimators of the unknown parameter are derived under the entropy loss function. Finally, the admissibility and inadmissibility of a class of in...

Estimation of Lower Bounded Scale Parameter of Rescaled F-distribution under Entropy Loss Function

We consider the problem of estimating the scale parameter $\beta$ of a rescaled F-distribution when $\beta$ has a lower bounded constraint of the form $\beta \ge a$, under the entropy loss function. An admissible minimax estimator of the scale parameter $\beta$, which is the pointwise limit of a sequence of Bayes estimators, is given. Also in the class of truncated linear estimators, the admissible estim...

Publication date: 2007