Estimating the Maximum Expected Value through Gaussian Approximation

Authors

  • Carlo D’Eramo
  • Alessandro Nuara
  • Marcello Restelli
Abstract

This paper is about the estimation of the maximum expected value of a set of independent random variables. The performance of several learning algorithms (e.g., Q-learning) is affected by the accuracy of such estimation. Unfortunately, no unbiased estimator exists. The usual approach of taking the maximum of the sample means leads to large overestimates that may significantly harm the performance of the learning algorithm. Recent works have shown that the cross validation estimator, which is negatively biased, outperforms the maximum estimator in many sequential decision-making scenarios. On the other hand, the relative performance of the two estimators is highly problem-dependent. In this paper, we propose a new estimator for the maximum expected value, based on a weighted average of the sample means, where the weights are computed using Gaussian approximations for the distributions of the sample means. We compare the proposed estimator with the other state-of-the-art methods both theoretically, by deriving upper bounds to the bias and the variance of the estimator, and empirically, by testing the performance on different sequential learning problems.
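
The weighted-average idea in the abstract can be illustrated with a minimal sketch. The paper's own formula is not reproduced on this page, so the details below are assumptions: each sample mean is approximated by a Gaussian with the sample mean as its centre and the standard error as its spread, and the weight of a variable is taken to be the probability, under those Gaussians, that it has the largest mean. The function names, the grid-based integration, and the ±8 standard-error integration range are hypothetical choices made for illustration, not the authors' implementation.

```python
import random
from statistics import NormalDist, mean, stdev


def maximum_estimator(samples):
    """Maximum of the sample means (the usual, positively biased approach)."""
    return max(mean(s) for s in samples)


def weighted_estimator(samples, grid_points=2001):
    """Weighted average of the sample means (illustrative sketch).

    Assumption: weight i is the probability that variable i has the largest
    mean when each sample mean is modelled as N(mean_i, stderr_i^2).
    Requires at least two observations per variable and nonzero variance.
    """
    means = [mean(s) for s in samples]
    # Standard error of each sample mean; the floor avoids a zero sigma.
    errs = [max(stdev(s) / len(s) ** 0.5, 1e-12) for s in samples]
    dists = [NormalDist(m, e) for m, e in zip(means, errs)]

    # Integration grid covering every approximating Gaussian (assumed range).
    lo = min(m - 8 * e for m, e in zip(means, errs))
    hi = max(m + 8 * e for m, e in zip(means, errs))
    step = (hi - lo) / (grid_points - 1)
    xs = [lo + k * step for k in range(grid_points)]

    weights = []
    for i, d_i in enumerate(dists):
        total = 0.0
        for x in xs:
            # density of variable i at x times P(every other variable <= x)
            p = d_i.pdf(x)
            for j, d_j in enumerate(dists):
                if j != i:
                    p *= d_j.cdf(x)
            total += p * step
        weights.append(total)

    # Renormalise to absorb the discretisation error of the grid.
    norm = sum(weights)
    weights = [w / norm for w in weights]
    return sum(w * m for w, m in zip(weights, means)), weights


if __name__ == "__main__":
    random.seed(0)
    # Three variables with the same true mean, so the true maximum is 0.
    data = [[random.gauss(0.0, 1.0) for _ in range(50)] for _ in range(3)]
    estimate, w = weighted_estimator(data)
    print("maximum estimator :", maximum_estimator(data))
    print("weighted estimator:", estimate)
    print("weights           :", w)
```

Because the weights are non-negative and sum to one, this sketch always returns a value between the smallest and the largest sample mean, so it can never exceed the maximum estimator on the same data.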

Similar resources

Estimating Maximum Expected Value through Gaussian Approximation

Theorem 2. If we compare the expected value of DE reported in Equation (4) with the value of the estimator WE in Equation (3), we can notice strong similarities. The main difference is that in DE the sample mean of variable Xi and its probability of being the maximum are computed w.r.t. two independent sets of samples, while in WE these two quantities are positively correlated. It follows that W...

Approximations to the Loglikelihood Function in the Nonlinear Mixed Effects Model

Nonlinear mixed effects models have received a great deal of attention in the statistical literature in recent years because of the flexibility they offer in handling unbalanced repeated measures data that arise in different areas of investigation, such as pharmacokinetics and economics. Several different methods for estimating the parameters in nonlinear mixed effects models have been proposed....

Width invariant approximation of fuzzy numbers

In this paper, we consider the width invariant trapezoidal and triangular approximations of fuzzy numbers. The presented methods avoid the effortful computation of the Karush-Kuhn-Tucker theorem. Some properties of the new approximation methods are presented and the applicability of the methods is illustrated by examples. In addition, we show that the proposed approximations of fuzzy numbers preserv...

Expected Duration of Dynamic Markov PERT Networks

In this paper, we apply stochastic dynamic programming to approximate the mean project completion time in dynamic Markov PERT networks. It is assumed that the activity durations are independent random variables with exponential distributions, but some social and economic problems influence the mean of activity durations. It is also assumed that the social problems evolve in ac...

Density Estimation Through Convex Combinations of Densities: Approximation and Estimation

We consider the problem of estimating a density function from a sequence of independent and identically distributed observations x_i taking value in R^d. The estimation procedure constructs a convex mixture of 'basis' densities and estimates the parameters using the maximum likelihood method. Viewing the error as a combination of two terms, the approximation error measuring the adequacy of the m...

Publication date: 2016