Catoni-style confidence sequences for heavy-tailed mean estimation


Abstract

A confidence sequence (CS) is a sequence of confidence intervals that is valid at arbitrary data-dependent stopping times. These are useful in applications like A/B testing, multi-armed bandits, off-policy evaluation, election auditing, etc. We present three approaches to constructing a confidence sequence for the population mean, under the minimal assumption that only an upper bound $\sigma^2$ on the variance is known. While previous works rely on light-tail assumptions like boundedness or subGaussianity (under which all moments of a distribution exist), the confidence sequences in our work are able to handle data from a wide range of heavy-tailed distributions. The best among our three methods, the Catoni-style confidence sequence, performs remarkably well in practice, essentially matching the state-of-the-art methods for $\sigma^2$-subGaussian data, and provably attains the $\sqrt{\log \log t/t}$ lower bound due to the law of the iterated logarithm. Our findings have important implications for sequential experimentation with unbounded observations, since the $\sigma^2$-bounded-variance assumption is more realistic and easier to verify than $\sigma^2$-subGaussianity (which implies the former). We also extend our methods to data with infinite variance, but having a $p$-th central moment ($1<p<2$).
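To make the idea concrete, here is a minimal Python sketch of a Catoni-style confidence sequence under a known variance bound $\sigma^2$. It is an illustration under stated assumptions, not the paper's reference implementation: the function names (`catoni_phi`, `solve_decreasing`, `catoni_cs`) and the deterministic tuning sequence $\lambda_t$ are hypothetical choices made here. The sketch uses Catoni's influence function $\phi$ and, at each time $t$, keeps every candidate mean $m$ satisfying $\left|\sum_{i \le t} \phi(\lambda_i (X_i - m))\right| \le \log(2/\alpha) + \sum_{i \le t} \lambda_i^2 \sigma^2 / 2$. Because the sum is decreasing in $m$, that set is an interval whose endpoints can be found by bisection.

```python
import math
import random


def catoni_phi(x):
    """Catoni's influence function: log(1 + x + x^2/2) for x >= 0, extended as an odd function."""
    if x >= 0:
        return math.log1p(x + 0.5 * x * x)
    return -math.log1p(-x + 0.5 * x * x)


def solve_decreasing(f, target, lo, hi, iters=80):
    """Bisection for f(m) = target, where f is continuous and decreasing in m."""
    step = 1.0
    while f(lo) < target:  # widen to the left until f(lo) >= target
        lo -= step
        step *= 2.0
    step = 1.0
    while f(hi) > target:  # widen to the right until f(hi) <= target
        hi += step
        step *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def catoni_cs(xs, sigma2, alpha=0.05):
    """Return one (lower, upper) interval per sample size t = 1, ..., len(xs)."""
    log_term = math.log(2.0 / alpha)
    seen, lams, intervals = [], [], []
    budget = 0.0  # running sum of lam_i^2 * sigma2 / 2
    for t, x in enumerate(xs, start=1):
        # Assumed tuning sequence (hypothetical choice): it is deterministic,
        # so it does not depend on the current or future observations.
        lam = math.sqrt(2.0 * log_term / (sigma2 * t * math.log(t + 1.0)))
        seen.append(x)
        lams.append(lam)
        budget += 0.5 * lam * lam * sigma2

        def f(m):
            return sum(catoni_phi(l * (xi - m)) for l, xi in zip(lams, seen))

        radius = log_term + budget
        a, b = min(seen), max(seen)
        lower = solve_decreasing(f, radius, a, b)    # f(lower) = +radius
        upper = solve_decreasing(f, -radius, a, b)   # f(upper) = -radius
        intervals.append((lower, upper))
    return intervals


if __name__ == "__main__":
    random.seed(0)
    # Pareto with shape 3: mean 1.5, variance 0.75, infinite third moment.
    data = [random.paretovariate(3.0) for _ in range(200)]
    print(catoni_cs(data, sigma2=0.75)[-1])
```

The sketch recomputes the sum from scratch at every step, so it costs quadratic time in the number of observations; that keeps the code short but is not how one would run it at scale.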


Similar resources

Estimation of confidence intervals for the mean of heavy tailed loss distributions: a comparative study using a simulation method

This paper uses nonparametric methods to estimate the confidence intervals for the mean of asymmetric heavy tailed loss distributions. The nonparametric methods employed are the m out of n bootstrap, subsampling bootstrap, refined bootstrap, empirical likelihood ratio method, and bootstrap calibrated empirical likelihood methods. We evaluate the accuracy and compare the performance of the confi...


Empirical-likelihood-based Confidence Interval for the Mean with a Heavy-tailed Distribution

Empirical-likelihood-based confidence intervals for a mean were introduced by Owen [Biometrika 75 (1988) 237–249], where at least a finite second moment is required. This excludes some important distributions, for example, those in the domain of attraction of a stable law with index between 1 and 2. In this article we use a method similar to Qin and Wong [Scand. J. Statist. 23 (1996) 209–219] t...


High confidence estimates of the mean of heavy-tailed real random variables

We present new estimators of the mean of a real valued random variable, based on PAC-Bayesian iterative truncation. We analyze the non-asymptotic minimax properties of the deviations of estimators for distributions having either a bounded variance or a bounded kurtosis. It turns out that these minimax deviations are of the same order as the deviations of the empirical mean estimator of a Gaussi...


Confidence Regions for High Quantiles of a Heavy Tailed Distribution

Estimating high quantiles plays an important role in the context of risk management. This involves extrapolation of an unknown distribution function. In this paper we propose three methods, namely, the normal approximation method, the likelihood ratio method and the data tilting method, to construct confidence regions for high quantiles of a heavy tailed distribution. A simulation study prefers...


Inference for the mean in the heavy-tailed case

In this article, asymptotic inference for the mean of i.i.d. observations in the context of heavy-tailed distributions is discussed. While both the standard asymptotic method based on the normal approximation and Efron's bootstrap are inconsistent when the underlying distribution does not possess a second moment, we propose two approaches based on the subsampling idea of Politis and Romano (199...



Journal

Journal title: Stochastic Processes and their Applications

Year: 2023

ISSN: 1879-209X, 0304-4149

DOI: https://doi.org/10.1016/j.spa.2023.05.007