Computational rank-based statistics

Authors

Joseph W. McKean

Jeff T. Terpstra

John D. Kloke

Abstract

This review discusses two algorithms that can be used to compute rank-based regression estimates. For completeness, a brief overview of rank-based inference procedures in the context of a linear model is presented. The discussion includes geometry, estimation, inference, and diagnostics. In regard to computing the rank-based estimates, we discuss two approaches. The first approach is based on an algebraic identity which allows one to compute the (Wilcoxon) estimates using a regression routine. The other approach is a Newton-type algorithm. In addition, we discuss how rank-based inference can be generalized to nonlinear and random effects models. Some simple examples using existing statistical software are also presented for the sake of illustration and comparison.

Traditional least squares (LS) procedures offer the user an encompassing methodology for analyzing models, linear or nonlinear. These procedures are based on the simple premise of fitting the model by minimizing the Euclidean distance between the vector of responses and the model. Besides the fit, the LS procedures include diagnostics to check the quality of fit and an array of inference procedures including confidence intervals (regions) and tests of hypotheses. LS procedures, though, are not robust. One outlier can spoil the LS fit, its associated inference, and even its diagnostic procedures (i.e., the methods which should detect the outliers).

Rank-based procedures also offer the user a complete methodology. The only essential change is to replace the Euclidean norm by another norm, so that the geometry remains the same. As with the LS procedures, these rank-based procedures offer the user diagnostic tools to check the quality of fit and associated inference procedures. Further, in contrast to the LS procedures, they are robust to the effect of outliers. They are generalizations of simple nonparametric rank procedures such as the Wilcoxon one- and two-sample methods, and they retain the high efficiency of these simple rank methods. Further, depending on the knowledge of the underlying error distribution, this rank-based analysis can be optimized by the choice of the norm (scores). Weighted versions of the fit can attain a high (50%) breakdown point.
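As a rough illustration of the first computational approach mentioned in the abstract, the Python sketch below checks one well-known form of such an algebraic identity: with Wilcoxon scores, the rank-based dispersion of the residuals is a constant multiple of the sum of absolute pairwise differences of the residuals, so the slope estimate can be obtained from an L1 fit on pairwise differences. This is not the authors' implementation. The function names are hypothetical, and the weighted-median shortcut used in place of a general L1 regression routine is an assumption that holds only for a single predictor.

```python
import numpy as np

def wilcoxon_dispersion(resid):
    """Rank-based dispersion sum_i a(R(e_i)) * e_i with Wilcoxon scores
    a(i) = sqrt(12) * (i / (n + 1) - 1/2). Assumes no tied residuals."""
    resid = np.asarray(resid, dtype=float)
    n = resid.size
    ranks = np.argsort(np.argsort(resid)) + 1  # ranks 1..n
    scores = np.sqrt(12.0) * (ranks / (n + 1) - 0.5)
    return float(np.sum(scores * resid))

def pairwise_l1(resid):
    """Sum of |e_i - e_j| over all pairs i < j."""
    resid = np.asarray(resid, dtype=float)
    i, j = np.triu_indices(resid.size, k=1)
    return float(np.sum(np.abs(resid[i] - resid[j])))

def wilcoxon_slope(x, y):
    """Slope minimizing sum_{i<j} |(y_i - y_j) - b * (x_i - x_j)|.
    For one predictor this L1 problem is solved exactly by a weighted
    median of the pairwise slopes, with weights |x_i - x_j|."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(x.size, k=1)
    dx, dy = x[i] - x[j], y[i] - y[j]
    keep = dx != 0  # pairs with equal x do not depend on the slope
    slopes, weights = dy[keep] / dx[keep], np.abs(dx[keep])
    order = np.argsort(slopes)
    slopes, weights = slopes[order], weights[order]
    cum = np.cumsum(weights)
    # first sorted slope whose cumulative weight reaches half the total
    k = np.searchsorted(cum, 0.5 * cum[-1])
    return float(slopes[k])
```

For residuals without ties, `wilcoxon_dispersion(e)` equals `sqrt(3) / (n + 1) * pairwise_l1(e)`, which is why minimizing either criterion gives the same slope. The intercept is not identified by the dispersion (the scores sum to zero), so in a sketch like this it would be estimated separately, for example by the median of the residuals `y - b * x`.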


Similar resources

Bootstrap and fast double bootstrap tests of cointegration rank with financial time series

The likelihood ratio test of cointegration rank is the most widely used test for cointegration. Many studies have shown by simulation that the small sample distribution is not well approximated by the limiting distribution. We suggest using the bootstrap to generate small sample critical values instead of correcting the test statistics. The idea of bootstrapping the trace test of cointegration ...


Low rank estimation of higher order statistics

Low rank estimators for higher order statistics are considered in this paper. Rank reduction methods offer a general principle for trading estimator bias for reduced estimator variance. The bias-variance tradeoff is analyzed for low rank estimators of higher order statistics using a tensor product formulation for the moments and cumulants. In general the low rank estimators have a larger bias a...


On the exact distribution of maximally selected rank statistics

The construction of simple classification rules is a frequent problem in medical research. Maximally selected rank statistics allow the evaluation of cutpoints, which provide the classification of observations into two groups by a continuous or ordinal predictor variable. The computation of the exact distribution of a maximally selected rank statistic is discussed and a new lower bound of the dis...


Rank tests and regression rank score tests in measurement error models

The rank and regression rank score tests of linear hypothesis in the linear regression model are modified for measurement error models. The modified tests are still distribution free. Some tests of linear subhypotheses are invariant to the nuisance parameter, others are based on the aligned ranks using the R-estimators. The asymptotic relative efficiencies of tests with respect to tests in model...


Max-type rank tests, U-tests, and adaptive tests for the two-sample location problem - An asymptotic power study

For the two-sample location problem we first consider two types of tests, linear rank tests with various scores, but also some tests based on U-statistics. For both types we construct adaptive tests as well as max-type tests and investigate their asymptotic and finite power properties. It turns out that both the adaptive tests have larger asymptotic power than the max-type tests. For small samp...


Hierarchical multilinear models for multiway data

Reduced-rank decompositions provide descriptions of the variation among the elements of a matrix or array. In such decompositions, the elements of an array are expressed as products of low-dimensional latent factors. This article presents a model-based version of such a decomposition, extending the scope of reduced rank methods to accommodate a variety of data types such as longitudinal social n...




Publication date: 2008
