Testing for zero inflation in count models: Bias correction for the Vuong test
Abstract
The proportion of zeros in event-count processes may be inflated by an additional mechanism by which zeros are created. This has given rise to statistical models that accommodate zero inflation; these are available in Stata through the zip and zinb commands. The Vuong (1989, Econometrica 57: 307–333) test is regularly used to determine whether estimating a zero-inflation component is appropriate or whether a single-equation count model should be used. The use of the Vuong test in this case is complicated by the fact that zero-inflated models involve the estimation of several more parameters than the single-equation models. Although Vuong (1989, Econometrica 57: 307–333) suggested corrections to the test statistic to address the comparison of models with different numbers of parameters, Stata does not implement any such correction. The result is that the Vuong test used by Stata is biased toward supporting the model with a zero-inflation component, even when no zero inflation exists in the generative process. We provide new Stata commands for computing the Vuong statistic with corrections based on the Akaike and Bayesian (Schwarz) information criteria. In an extensive Monte Carlo study, we illustrate the bias inherent in using the uncorrected Vuong test, and we examine the relative merits of the Akaike and Schwarz corrections. Then, in an empirical example from international relations research, we show that errors in selecting an event-count model can have clear implications for substantive conclusions.
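The abstract does not spell out the corrected statistic, so the following is only a minimal Python sketch, not the authors' Stata commands. It assumes the usual form of the Vuong statistic built from per-observation log-likelihood differences, with the penalty subtracted from the summed differences: the difference in parameter counts for the Akaike correction, and half that difference times log(n) for the Schwarz correction. The function name, argument names, and the 1/n variance divisor are illustrative assumptions.

```python
import math


def vuong_statistics(ll_zi, ll_single, k_zi, k_single):
    """Return raw, AIC-corrected, and BIC-corrected Vuong statistics.

    ll_zi, ll_single: per-observation log-likelihoods from the zero-inflated
    and the single-equation model (hypothetical inputs, same observations).
    k_zi, k_single: number of estimated parameters in each model.
    """
    n = len(ll_zi)
    if n != len(ll_single) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 observations")

    # Pointwise log-likelihood differences m_i.
    m = [a - b for a, b in zip(ll_zi, ll_single)]
    total = sum(m)
    mean = total / n

    # Standard deviation of m_i (assumption: 1/n divisor).
    sd = math.sqrt(sum((x - mean) ** 2 for x in m) / n)
    if sd == 0.0:
        raise ValueError("log-likelihood differences have zero variance")

    denom = math.sqrt(n) * sd
    extra = k_zi - k_single  # additional parameters in the zero-inflated model

    return {
        "raw": total / denom,                               # uncorrected
        "aic": (total - extra) / denom,                     # Akaike correction
        "bic": (total - 0.5 * extra * math.log(n)) / denom,  # Schwarz correction
    }


# Toy, made-up log-likelihoods purely to show the call.
example = vuong_statistics([-1.0, -2.0, -1.5, -0.5], [-1.2, -2.1, -1.9, -0.4], 4, 2)
```

Under these assumptions, each statistic is compared with standard normal critical values: large positive values favor the zero-inflated model and large negative values favor the single-equation model. Because the zero-inflated model has more parameters, both corrections shift the statistic downward, which is the direction needed to offset the bias described in the abstract.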
Similar resources
Zero-inflated generalized Poisson models with regression effects on the mean, dispersion and zero-inflation level applied to patent outsourcing rates
This paper focuses on an extension of zero-inflated generalized Poisson (ZIGP) regression models for count data. We discuss generalized Poisson (GP) models where dispersion is modelled by an additional model parameter. Moreover, zero-inflated models in which overdispersion is assumed to be caused by an excessive number of zeros are discussed. In addition to ZIGP regression introduced by Famoye ...
Assessment and Selection of Competing Models for Zero-Inflated Microbiome Data
Typical data in a microbiome study consist of the operational taxonomic unit (OTU) counts that have the characteristic of excess zeros, which are often ignored by investigators. In this paper, we compare the performance of different competing methods to model data with zero inflated features through extensive simulations and application to a microbiome study. These methods include standard para...
Testing Several Rival Models Using the Extension of Vuong's Test and Quasi Clustering
The two main goals in model selection are, first, to introduce an approach for testing the homogeneity of several rival models and, second, to select a set of reasonable models or to estimate the rival model closest to the true one. In this paper we extend Vuong's method to several models in order to cluster them. Based on the working paper of Katayama (2008), we propose an approach to test whether rival models h...
A novel method for reducing Rician noise in the magnitude of the diffusion signal in magnetic resonance imaging (MRI)
The true MR signal intensity extracted from noisy MR magnitude images is biased by Rician noise, which arises from noise rectification in the magnitude calculation for low-intensity pixels. This noise is more problematic when a quantitative analysis is performed on magnitude images with low SNR (<3.0). In such cases, the received signal for both the real and imaginary components will fluc...
Score tests for zero-inflation and overdispersion in two-level count data
In a Poisson regression model, where observations are either clustered or represented by repeated measurements of counts, the number of observed zero counts is sometimes greater than the expected frequency by the Poisson distribution and the non-zero part of count data may be overdispersed. The zero-inflated negative binomial (ZINB) mixed regression model is suggested to analyze such data. Prev...
Publication date: 2014