Nonparametric tests for conditional independence in two-way contingency tables
Authors
Abstract
Consider a two-way contingency table built on two categorical variables R and S. A fundamental question in this context is whether R and S are independent. This extensively studied testing problem is usually addressed with the chi-square or likelihood ratio test statistics. The principal drawback of the classical approach is that the probability of falling in a given cell of the table is assumed to be the same for every individual, so that individuals are not treated as such but as a group of supposedly homogeneous units. This is often highly unrealistic, since in most practical situations some possibly known characteristics of each individual influence or are associated with R, S, or both, and therefore affect the whole dependence structure of the table. In this case, a more judicious approach is to analyze the conditional joint distribution of R and S given the vector of covariates, say X, and then to test for conditional independence between R and S given X. Such a test requires estimating the conditional probability of each cell given the value of X. A striking fact in the literature is that the estimation of conditional probabilities associated with categorical responses, given a vector of covariates, is almost always handled by logistic regression, most of the time with very little validation of this parametric assumption. In this work, we first present a nonparametric estimation procedure for the conditional probabilities of falling in each cell of the table. These estimates can be used as such, or employed to validate a parametric assumption such as the logistic one. Second, we propose a generalization of the chi-square and likelihood ratio tests to the case of testing for conditional independence, based on these nonparametric estimates of the conditional probabilities. The asymptotic distribution of the proposed test statistics is derived.
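To make the idea of estimating conditional cell probabilities without a logistic model more concrete, here is a minimal sketch. The abstract does not say which nonparametric estimator the authors use, so a Nadaraya-Watson estimator with a Gaussian kernel is assumed here purely for illustration. The function names, the bandwidth value, and the simulated data are all hypothetical. The second function computes a chi-square-type discrepancy at a single covariate value; it is not the test statistic of the paper and comes with no asymptotic calibration.

```python
import numpy as np

def conditional_cell_probs(x0, X, R, S, n_r, n_s, bandwidth):
    """Kernel estimate of P(R=r, S=s | X=x0) for every cell (r, s).

    Assumption: Nadaraya-Watson weights with a Gaussian product kernel.
    R and S are integer category codes in 0..n_r-1 and 0..n_s-1.
    """
    R = np.asarray(R, dtype=int)
    S = np.asarray(S, dtype=int)
    X = np.asarray(X, dtype=float).reshape(len(R), -1)
    x0 = np.asarray(x0, dtype=float).reshape(1, -1)
    # Kernel weight of each individual, based on its distance to x0
    sq_dist = np.sum(((X - x0) / bandwidth) ** 2, axis=1)
    w = np.exp(-0.5 * sq_dist)
    w = w / w.sum()
    # Accumulate the normalized weights into the cell each individual falls in
    probs = np.zeros((n_r, n_s))
    np.add.at(probs, (R, S), w)
    return probs

def local_chi_square(probs, eps=1e-12):
    """Chi-square-type distance between a joint table and the product of its margins."""
    p_r = probs.sum(axis=1, keepdims=True)   # estimated P(R=r | x)
    p_s = probs.sum(axis=0, keepdims=True)   # estimated P(S=s | x)
    expected = p_r @ p_s                     # cell probabilities under conditional independence
    return float(np.sum((probs - expected) ** 2 / np.maximum(expected, eps)))

# Hypothetical data: R and S both depend on X but are independent given X
rng = np.random.default_rng(0)
n = 500
X = rng.uniform(-1, 1, size=n)
p = 1 / (1 + np.exp(-2 * X))
R = (rng.uniform(size=n) < p).astype(int)
S = (rng.uniform(size=n) < p).astype(int)

probs = conditional_cell_probs(0.0, X, R, S, n_r=2, n_s=2, bandwidth=0.2)
print(probs)
print(local_chi_square(probs))
```

In this simulated setting a marginal chi-square test would tend to find R and S associated, because both depend on X, while the local discrepancy at a fixed value of X should stay small. Choosing the bandwidth and aggregating over values of X into a calibrated test are left open here.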
Related articles
Likelihood Ratio Tests with Three-Way Tables
Likelihood ratio (LR) tests for association and for interaction are examined for three-way contingency tables, in particular, the widely used 2 × 2 × K tables. Mutual information identities are used to characterize the information decomposition and the logical relations between the omnibus LR test for conditional independence across K strata and its two independent components, the LR tests for homo...
The Strucplot Framework: Visualizing Multi-way Contingency Tables with vcd
This paper has been published in the Journal of Statistical Software (Meyer, Zeileis, and Hornik 2006) and describes the “strucplot” framework for the visualization of multiway contingency tables. Strucplot displays include hierarchical conditional plots such as mosaic, association, and sieve plots, and can be combined into more complex, specialized plots for visualizing conditional independenc...
Residual-based Shadings for Visualizing (Conditional) Independence
Residual-based shadings for enhancing mosaic and association plots to visualize independence models for contingency tables are extended in two directions: (a) perceptually uniform Hue-Chroma-Luminance (HCL) colors are used and (b) the result of an associated significance test is coded by the appearance of color in the visualization. For obtaining (a), a general strategy for deriving diverging p...
Speeding up the execution of a large number of statistical tests of independence
A massive amount of conditional independence tests on data must be performed in the problem of learning the structure of probabilistic graphical models when using the independence-based approach. An intermediate step in the computation of independence tests is the construction of contingency tables from the data. In this work we present an intelligent cache of contingency tables that allows the...
Journal:
- J. Multivariate Analysis
Volume 101, Issue
Pages -
Publication date 2010