Signed Support Recovery for Single Index Models in High-Dimensions
Authors
Abstract
In this paper, we study the support recovery problem for single index models Y = f(Xβ, ε), where f is an unknown link function, X ∼ N_p(0, I_p), and β is an s-sparse unit vector such that β_i ∈ {±1/√s, 0}. In particular, we look into the performance of two computationally inexpensive algorithms: (a) the diagonal thresholding sliced inverse regression (DT-SIR) introduced by Lin et al. (2015); and (b) a semi-definite programming (SDP) approach inspired by Amini & Wainwright (2008). When s = O(p^(1−δ)) for some δ > 0, we demonstrate that both procedures can succeed in recovering the support of β as long as the rescaled sample size Γ = n / (s log(p − s)) is larger than a certain critical threshold. On the other hand, when Γ is smaller than a critical value, any algorithm fails to recover the support with probability at least 1/2 asymptotically. In other words, we demonstrate that both DT-SIR and the SDP approach are optimal (up to a scalar) for recovering the support of β in terms of sample size. We provide extensive simulations, as well as a real dataset application, to help verify our theoretical observations.
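The setup in the abstract can be illustrated with a small simulation. The sketch below is not the authors' DT-SIR implementation. It is a simplified diagonal screening step in the spirit of sliced inverse regression, and it keeps the s largest scores instead of applying a threshold. The link function, the noise level, the number of slices, and all function names are assumptions made for illustration only. It recovers the support but not the signs of β.

```python
import numpy as np


def make_beta(p, s, rng):
    """Build an s-sparse unit vector whose nonzero entries are +/- 1/sqrt(s)."""
    beta = np.zeros(p)
    support = rng.choice(p, size=s, replace=False)
    beta[support] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return beta


def rescaled_sample_size(n, p, s):
    """Gamma = n / (s * log(p - s)), the rescaled sample size from the abstract."""
    return n / (s * np.log(p - s))


def diagonal_sir_scores(X, y, n_slices):
    """Variance across slices of the per-coordinate slice means of X.

    Observations are sorted by y and split into n_slices nearly equal groups.
    A coordinate in the support should have slice means that vary with y.
    """
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)
    slice_means = np.stack([X[idx].mean(axis=0) for idx in slices])
    return slice_means.var(axis=0)


def recover_support(X, y, s, n_slices=10):
    """Assumed simplification: keep the s coordinates with the largest scores."""
    scores = diagonal_sir_scores(X, y, n_slices)
    return np.sort(np.argsort(scores)[-s:])


rng = np.random.default_rng(0)
n, p, s = 4000, 200, 5

beta = make_beta(p, s, rng)
X = rng.standard_normal((n, p))   # rows are draws from N_p(0, I_p)
eps = rng.standard_normal(n)

# Hypothetical link f(t, eps) = t^3 + 0.5 * eps; the paper treats f as unknown.
y = (X @ beta) ** 3 + 0.5 * eps

gamma = rescaled_sample_size(n, p, s)
true_support = np.flatnonzero(beta)
estimated_support = recover_support(X, y, s)

print("Gamma:", gamma)
print("true support:     ", true_support)
print("estimated support:", estimated_support)
```

Lowering n while holding p and s fixed shrinks Γ, which is one way to probe empirically where this kind of screening starts to fail.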
Similar resources
L1-Regularized Least Squares for Support Recovery of High Dimensional Single Index Models with Gaussian Designs
It is known that for a certain class of single index models (SIMs) [Formula: see text], support recovery is impossible when X ~ 𝒩(0, 𝕀_{p×p}) and a model complexity adjusted sample size is below a critical threshold. Recently, optimal algorithms based on Sliced Inverse Regression (SIR) were suggested. These algorithms work provably under the assumption that the design X comes from an i.i.d. Gaus...
Robust Structured Estimation with Single-Index Models
In this paper, we investigate general single-index models (SIMs) in high dimensions. Based on U-statistics, we propose two types of robust estimators for the recovery of model parameters, which can be viewed as generalizations of several existing algorithms for one-bit compressed sensing (1-bit CS). With minimal assumption on noise, the statistical guarantees are established for the generalize...
A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters
Redundant and irrelevant features in high dimensional data increase the complexity in underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study made use of a meta-heuristic search approach which uses lightweight random simulations to balance between the exploitation of ...
NEW MODELS AND ALGORITHMS FOR SOLUTIONS OF SINGLE-SIGNED FULLY FUZZY LR LINEAR SYSTEMS
We present a model and propose an approach to compute an approximate solution of Fully Fuzzy Linear System $(FFLS)$ of equations in which all the components of the coefficient matrix are either nonnegative or nonpositive. First, in discussing an $FFLS$ with a nonnegative coefficient matrix, we consider an equivalent $FFLS$ by using an appropriate permutation to simplify fuzzy multiplications. T...
Learning Single Index Models in High Dimensions
Single Index Models (SIMs) are simple yet flexible semi-parametric models for classification and regression. Response variables are modeled as a nonlinear, monotonic function of a linear combination of features. Estimation in this context requires learning both the feature weights, and the nonlinear function. While methods have been described to learn SIMs in the low dimensional regime, a metho...
Publication date: 2016