Robust biclustering by sparse singular value decomposition incorporating stability selection

Authors

  • Martin Sill
  • Sebastian Kaiser
  • Axel Benner
  • Annette Kopp-Schneider
Abstract

MOTIVATION: Over the past decade, several biclustering approaches have been published in the field of gene expression data analysis. Despite the huge diversity in the mathematical concepts underlying the different biclustering methods, many of them can be related to the singular value decomposition (SVD). Recently, a sparse SVD approach (SSVD) has been proposed to reveal biclusters in gene expression data. In this article, we propose incorporating stability selection to improve this method. Stability selection is a subsampling-based variable selection method that allows Type I error rates to be controlled. The proposed S4VD algorithm incorporates this subsampling approach to find stable biclusters and to estimate the selection probabilities with which genes and samples belong to the biclusters.

RESULTS: So far, the S4VD method is the first biclustering approach that takes the stability of clusters under perturbations of the data into account. Application of the S4VD algorithm to a lung cancer microarray dataset revealed biclusters that correspond to coregulated genes associated with cancer subtypes. Marker genes for different lung cancer subtypes showed high selection probabilities for the corresponding biclusters. Moreover, the genes associated with the biclusters belong to significantly enriched cancer-related Gene Ontology categories. In a simulation study, the S4VD algorithm outperformed the SSVD algorithm and two other SVD-related biclustering methods, both in recovering artificial biclusters and in robustness to noisy data.

AVAILABILITY: R code for the S4VD algorithm, together with documentation, can be found at http://s4vd.r-forge.r-project.org/.
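The subsampling idea in the abstract can be illustrated with a small Python sketch. This is not the authors' S4VD implementation (that is the R code at the URL above). It is a simplified stand-in that assumes a plain soft-thresholded rank-1 sparse SVD, fixed thresholds, and a fixed probability cutoff of 0.8. All function names, parameters, and the toy data are hypothetical. For each random subsample of columns (samples) it records which rows (genes) end up with nonzero loadings, and the selection probability of a row is the fraction of subsamples in which it was selected.

```python
import numpy as np

def soft_threshold(x, lam):
    # Shrink entries toward zero by lam; entries with |x| <= lam become exactly 0.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def selected_rows(X, lam_u, lam_v, n_iter=50):
    # Boolean mask of rows with nonzero loading in a simplified rank-1 sparse SVD.
    # Assumption: fixed thresholds lam_u, lam_v rather than a tuned penalty.
    u = np.linalg.svd(X, full_matrices=False)[0][:, 0]
    empty = np.zeros(X.shape[0], dtype=bool)
    for _ in range(n_iter):
        v = soft_threshold(X.T @ u, lam_v)
        if not v.any():
            return empty
        v = v / np.linalg.norm(v)
        u = soft_threshold(X @ v, lam_u)
        if not u.any():
            return empty
        u = u / np.linalg.norm(u)
    return u != 0

def row_selection_probabilities(X, lam_u, lam_v, n_subsamples=100, frac=0.5, seed=0):
    # Fraction of column subsamples in which each row is selected.
    rng = np.random.default_rng(seed)
    n_cols = X.shape[1]
    m = max(2, int(frac * n_cols))
    counts = np.zeros(X.shape[0])
    for _ in range(n_subsamples):
        cols = rng.choice(n_cols, size=m, replace=False)  # subsample without replacement
        counts += selected_rows(X[:, cols], lam_u, lam_v)
    return counts / n_subsamples

# Toy data (hypothetical): 60 "genes" x 40 "samples" with one planted bicluster.
rng = np.random.default_rng(1)
X = rng.normal(scale=0.3, size=(60, 40))
X[:12, :16] += 2.0

probs = row_selection_probabilities(X, lam_u=1.5, lam_v=1.5)
stable_rows = np.flatnonzero(probs >= 0.8)  # 0.8 is an arbitrary illustrative cutoff
print("stable rows:", stable_rows)
```

In stability selection proper, the cutoff is chosen so that the expected number of false selections is bounded, which is how the Type I error control mentioned in the abstract arises. The fixed cutoff here only illustrates the mechanics.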

Related articles

Biclustering via sparse singular value decomposition.

Sparse singular value decomposition (SSVD) is proposed as a new exploratory analysis tool for biclustering or identifying interpretable row-column associations within high-dimensional data matrices. SSVD seeks a low-rank, checkerboard structured matrix approximation to data matrices. The desired checkerboard structure is achieved by forcing both the left- and right-singular vectors to be sparse...
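As a rough illustration of how sparse singular vectors pick out a bicluster, the following Python sketch alternates soft-thresholded updates of the left and right vectors. It is a simplification under stated assumptions: the thresholds `lam_u` and `lam_v` are fixed by hand, whereas the SSVD method summarized above selects its penalties from the data, and all names and the toy matrix are hypothetical. Rows and columns with nonzero entries in `u` and `v` form the bicluster.

```python
import numpy as np

def soft_threshold(x, lam):
    # Shrink entries toward zero by lam; entries with |x| <= lam become exactly 0.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def sparse_rank1_svd(X, lam_u, lam_v, n_iter=100, tol=1e-8):
    # Simplified rank-1 sparse SVD: alternate thresholded updates of v and u.
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], Vt[0]  # start from the ordinary leading singular vectors
    for _ in range(n_iter):
        u_old = u
        v = soft_threshold(X.T @ u, lam_v)  # update v with u fixed
        if not v.any():                     # threshold removed everything
            return np.zeros(X.shape[0]), 0.0, np.zeros(X.shape[1])
        v = v / np.linalg.norm(v)
        u = soft_threshold(X @ v, lam_u)    # update u with v fixed
        if not u.any():
            return np.zeros(X.shape[0]), 0.0, np.zeros(X.shape[1])
        u = u / np.linalg.norm(u)
        if np.linalg.norm(u - u_old) < tol:
            break
    d = float(u @ X @ v)  # scale of the rank-1 layer d * u v^T
    return u, d, v

# Toy data (hypothetical): a 10 x 8 block of elevated values in a noisy 50 x 30 matrix.
rng = np.random.default_rng(0)
X = rng.normal(scale=0.3, size=(50, 30))
X[:10, :8] += 3.0

u, d, v = sparse_rank1_svd(X, lam_u=2.0, lam_v=2.0)
rows = np.flatnonzero(u)  # rows belonging to the bicluster
cols = np.flatnonzero(v)  # columns belonging to the bicluster
print("bicluster rows:", rows)
print("bicluster columns:", cols)
```

The outer product of the two sparse vectors is zero outside the selected rows and columns, which is what gives the rank-1 approximation its checkerboard appearance.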

Sparse Principal Component Analysis Incorporating Stability Selection

Principal component analysis (PCA) is a popular dimension reduction method that approximates a numerical data matrix by seeking principal components (PC), i.e. linear combinations of variables that capture maximal variance. Since each PC is a linear combination of all variables of a data set, interpretation of the PCs can be difficult, especially in high-dimensional data. In order to find ’spa...

L0-norm Sparse Graph-regularized SVD for Biclustering

Learning the “blocking” structure is a central challenge for high dimensional data (e.g., gene expression data). In [Lee et al., 2010], a sparse singular value decomposition (SVD) has been used as a biclustering tool to achieve this goal. However, this model ignores the structural information between variables (e.g., gene interaction graph). Although typical graph-regularized norm can incorpora...

Incorporating Prior Information in Compressive Online Robust Principal Component Analysis

We consider an online version of robust Principal Component Analysis (PCA), which arises naturally in time-varying source separation problems such as video foreground-background separation. This paper proposes a compressive online robust PCA with prior information for recursively separating a sequence of frames into sparse and low-rank components from a small set of measurements. In contrast to con...

Functional Singular Component Analysis

Aiming at quantifying the dependency of pairs of functional data (X, Y), we develop the concept of functional singular value decomposition for covariance and functional singular component analysis, building on the concept of “canonical expansion” of compact operators in functional analysis. We demonstrate the estimation of the resulting singular values, functions and components for the practica...

Journal:
  • Bioinformatics

Volume 27, Issue 15

Publication date: 2011