MetaClustering: discovery of the different sample clusterings in gene expression data.
Abstract
Clustering the samples is a standard procedure in the analysis of gene expression data, for instance to discover cancer subtypes. However, more than one biologically meaningful clustering can exist, depending on the genes chosen. We propose here to group the genes according to the clustering of the samples that they fit. This makes it possible to determine directly the different clusterings of the samples present in the data. Since a clustering is a structure, genes belonging to the same group are functions of the same structure. Hence, determining groups of genes that support the same clustering can also be viewed as detecting non-linearly linked genes. MetaClustering was applied successfully to simulated data. It also recovered the known clustering of real cancer data, which was impossible using the complete set of genes. Finally, it clustered cell-cycle genes together, showing its ability to group genes that are related in a non-linear way.
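The idea of grouping genes by the sample clustering they fit can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes that each gene induces a two-group split of the samples (a simple 1-D 2-means), that two genes "support the same clustering" when the Rand index of their splits reaches a threshold, and that gene groups are the connected components of that relation. The function names, the threshold, and the toy matrix are all hypothetical.

```python
import numpy as np

def two_means_labels(x, n_iter=50):
    """Assumed per-gene step: split samples into two groups with 1-D 2-means."""
    centers = np.array([x.min(), x.max()], dtype=float)
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Assign each sample to the nearer of the two centers.
        labels = (np.abs(x - centers[1]) < np.abs(x - centers[0])).astype(int)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean()
    return labels

def rand_index(a, b):
    """Fraction of sample pairs on which two clusterings agree (same/different)."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    upper = np.triu_indices(len(a), k=1)
    return float(np.mean(same_a[upper] == same_b[upper]))

def group_genes(expr, threshold=0.9):
    """expr is a genes x samples array; returns one group id per gene."""
    labels = [two_means_labels(row) for row in expr]
    n = len(labels)
    group = -np.ones(n, dtype=int)
    next_id = 0
    for i in range(n):
        if group[i] >= 0:
            continue
        group[i] = next_id
        stack = [i]
        while stack:  # flood-fill over genes whose sample splits agree
            g = stack.pop()
            for j in range(n):
                if group[j] < 0 and rand_index(labels[g], labels[j]) >= threshold:
                    group[j] = next_id
                    stack.append(j)
        next_id += 1
    return group

# Toy data: genes 0 and 1 split samples into first half vs second half
# (gene 1 in the opposite direction); genes 2 and 3 split even vs odd samples.
expr = np.array([
    [0, 0, 0, 0, 5, 5, 5, 5],
    [9, 9, 9, 9, 1, 1, 1, 1],
    [0, 4, 0, 4, 0, 4, 0, 4],
    [7, 2, 7, 2, 7, 2, 7, 2],
], dtype=float)
groups = group_genes(expr)
print(groups.tolist())  # [0, 0, 1, 1]
```

Genes 0 and 1 are anti-correlated, yet they land in the same group because they induce the same partition of the samples. Each resulting gene group corresponds to one candidate clustering of the samples.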
Similar resources
P-215: Discovery of A Novel APA Variant of A Human Potential Gene Based on Expressed Sequenced Tags Analysis
Background: Expressed sequence tags (ESTs) are sequences of cDNA fragments prepared from different tissue sources. There are over one million of these sequences in the publicly available database, and these sequences are believed to represent more than half of all human genes. The ESTs belong to different cDNA libraries, each prepared from one particular cell type, organ, or tumor. Therefore, th...
Annotation-based Distance Measures for Patient Subgroup Discovery in Clinical Microarray Studies
MOTIVATION Clustering algorithms are widely used in the analysis of microarray data. In clinical studies, they are often applied to find groups of co-regulated genes. Clustering, however, can also stratify patients by similarity of their gene expression profiles, thereby defining novel disease entities based on molecular characteristics. Several distance-based cluster algorithms have been sugge...
Identification of Prognostic Genes in Her2-enriched Breast Cancer by Gene Co-Expression Net-work Analysis
Introduction: HER2-enriched subtype of breast cancer has a worse prognosis than luminal subtypes. Recently, the discovery of targeted therapies in other groups of breast cancer has increased patient survival. The aim of this study was to identify genes that affect the overall survival of this group of patients based on a systems biology approach. Methods: Gene expression data and clinical infor...
Latent Clustering on Graphs with Multiple Edge Types
We study clustering on graphs with multiple edge types. Our main motivation is that similarities between objects can be measured in many different metrics, and so allowing graphs with multivariate edges significantly increases modeling power. In this context the clustering problem becomes more challenging. Each edge/metric provides only partial information about the data; recovering full inform...
Journal title:
- Genome informatics. International Conference on Genome Informatics
Volume 17, Issue 2
Pages -
Publication date: 2006