A Clustering Method for Discovering Patterns Using Gene Regulatory Processes

Authors

Siyoung Park

Daewoo Choi

Chi-Hyuck Jun

Abstract

Clustering is a descriptive task that seeks to identify homogeneous groups of objects based on the values of their attributes (dimensions). K-means, hierarchical clustering, and self-organizing maps have all been used to cluster expression profiles, and a number of algorithms have been developed specifically for analyzing expression data. These clustering methods usually use a metric distance as the similarity measure. The correlation coefficient is also used, but it has the drawback of removing differences attributable to both the mean and the dispersion of the observations. Moreover, it may be unreasonable to assign every observation to one of the clusters when the purpose is to find groups with similar patterns. In the field of genetics, Alter et al. [1] show that several significant eigengenes and the corresponding eigenarrays capture most of the expression information, and that some of the eigengenes represent independent regulatory programs or processes, as inferred from their expression patterns across all arrays. Normalizing the data by filtering out the eigengenes (and the corresponding eigenarrays) that are inferred to represent noise or experimental artifacts enables meaningful comparison of the expression of different genes across different arrays in different experiments. Such normalization may improve any further analysis of the expression data. Q-mode factor analysis has been used, like cluster analysis, to find groups and could be a good method for finding patterns. However, this approach to clustering is plagued with a number of problems [3]. Genes with similar expression profiles may have something in common in their regulatory mechanisms. In this study, Q-mode factor analysis is used to model the gene regulatory processes that control genes and gene products, and we modify Q-mode factor analysis to discover useful patterns in gene expression data.
As a result of the factor modeling of gene expression data, our method can improve the result of clustering by removing noise, and it produces characteristic values of the expression data.
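
Two of the points in the abstract can be illustrated with a minimal NumPy sketch: the correlation coefficient ignores differences in mean and dispersion, and filtering out eigengenes via the singular value decomposition removes noise. The data are synthetic, and the names `pearson` and `filter_eigengenes` are illustrative rather than taken from the paper. Which eigengenes count as noise is assumed to be decided by the analyst. The sketch does not implement the authors' modified Q-mode factor analysis.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation between two 1-D expression profiles."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def filter_eigengenes(expr, drop):
    """Rebuild a genes-by-arrays matrix with selected eigengenes zeroed out.

    expr: 2-D array, rows = genes, columns = arrays.
    drop: indices of singular components judged to be noise or artifacts
          (an assumption of this sketch: the caller chooses them).
    """
    u, s, vt = np.linalg.svd(expr, full_matrices=False)
    s_kept = s.copy()
    s_kept[np.asarray(list(drop), dtype=int)] = 0.0
    # Scale the columns of u by the kept singular values, then recombine.
    return (u * s_kept) @ vt

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
pattern = np.sin(np.linspace(0.0, 2.0 * np.pi, 12))
shifted_scaled = 5.0 + 3.0 * pattern  # same shape, different mean and spread

# Correlation treats the two profiles as identical in shape ...
r = pearson(pattern, shifted_scaled)
# ... while a metric distance still sees the mean and dispersion gap.
d = float(np.linalg.norm(pattern - shifted_scaled))

# Rank-1 signal (one regulatory pattern) plus small noise: 50 genes x 12 arrays.
loadings = rng.normal(size=(50, 1))
signal = loadings @ pattern[None, :]
noisy = signal + 0.05 * rng.normal(size=signal.shape)

# Keep only the leading eigengene; treat the remaining 11 as noise.
denoised = filter_eigengenes(noisy, drop=range(1, 12))
err_before = float(np.linalg.norm(noisy - signal))
err_after = float(np.linalg.norm(denoised - signal))
```

Comparing `err_before` with `err_after` shows how much of the added noise the filtering removes in this toy setting. With real expression data the choice of components to drop is the hard part and is not addressed here.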


Related resources

Discovering biological processes from microarray data using independent component analysis

We propose a hypothesis-free methodology for discovering genome-wide expression patterns specific to underlying biological processes from DNA microarray expression data. We apply linear and nonlinear independent component analysis (ICA) as a tool for decomposing microarray data into statistically independent components. Each component represents a gene expression pattern of a putative underlyin...


Discovering Distinct Patterns in Gene Expression Profiles

Traditional analysis of gene expression profiles uses clustering to find groups of coexpressed genes which have similar expression patterns. However, clustering is time consuming and can be difficult for very large-scale datasets. We proposed the idea of Discovering Distinct Patterns (DDP) in gene expression profiles. Since patterns shown by the gene expressions reveal their regulatory mechanisms...


BioProspector: Discovering Conserved DNA Motifs in Upstream Regulatory Regions of Co-Expressed Genes

The development of genome sequencing and DNA microarray analysis of gene expression gives rise to the demand for data-mining tools. BioProspector, a C program using a Gibbs sampling strategy, examines the upstream region of genes in the same gene expression pattern group and looks for regulatory sequence motifs. BioProspector uses zero to third-order Markov background models whose parameters ar...


BRANE Clust: Cluster-Assisted Gene Regulatory Network Inference Refinement.

Discovering meaningful gene interactions is crucial for the identification of novel regulatory processes in cells. Building accurately the related graphs remains challenging due to the large number of possible solutions from available data. Nonetheless, enforcing a priori on the graph structure, such as modularity, may reduce network indeterminacy issues. BRANE Clust (Biologically-Related A pri...


Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...




Publication date: 2001
