Double self-organizing maps to cluster gene expression data

Authors

  • Dali Wang
  • Habtom W. Ressom
  • Mohamad T. Musavi
  • Christian Domnisoru
Abstract

Clustering is an important technique for analyzing gene expression data, and the self-organizing map (SOM) is one of the most useful clustering algorithms. SOM requires the number of clusters as an initialization parameter before clustering. However, this information is unavailable in most cases, particularly for gene expression data. Thus, validation results from SOM are commonly used to choose the appropriate number of clusters, an approach that is inconvenient and time-consuming. This paper applies a novel SOM model, called the double self-organizing map (DSOM), to cluster gene expression data. DSOM helps to find the appropriate number of clusters by depicting it clearly and visually. We use DSOM to cluster an artificial data set and two kinds of real gene expression data sets. To validate our results, we employ a novel validation technique known as the figure of merit (FOM).
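The conventional workflow the abstract describes (fix the number of SOM nodes, cluster, then validate) can be illustrated with a minimal sketch. It is not the DSOM of the paper, whose details are not given here. It assumes a plain 1-D SOM written in NumPy and a leave-one-condition-out figure of merit, which may differ from the exact formulation the authors used. The function names `train_som_1d`, `assign_clusters` and `figure_of_merit`, and all default parameters, are hypothetical.

```python
import numpy as np


def train_som_1d(data, n_nodes, n_iter=2000, lr0=0.5, seed=0):
    """Train a 1-D SOM; n_nodes (the cluster count) must be fixed up front."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Initialise prototypes from randomly chosen samples (an assumption).
    idx = rng.choice(len(data), size=n_nodes, replace=len(data) < n_nodes)
    weights = data[idx].copy()
    grid = np.arange(n_nodes)
    sigma0 = max(n_nodes / 2.0, 1.0)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-2     # shrinking neighbourhood
        x = data[rng.integers(len(data))]
        # Best-matching unit: the prototype closest to the sample.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Gaussian neighbourhood on the 1-D node grid.
        h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights


def assign_clusters(data, weights):
    """Label each row of data with the index of its nearest prototype."""
    data = np.asarray(data, dtype=float)
    dist = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)


def figure_of_merit(data, n_nodes, **som_kwargs):
    """Aggregate FOM: cluster without one condition, score on that condition."""
    data = np.asarray(data, dtype=float)
    n_genes, n_cond = data.shape
    total = 0.0
    for e in range(n_cond):
        rest = np.delete(data, e, axis=1)        # hold out condition e
        weights = train_som_1d(rest, n_nodes, **som_kwargs)
        labels = assign_clusters(rest, weights)
        sq = 0.0
        for c in np.unique(labels):
            held = data[labels == c, e]
            sq += np.sum((held - held.mean()) ** 2)
        total += np.sqrt(sq / n_genes)           # RMS deviation for condition e
    return total
```

In this sketch, choosing the number of clusters means calling `figure_of_merit(data, k)` for several values of `k` and comparing the scores (lower is better). This repeated train-and-validate loop is the cost the abstract calls inconvenient and time-consuming.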

Similar resources

Advances in cluster analysis of microarray data

Clustering genes into biologically meaningful groups according to their pattern of expression is a main technique of microarray data analysis, based on the assumption that similarity in gene expression implies some form of regulatory or functional similarity. We give an overview of various clustering techniques, including conventional clustering methods (such as hierarchical clustering, k-means c...

Master Thesis in Bioinformatics Clustering Genes by Using Different Types of Genomic Data and Self-Organizing Maps

The aim of the project was to identify biologically relevant novel gene clusters by using combined genomic data instead of using only gene expression data in isolation. The clustering algorithm based on self-organizing maps (Kasturi et al., 2005) was extended and implemented in order to use gene location data together with the gene expression and the motif occurrence data for gene clustering. A...

Co-clustering and visualization of gene expression data and gene ontology terms for Saccharomyces cerevisiae using self-organizing maps

We propose a novel co-clustering algorithm that is based on self-organizing maps (SOMs). The method is applied to group yeast (Saccharomyces cerevisiae) genes according to both expression profiles and Gene Ontology (GO) annotations. The combination of multiple databases is supposed to provide a better biological definition and separation of gene clusters. We compare different levels of genome-w...

Analysis of Large-scale Gene Expression Data

The advent of cDNA and oligonucleotide microarray technologies has led to a paradigm shift in biological investigation, such that the bottleneck in research is shifting from data generation to data analysis. Hierarchical clustering, divisive clustering, self-organizing maps and k-means clustering have all been recently used to make sense of this mass of data.

Curve-Based Clustering of Time Course Gene Expression Data Using Self-Organizing Maps

There is increasing interest in clustering time course gene expression data to investigate a wide range of biological processes. However, developing a clustering algorithm ideal for time course gene expression data is still challenging. As timing is an important factor in defining true clusters, a clustering algorithm should explore expression correlations between time points in order to achieve...

Publication date: 2002