Clustering Time-Series Gene Expression Data Using Smoothing Spline Derivatives

Authors

  • Sébastien Déjean
  • Pascal G. P. Martin
  • Alain Baccini
  • Philippe Besse
Abstract

Microarray data acquired during time-course experiments allow the temporal variations in gene expression to be monitored. An original postprandial fasting experiment was conducted in the mouse, and the expression of 200 genes was monitored with a dedicated macroarray at 11 time points between 0 and 72 hours of fasting. The aim of this study was to provide a relevant clustering of the temporal profiles of gene expression. This was achieved by focusing on the shapes of the curves rather than on the absolute level of expression. Specifically, we combined spline smoothing and first-derivative computation with hierarchical and partitioning clustering. A heuristic approach was proposed to tune the spline smoothing parameter using both statistical and biological considerations. Clusters were illustrated a posteriori through principal component analysis and heatmap visualization. Most results were found to be in agreement with the literature on the effects of fasting on the mouse liver, and they provide promising directions for future biological investigations.
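
The shape-based idea in the abstract (smooth each profile with a spline, take the first derivative, then cluster the derivative curves) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the evenly spaced time grid, the synthetic profiles, the smoothing value `smooth`, the Ward linkage, and the helper names `derivative_profiles` and `cluster_by_shape` are all assumptions made for the example. Only the hierarchical step is shown; a partitioning method could be applied to the same derivative matrix.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.cluster.hierarchy import linkage, fcluster


def derivative_profiles(times, expr, smooth, n_grid=50):
    """Fit a cubic smoothing spline to each row of `expr` and
    evaluate its first derivative on a regular grid."""
    grid = np.linspace(times.min(), times.max(), n_grid)
    derivs = np.empty((expr.shape[0], n_grid))
    for i, profile in enumerate(expr):
        # `s` bounds the residual sum of squares; larger means smoother.
        spline = UnivariateSpline(times, profile, k=3, s=smooth)
        derivs[i] = spline.derivative(n=1)(grid)
    return derivs


def cluster_by_shape(times, expr, smooth, n_clusters):
    """Hierarchical clustering (Ward linkage, an assumption here)
    of the derivative curves rather than the raw expression levels."""
    derivs = derivative_profiles(times, expr, smooth)
    tree = linkage(derivs, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic data, for illustration only: 11 time points between 0 and 72 h
# (evenly spaced here, which is an assumption), six hypothetical genes.
rng = np.random.default_rng(0)
times = np.linspace(0.0, 72.0, 11)
baselines = (0.0, 2.0, 5.0)
rising = np.array([b + 3.0 * times / 72.0 for b in baselines])
falling = np.array([b - 3.0 * times / 72.0 for b in baselines])
expr = np.vstack([rising, falling]) + rng.normal(scale=0.02, size=(6, 11))

labels = cluster_by_shape(times, expr, smooth=0.05, n_clusters=2)
print(labels)
```

Because the derivative discards the baseline level, the three rising profiles fall into one cluster and the three falling profiles into the other, even though their absolute expression levels differ. In practice the smoothing value would need tuning, which the abstract describes as a heuristic step guided by statistical and biological considerations.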

Similar resources

Fuzzy clustering of time series gene expression data with cubic-spline

Data clustering techniques have been applied to extract information from gene expression data for two decades. A large number of novel clustering algorithms have been developed and have achieved great success. However, due to the diverse structures and intensive noise, there is no reliable clustering approach that can be applied to all gene expression data. In this paper, we aim to the feature of high no...

Curve-Based Clustering of Time Course Gene Expression Data Using Self-Organizing Maps

There is an increasing interest in clustering time course gene expression data to investigate a wide range of biological processes. However, developing a clustering algorithm ideal for time course gene expression data is still challenging. As timing is an important factor in defining true clusters, a clustering algorithm should explore expression correlations between time points in order to achieve...

Missing Value Estimation in DNA Microarrays Using B-Splines

Gene expression profiles generated by high-throughput microarray experiments are usually in the form of large matrices with high dimensionality. Unfortunately, microarray experiments can generate data sets with multiple missing values, which significantly affect the performance of subsequent statistical analysis and machine learning algorithms. Numerous imputation algorithms have been propos...

Microarray Time-Series Data Clustering via Multiple Alignment of Gene Expression Profiles

Genes with similar expression profiles are expected to be functionally related or co-regulated. In this direction, clustering microarray time-series data via pairwise alignment of piece-wise linear profiles has been recently introduced. We propose a k-means clustering approach based on a multiple alignment of natural cubic spline representations of gene expression profiles. The multiple alignme...

Using single-index ODEs to study dynamic gene regulatory network

With the development of biotechnology, high-throughput studies on protein-protein, protein-gene, and gene-gene interactions have become possible and attract remarkable attention. To explore the interactions in dynamic gene regulatory networks, we propose a single-index ordinary differential equation (ODE) model and develop a variable selection procedure. We employ the smoothly clipped absolute devia...

Volume: 2007

Publication date: 2007