Standardization and denoising algorithms for mass spectra to classify whole-organism bacterial specimens
Abstract
MOTIVATION: Application of mass spectrometry in proteomics is a breakthrough in high-throughput analyses. Early applications have focused on protein expression profiles to differentiate among various types of tissue samples (e.g. normal versus tumor). Here our goal is to use mass spectra to differentiate bacterial species using whole-organism samples. The raw spectra are similar to spectra of tissue samples, raising some of the same statistical issues (e.g. non-uniform baselines and higher noise associated with higher baseline), but are substantially noisier. As a result, new preprocessing procedures are required before these spectra can be used for statistical classification.
RESULTS: In this study, we introduce novel preprocessing steps that can be used with any mass spectra. These comprise a standardization step and a denoising step. The noise level for each spectrum is determined using only data from that spectrum. Only spectral features that exceed a threshold defined by the noise level are subsequently used for classification. Using this approach, we trained the Random Forest program to classify 240 mass spectra into four bacterial types. The method resulted in zero prediction errors in the training samples and in two test datasets having 240 and 300 spectra, respectively.
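The denoising idea described above (a per-spectrum noise level, then a threshold on features) can be illustrated with a minimal sketch. The abstract does not say how the noise level or the threshold is computed, so everything specific here is an assumption for illustration only: the median absolute deviation (MAD) as the noise estimator, the multiplier `k`, the function names `estimate_noise` and `denoise`, and the toy intensities. The standardization step and the Random Forest classifier are not shown.

```python
from statistics import median

def estimate_noise(intensities):
    """Noise level computed from this one spectrum only.

    Assumption: uses the median absolute deviation (MAD), which is not
    necessarily the estimator used in the paper.
    """
    med = median(intensities)
    mad = median(abs(x - med) for x in intensities)
    # 1.4826 rescales the MAD to a standard deviation under Gaussian noise.
    return 1.4826 * mad

def denoise(intensities, k=3.0):
    """Keep points above median + k * noise; set all others to 0.0.

    Assumption: k = 3.0 is an arbitrary illustrative multiplier.
    """
    threshold = median(intensities) + k * estimate_noise(intensities)
    return [x if x > threshold else 0.0 for x in intensities]

# Toy spectrum (made-up intensities): a flat noisy floor with two peaks.
spectrum = [1.0, 1.2, 0.9, 1.1, 9.5, 1.0, 0.8, 1.3, 7.2, 1.1]
features = denoise(spectrum)
print(features)
```

In this toy example only the two peaks survive the threshold. In a pipeline of the kind the abstract outlines, the surviving features from each spectrum would then be passed to the classifier.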
Similar resources
A COMPARATIVE ANALYSIS OF WAVELET-BASED FEMG SIGNAL DENOISING WITH THRESHOLD FUNCTIONS AND FACIAL EXPRESSION CLASSIFICATION USING SVM AND LSSVM
This work presents a technique for the analysis of Facial Electromyogram signal activities to classify five different facial expressions for Computer-Muscle Interfacing applications. Facial Electromyogram (FEMG) is a technique for recording the asynchronous activation of neurons inside the face muscles with non-invasive electrodes. FEMG pattern recognition is a difficult task for the researche...
Preprocessing of tandem mass spectra using machine learning methods
Protein identification has been more helpful than before in the diagnosis and treatment of many diseases, such as cancer, heart disease and HIV. Tandem mass spectrometry is a powerful tool for protein identification. In a typical experiment, proteins are broken into small amino acid oligomers called peptides. By determining the amino acid sequence of several peptides of a protein, its whole ami...
A Bayesian approach for image denoising in MRI
Magnetic Resonance Imaging (MRI) is a notable medical imaging technique that is based on Nuclear Magnetic Resonance (NMR). MRI is a safe imaging method with high contrast between soft tissues, which made it the most popular imaging technique in clinical applications. MR image's visual quality plays a vital role in medical diagnostics that can be severely corrupted by existing noise duri...
FT-Raman Spectra of Saffron (Crocus sativus L.); A Possible Method for Standardization of Saffron
The FT-Raman spectrum of saffron (Crocus sativus L.) is reported with a partial assignment. Based on the Raman data, it is concluded that the main pigments in saffron are crocins and crocetin. It is proposed that the quickly attainable FT-Raman spectrum of solid saffron may be used as a means of saffron standardization.
Comparative Analysis of Image Denoising Methods Based on Wavelet Transform and Threshold Functions
There are many unavoidable noise interferences in image acquisition and transmission. To make it better for subsequent processing, the noise in the image should be removed in advance. There are many kinds of image noises, mainly including salt and pepper noise and Gaussian noise. This paper focuses on the research of the Gaussian noise removal. It introduces many wavelet threshold denoising alg...
Journal:
- Bioinformatics
Volume 20, Issue 17
Publication date: 2004