File Formats Commonly Used in Mass Spectrometry Proteomics*
نویسنده
چکیده
The application of mass spectrometry (MS) to the analysis of proteomes has enabled the high-throughput identification and abundance measurement of hundreds to thousands of proteins per experiment. However, the formidable informatics challenge associated with analyzing MS data has required a wide variety of data file formats to encode the complex data types associated with MS workflows. These formats encompass the encoding of input instruction for instruments, output products of the instruments, and several levels of information and results used by and produced by the informatics analysis tools. A brief overview of the most common file formats in use today is presented here, along with a discussion of related topics.
منابع مشابه
ProteomeCommons.org IO Framework: reading and writing multiple proteomics data formats
MOTIVATION Effective use of proteomics data, specifically mass spectrometry data, relies on the ability to read and write the many mass spectrometer file formats. Even with mass spectrometer vendor-specific libraries and vendor-neutral file formats, such as mzXML and mzData it can be difficult to extract raw data files in a form suitable for batch processing and basic research. Introduced here ...
متن کاملjmzReader: A Java parser library to process and visualize multiple text and XML-based mass spectrometry data formats
We here present the jmzReader library: a collection of Java application programming interfaces (APIs) to parse the most commonly used peak list and XML-based mass spectrometry (MS) data formats: DTA, MS2, MGF, PKL, mzXML, mzData, and mzML (based on the already existing API jmzML). The library is optimized to be used in conjunction with mzIdentML, the recently released standard data format for r...
متن کاملA uniform proteomics MS/MS analysis platform utilizing open XML file formats
The analysis of tandem mass (MS/MS) data to identify and quantify proteins is hampered by the heterogeneity of file formats at the raw spectral data, peptide identification, and protein identification levels. Different mass spectrometers output their raw spectral data in a variety of proprietary formats, and alternative methods that assign peptides to MS/MS spectra and infer protein identificat...
متن کاملmspire: mass spectrometry proteomics in Ruby
UNLABELLED Mass spectrometry-based proteomics stands to gain from additional analysis of its data, but its large, complex datasets make demands on speed and memory usage requiring special consideration from scripting languages. The software library 'mspire'-developed in the Ruby programming language-offers quick and memory-efficient readers for standard xml proteomics formats, converters for in...
متن کاملNumerical compression schemes for proteomics mass spectrometry data.
The open XML format mzML, used for representation of MS data, is pivotal for the development of platform-independent MS analysis software. Although conversion from vendor formats to mzML must take place on a platform on which the vendor libraries are available (i.e. Windows), once mzML files have been generated, they can be used on any platform. However, the mzML format has turned out to be les...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 11 شماره
صفحات -
تاریخ انتشار 2012