Data Mining in Proteomics Using Grid Computing
Abstract
This chapter presents Data Mining techniques for knowledge extraction in proteomics, taking into account both the particular features of most proteomics problems (such as data retrieval and system complexity) and the opportunities and constraints found in a Grid environment. It discusses how new and potentially useful knowledge can be extracted from proteomics data while using Grid resources transparently. Protein classification is introduced as a current research issue in proteomics that also exhibits most of the domain-specific traits. An overview of common and custom-made Data Mining algorithms is provided, with emphasis on the specific needs of protein classification problems. A unified methodology is presented for complex Data Mining processes on the Grid, highlighting the different application types and the benefits and drawbacks of each. Finally, the methodology is validated through real-world case studies deployed over the EGEE grid environment. DOI: 10.4018/978-1-4666-0879-5.ch4.9
Similar resources
Improving Mobile Grid Performance Using Fuzzy Job Replica Count Determiner
Grid computing refers to the combination of computer resources from multiple administrative domains to form a common computational platform. Mobile computing is a generic term for the use of portable, handheld devices with wireless communication to process data. Mobile computing focuses on providing access to data, information, services and communications anywhere an...
MS-Analyzer: preprocessing and data mining services for proteomics applications on the Grid
Mass spectrometry proteomics data contain much information about cell functions and disease conditions. The discovery of such information is enabled by the combined use of novel bioinformatics tools and data mining techniques requiring the integration of huge data sources and the composition of different software tools. The main phases of such emerging applications comprise the loading, managem...
Grid and High-Performance Computing for Applied Bioinformatics
The beginning of the twenty-first century has been characterized by an explosion of biological information. The avalanche of data grows daily as a consequence of advances in molecular biology, genomics and proteomics. The challenge for today's biologists lies in decoding this huge and complex data in order to achieve a better understanding of how our g...
Efficient Data Mining with Evolutionary Algorithms for Cloud Computing Application
With the rapid development of the internet, the amount of information and data being produced is massive. As a result, clients are overwhelmed by the volume of data and find it difficult to tell which data are useful. Data mining can overcome this problem. When data mining runs on cloud computing, it reduces processing time, energy usage and costs. As the speed of ...
Publication date: 2015