Partitional Clustering of Protein Sequences - An Inductive Logic Programming Approach
Abstract
We present a novel approach to clustering sets of protein sequences, based on Inductive Logic Programming (ILP). Preliminary results show that the proposed method produces understandable descriptions/explanations of the clusters. Furthermore, it can be used as a knowledge elicitation tool to explain clusters proposed by other clustering approaches, such as standard phylogenetic programs.
Related resources
Relational Sequence Alignments and Logos
The need to measure sequence similarity arises in many application domains and often coincides with sequence alignment: the more similar two sequences are, the better they can be aligned. Aligning sequences not only shows how similar sequences are, it also shows where there are differences and correspondences between the sequences. Traditionally, the alignment has been considered for sequence...
Top-Down Induction of Clustering Trees
An approach to clustering is presented that adapts the basic top-down induction of decision trees method towards clustering. To this aim, it employs the principles of instance based learning. The resulting methodology is implemented in the TIC (Top down Induction of Clustering trees) system for first order clustering. The TIC system employs the first order logical decision tree representation o...
"Say EM" for Selecting Probabilistic Models for Logical Sequences
Many real-world sequences, such as protein secondary structures or shell logs, exhibit rich internal structure. Traditional probabilistic models of sequences, however, consider sequences of flat symbols only. Logical hidden Markov models have been proposed as one solution. They deal with logical sequences, i.e., sequences over an alphabet of logical atoms. This comes at the expense of a more c...
Accurate Prediction of Protein Functional Class from Sequence in the M. tuberculosis and E. coli Genomes using Data Mining
The analysis of genomics data needs to become as automated as its generation. Here we present a novel data-mining approach to predicting protein functional class from sequence. This method is based on a combination of inductive logic programming clustering and rule learning. We demonstrate the effectiveness of this approach on the M. tu...
Learning functional logic classification concepts from databases
In this paper we examine the possibilities, advantages and shortcomings of addressing different data-mining problems with the Inductive Functional Logic Programming (IFLP) paradigm. As a functional extension of the Inductive Logic Programming (ILP) approach, IFLP has all the advantages of the latter but the potential of a more natural representation language for classification, clustering and f...
Publication date: 2009