A Sequence Alignment Model Based on the Averaged Perceptron
Abstract
We describe a discriminatively trained sequence alignment model based on the averaged perceptron. In common with other approaches to sequence modeling using perceptrons, and in contrast with comparable generative models, this model permits and transparently exploits arbitrary features of input strings. The simplicity of perceptron training lends more versatility than comparable approaches, allowing the model to be applied to a variety of problem types for which a learned edit model might be useful. We enumerate some of these problem types, describe a training procedure for each, and evaluate the model’s performance on several problems. We show that the proposed model performs at least as well as an approach based on statistical machine translation on two problems of name transliteration, and provide evidence that the combination of the two approaches promises further improvement.
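As a rough illustration of the idea in the abstract, the following is a minimal sketch of an averaged perceptron over edit operations. It is not the authors' implementation. It assumes a simple ranking setup in which each training example pairs a source string with one correct target and some incorrect candidates, and it uses only single-character substitution, insertion, and deletion features. The names `best_alignment` and `train`, the feature tuples, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def best_alignment(src, tgt, w):
    """Find the highest-scoring monotone alignment of src to tgt.

    w maps edit-operation features to weights (missing features count as 0).
    Returns (score, Counter of the features used by the best alignment).
    """
    n, m = len(src), len(tgt)
    # score[i][j] is the best score for aligning src[:i] with tgt[:j].
    score = [[float("-inf")] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            moves = []
            if i > 0 and j > 0:
                moves.append((i - 1, j - 1, ("sub", src[i - 1], tgt[j - 1])))
            if i > 0:
                moves.append((i - 1, j, ("del", src[i - 1])))
            if j > 0:
                moves.append((i, j - 1, ("ins", tgt[j - 1])))
            for pi, pj, feat in moves:
                s = score[pi][pj] + w.get(feat, 0.0)
                if s > score[i][j]:
                    score[i][j] = s
                    back[i][j] = (pi, pj, feat)
    # Walk the back-pointers to collect the features of the best path.
    feats = Counter()
    i, j = n, m
    while (i, j) != (0, 0):
        i, j, feat = back[i][j]
        feats[feat] += 1
    return score[n][m], feats


def train(examples, epochs=5):
    """Averaged perceptron training (assumed ranking setup).

    examples: list of (src, correct_tgt, [wrong_tgt, ...]).
    Returns the average of the weight vector over all training steps.
    """
    w = defaultdict(float)
    total = defaultdict(float)  # running sum of weight vectors
    steps = 0
    for _ in range(epochs):
        for src, good, bads in examples:
            good_score, good_feats = best_alignment(src, good, w)
            bad_score, bad_feats = max(
                (best_alignment(src, b, w) for b in bads),
                key=lambda pair: pair[0],
            )
            # Standard perceptron update when a wrong candidate is not beaten.
            if bad_score >= good_score:
                for feat, count in good_feats.items():
                    w[feat] += count
                for feat, count in bad_feats.items():
                    w[feat] -= count
            # Naive averaging: accumulate the full weight vector each step.
            for feat, value in w.items():
                total[feat] += value
            steps += 1
    return {feat: value / steps for feat, value in total.items()}


# Toy data, invented for illustration only.
toy = [("abc", "abc", ["xyz"]), ("kat", "cat", ["dog"])]
weights = train(toy)
```

Because the features are plain tuples, richer features of the input strings (for example, character context around an edit) could be added in `best_alignment` without changing the update rule, which is the kind of flexibility the abstract attributes to perceptron training.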
Similar resources
A Discriminative Learning Model for Coordinate Conjunctions
We propose a sequence-alignment based method for detecting and disambiguating coordinate conjunctions. In this method, averaged perceptron learning is used to adapt the substitution matrix to the training data drawn from the target language and domain. To reduce the cost of training data construction, our method accepts training examples in which complete word-by-word alignment labels are missi...
An Application of the ABS LX Algorithm to Multiple Sequence Alignment
We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. The important features of our work include the facility of alignment of multiple sequences simultaneously and no limit for the length of the sequences. Our goal here is to ...
A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences
The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
Coordinate Structure Analysis with Global Structural Constraints and Alignment-Based Local Features
We propose a hybrid approach to coordinate structure analysis that combines a simple grammar to ensure consistent global structure of coordinations in a sentence, and features based on sequence alignment to capture local symmetry of conjuncts. The weight of the alignment-based features, which in turn determines the score of coordinate structures, is optimized by perceptron training on a given co...
Publication date: 2007