Paradigmatic Cascades: a Linguistically Sound Model of Pronunciation by Analogy

Author

  • François Yvon
Abstract

We present and experimentally evaluate a new model of pronunciation by analogy: the paradigmatic cascades model. Given a pronunciation lexicon, this algorithm first extracts the most productive paradigmatic mappings in the graphemic domain, and pairs them statistically with their correlate(s) in the phonemic domain. These mappings are then used to search the lexical database for the most promising analog of an unseen word. Finally, the correlated series of mappings in the phonemic domain is applied to the analog's pronunciation to obtain the desired pronunciation.

1 Motivation

Psychological models of reading aloud traditionally assume the existence of two separate routes for converting print to sound: a direct lexical route, which is used to read familiar words, and a dual route relying upon abstract letter-to-sound rules to pronounce previously unseen words (Coltheart, 1978; Coltheart et al., 1993). This view has been challenged by a number of authors (e.g. Glushko, 1981), who claim that the pronunciation process of every word, familiar or unknown, could be accounted for in a unified framework. These single-route models crucially suggest that the pronunciation of unknown words results from the parallel activation of similar lexical items (the lexical neighbours). This idea has been tentatively implemented both in various symbolic analogy-based algorithms (e.g. Dedina and Nusbaum, 1991; Sullivan and Damper, 1992) and in connectionist pronunciation devices (e.g. Seidenberg and McClelland, 1989).

The basic idea of these analogy-based models is to pronounce an unknown word x by recombining pronunciations of lexical items sharing common subparts with x. To illustrate this strategy, Dedina and Nusbaum show how the pronunciation of the sequence lop in the pseudo-word blope is analogized with the pronunciation of the same sequence in sloping.
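The search for lexical items that share subparts with an unknown word can be illustrated with a minimal Python sketch. This is not Dedina and Nusbaum's algorithm: the toy lexicon, the function names `longest_common_substring` and `rank_neighbours`, and the choice to rank candidates by a single longest shared substring are all assumptions made for illustration only.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous substring shared by a and b.

    Standard dynamic programming over suffix-match lengths; on ties the
    first longest match found while scanning a is returned.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


def rank_neighbours(word, lexicon):
    """Pair each lexical item with its longest substring shared with word,
    longest shared substring first (hypothetical ranking criterion)."""
    scored = [(entry, longest_common_substring(word, entry)) for entry in lexicon]
    return sorted(scored, key=lambda pair: -len(pair[1]))


# Toy lexicon, invented for this example.
TOY_LEXICON = ["blue", "sloping", "cat"]

if __name__ == "__main__":
    for entry, shared in rank_neighbours("blope", TOY_LEXICON):
        print(entry, repr(shared))
```

With this toy lexicon, sloping comes first for blope because the two share the three-letter sequence lop, which matches the example given in the text.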
As there exists more than one way to recombine segments of lexical items, Dedina and Nusbaum's algorithm favors recombinations including large substrings of existing words. In this model, the similarity between two words is thus implicitly defined as a function of the length of their common subparts: the longer the common part, the better the analogy.

This conception of analogical processes has an important consequence: it offers, as Damper and Eastmond (1996) put it, "no principled way of deciding the orthographic neighbours of a novel word which are deemed to influence its pronunciation (...)". For example, in the model proposed by Dedina and Nusbaum, any word having a common orthographic substring with the unknown word is likely to contribute to its pronunciation, which increases the number of lexical neighbours far beyond acceptable limits (in the case of blope, this neighbourhood would contain every English word starting in bl, or ending in ope, etc.).

From a computational standpoint, implementing the recombination strategy requires a one-to-one alignment between the lexical graphemic and phonemic representations, where each grapheme is matched with the corresponding phoneme (a null symbol is used to account for the cases where the lengths of these representations differ). This alignment makes it possible to retrieve, for any graphemic substring of a given lexical item, the corresponding phonemic string, at the cost, however, of an unmotivated complexification of lexical representations. In comparison, the paradigmatic cascades model (PCP for short) promotes an alternative view of analogical processes, which relies upon a linguistically motivated similarity measure between words.
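The one-to-one grapheme-to-phoneme alignment required by recombination-based models can be sketched as follows. This is a minimal illustration under stated assumptions, not code from any of the cited systems: the null symbol `_`, the ad hoc ASCII phoneme labels, the alignment chosen for sloping, and the helper names `align` and `phonemes_for_substring` are all hypothetical.

```python
NULL = "_"  # hypothetical null symbol for letters with no phoneme of their own


def align(graphemes, phonemes):
    """Pair each letter with exactly one phoneme symbol (or NULL)."""
    if len(graphemes) != len(phonemes):
        raise ValueError("one-to-one alignment needs equal lengths")
    return list(zip(graphemes, phonemes))


def phonemes_for_substring(alignment, substring):
    """Return the phonemes aligned with the first occurrence of substring,
    dropping NULL symbols, or None if the substring does not occur."""
    letters = "".join(g for g, _ in alignment)
    start = letters.find(substring)
    if start < 0:
        return None
    span = alignment[start:start + len(substring)]
    return [p for _, p in span if p != NULL]


# Illustrative alignment only: seven letters, six phonemes, so the final
# letter g is paired with NULL. The phoneme labels are ad hoc placeholders.
SLOPING = align("sloping", ["s", "l", "oU", "p", "I", "N", NULL])

if __name__ == "__main__":
    print(phonemes_for_substring(SLOPING, "lop"))
    print(phonemes_for_substring(SLOPING, "ing"))
```

The sketch also makes the cost mentioned in the text visible: every lexical entry must be stored with padding symbols that exist only to keep the two representations the same length.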


Similar resources

Pronunciation Modelling of Foreign Words for Sepedi ASR

This study focuses on the effective pronunciation modelling of words from different languages encountered during the development of a Sepedi automatic speech recognition (ASR) system. While the speech corpus used for training the ASR system consists mostly of Sepedi utterances, many words from English (and other South African languages) are embedded within the Sepedi sentences. In order to mode...

Improving pronunciation by analogy for text-to-speech applications

This paper extends previous work on pronunciation by analogy (PbA) in several directions. PbA is a data-driven method for converting letters to sound, with potential application to next-generation text-to-speech systems. We experiment with a range of methods for matching letter patterns in input words to those in the system dictionary when building a pronunciation lattice. We give preliminary c...

Learning linguistically valid pronunciations from acoustic data

We describe an algorithm to learn word pronunciations from acoustic data. The algorithm jointly optimizes the pronunciation of a word using (a) the acoustic match of this pronunciation to the observed data, and (b) how “linguistically reasonable” the pronunciation is. Variations of word pronunciations in the recognition dictionary (which was created by linguists), are used to train a model of w...

Pronunciation by Analogy: Impact of Implementational Choices on Performance

Pronunciation by analogy (PbA) is an emerging, data-driven technique with potential application in text-to-speech (TTS) systems, as well as being an influential psychological model of reading aloud. The underlying idea is that a pronunciation for an unknown word (i.e. one not in the dictionary, or lexicon, of the human or machine ‘reader’) is assembled by matching substrings of the input to sub...

Publication date: 1997