Improving ASR processing of ungrammatical utterances through grammatical error modeling
Abstract
Automatic speech recognition (ASR) of non-native utterances that contain grammatical errors is problematic. This paper presents a new method for recognizing such utterances more accurately. In brief, the method extracts error patterns automatically from a learner corpus, formulates rewrite rules for these syntactic and morphological errors, builds finite state grammars (FSGs) from the rules, and uses these FSGs as language models in ASR systems. All rules, whether used in isolation or in different combinations, yield lower word error rates (WERs).
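The pipeline summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the two rewrite rules, the example sentence, and the helper names (`build_fsg`, `expand`, `accepts`, `wer`) are hypothetical and chosen only to show how error rules might widen a grammar and how WER is conventionally computed as word-level edit distance divided by reference length.

```python
from itertools import product

# Hypothetical rewrite rules (NOT taken from the paper): each correct word maps
# to the variants a learner might produce; "" stands for omitting the word.
REWRITE_RULES = {
    "the": ["the", ""],       # article omission (syntactic error)
    "goes": ["goes", "go"],   # missing agreement marker (morphological error)
}

def build_fsg(target, rules):
    """Build a linear FSG: one tuple of alternative words per target word."""
    return [tuple(rules.get(word, [word])) for word in target.split()]

def expand(fsg):
    """Enumerate every word sequence the linear FSG accepts."""
    sentences = set()
    for choice in product(*fsg):
        sentences.add(" ".join(word for word in choice if word))
    return sentences

def accepts(fsg, utterance):
    """Return True if the utterance is one of the sequences the FSG allows."""
    return utterance in expand(fsg)

def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,                           # deletion
                           cur[j - 1] + 1,                        # insertion
                           prev[j - 1] + (ref_word != hyp_word))) # substitution
        prev = cur
    return prev[-1] / max(len(ref), 1)

fsg = build_fsg("she goes to the school", REWRITE_RULES)
print(sorted(expand(fsg)))
print(accepts(fsg, "she go to school"))
print(wer("she goes to the school", "she go to school"))
```

In a real recognizer the FSG would constrain decoding rather than be enumerated exhaustively; the enumeration here only makes visible which ungrammatical variants the rules admit.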
Similar resources
Negative evidence in language acquisition.
Whether children require "negative evidence" (i.e., information about which strings of words are not grammatical sentences) to eliminate their ungrammatical utterances is a central question in language acquisition because, lacking negative evidence, a child would require internal mechanisms to unlearn grammatical errors. Several recent studies argue that parents provide noisy feedback, that is,...
Treebanks Gone Bad: Parser Evaluation and Retraining using a Treebank of Ungrammatical Sentences
This article describes how a treebank of ungrammatical sentences can be created from a treebank of well-formed sentences. The treebank creation procedure involves the automatic introduction of frequently occurring grammatical errors into the sentences in an existing treebank, and the minimal transformation of the original analyses in the treebank so that they describe the newly created ill-form...
A hybrid approach for correcting grammatical errors
This paper presents a hybrid approach for correcting grammatical errors in the sentences uttered by Korean learners of English. The error correction system plays an important role in GenieTutor, which is a dialogue-based English learning system designed to teach English to Korean students. During the talk with GenieTutor, grammatical error feedback and better expressions are offered to learners...
The Effect of Multiple Grammatical Errors on Processing Non-Native Writing
In this work, we estimate the deterioration of NLP processing given an estimate of the amount and nature of grammatical errors in a text. From a corpus of essays written by English-language learners, we extract ungrammatical sentences, controlling the number and types of errors in each sentence. We focus on six categories of errors that are commonly made by English-language learners, and consid...
GenERRate: Generating Errors for Use in Grammatical Error Detection
This paper explores the issue of automatically generated ungrammatical data and its use in error detection, with a focus on the task of classifying a sentence as grammatical or ungrammatical. We present an error generation tool called GenERRate and show how GenERRate can be used to improve the performance of a classifier on learner data. We describe initial attempts to replicate Cambridge Learn...
Publication date: 2011