Semi-Supervised Adaptation of RNNLMs by Fine-Tuning with Domain-Specific Auxiliary Features

Authors

  • Salil Deena
  • Raymond W. M. Ng
  • Pranava Swaroop Madhyastha
  • Lucia Specia
  • Thomas Hain
Abstract

Recurrent neural network language models (RNNLMs) can be augmented with auxiliary features, which provide an extra modality on top of the words. It has been found that RNNLMs perform best when trained on a large corpus of generic text and then fine-tuned on text from the sub-domain in which they are to be applied. In many cases, however, the auxiliary features are available for the sub-domain text but not for the generic text. In such cases, semi-supervised techniques can be used to infer the features for the generic text, so that the RNNLM can be trained on it and then fine-tuned on the available in-domain data with its corresponding auxiliary features. In this paper, several novel approaches are investigated for the semi-supervised adaptation of RNNLMs with auxiliary features as input: using zero features during training to mask the weights of the feature sub-network; adding the feature sub-network only at fine-tuning time; deriving the features using a parametric model; and back-propagating to infer the features on the generic text. Results are reported in terms of both perplexity (PPL) and word error rate (WER) on a multi-genre broadcast ASR task.
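To make the setup concrete, below is a minimal sketch of an RNNLM with an auxiliary feature sub-network, together with the first approach listed above (zero features during training). This is an illustrative assumption written in PyTorch, not the authors' implementation; the class name, layer sizes, and the additive way the feature projection enters the hidden state are all placeholders.

import torch
import torch.nn as nn

class FeatureAugmentedRNNLM(nn.Module):
    """RNNLM with an auxiliary feature sub-network (illustrative sketch)."""
    def __init__(self, vocab_size, embed_dim, hidden_dim, feat_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Feature sub-network: maps the auxiliary (e.g. domain) features into
        # the hidden space. bias=False means an all-zero feature vector yields
        # exactly zero contribution and zero gradient for these weights.
        self.feat_net = nn.Linear(feat_dim, hidden_dim, bias=False)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, words, feats):
        # words: (batch, seq_len) word ids; feats: (batch, feat_dim)
        h, _ = self.rnn(self.embed(words))
        # Add the per-sequence feature projection at every time step.
        h = h + self.feat_net(feats).unsqueeze(1)
        return self.out(h)

model = FeatureAugmentedRNNLM(vocab_size=10000, embed_dim=128,
                              hidden_dim=256, feat_dim=32)

# "Zero features during training": on the generic corpus, where auxiliary
# features are unavailable, all-zero vectors mask the feature sub-network
# (zero input means its weights receive no gradient); real in-domain
# features are supplied only at fine-tuning time.
words = torch.randint(0, 10000, (4, 20))
generic_feats = torch.zeros(4, 32)
logits = model(words, generic_feats)   # (4, 20, 10000) next-word logits

Under this sketch, fine-tuning on in-domain data simply replaces the zero vectors with real feature vectors, at which point the feature sub-network starts receiving gradient and adapting.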


Similar resources

Combining Feature and Model-Based Adaptation of RNNLMs for Multi-Genre Broadcast Speech Recognition

Recurrent neural network language models (RNNLMs) have consistently outperformed n-gram language models when used in automatic speech recognition (ASR). This is because RNNLMs provide robust parameter estimation through the use of a continuous-space representation of words, and can generally model longer context dependencies than n-grams. The adaptation of RNNLMs to new domains remains an activ...


Domain Adaptation as Learning with Auxiliary Information

Many non-standard learning models, such as semi-supervised learning or transfer learning, can be viewed as learning with auxiliary information. We discuss how to formally analyze the benefits of auxiliary information and suggest a formal framework for this. Further, we consider a particular case of learning with auxiliary information, namely regularized least squares regression with an extra inp...
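As a generic illustration of that truncated example (a standard formulation, not necessarily the exact one used in the cited paper), regularized least squares with an extra auxiliary input $z_i$ alongside each primary input $x_i$ can be written as

\[
\min_{w,\,v}\ \sum_{i=1}^{n}\bigl(y_i - w^{\top}x_i - v^{\top}z_i\bigr)^2 \;+\; \lambda\lVert w\rVert_2^2 \;+\; \mu\lVert v\rVert_2^2,
\]

where separate regularizers $\lambda$ and $\mu$ control how strongly the primary and auxiliary channels are penalized.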


Unsupervised Adaptation of Recurrent Neural Network Language Models

Recurrent neural network language models (RNNLMs) have been shown to consistently improve word error rates (WERs) of large-vocabulary speech recognition systems employing n-gram LMs. In this paper we investigate supervised and unsupervised discriminative adaptation of RNNLMs in a broadcast transcription task to target domains defined by either genre or show. We have explored two approaches based...


Learning Transferrable Representations for Unsupervised Domain Adaptation

Supervised learning with large-scale labelled datasets and deep layered models has caused a paradigm shift in diverse areas of learning and recognition. However, this approach still suffers from generalization issues in the presence of a domain shift between the training and the test data distributions. Since unsupervised domain adaptation algorithms directly address this domain shift problem...


Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation

We consider a semi-supervised setting for domain adaptation where only unlabeled data is available for the target domain. One way to tackle this problem is to train a generative model with latent variables on the mixture of data from the source and target domains. Such a model would cluster features in both domains and ensure that at least some of the latent variables are predictive of the labe...




Publication date: 2017