Log-Linear RNNs: Towards Recurrent Neural Networks with Flexible Prior Knowledge
Authors
Abstract
We introduce LL-RNNs (Log-Linear RNNs), an extension of Recurrent Neural Networks that replaces the softmax output layer with a log-linear output layer, of which the softmax is a special case. This conceptually simple move has two main advantages. First, it allows the learner to combat training data sparsity by modelling words (or, more generally, output symbols) as complex combinations of attributes, without requiring that each combination be directly observed in the training data (as the softmax does). Second, it permits the inclusion of flexible prior knowledge in the form of a priori specified modular features, where the neural network component learns to dynamically control the weights of a log-linear distribution exploiting these features. We conduct experiments on language modelling of French that exploit morphological prior knowledge and show an important decrease in perplexity relative to a baseline RNN. We provide other motivating illustrations, and finally argue that the log-linear and the neural-network components contribute complementary strengths to the LL-RNN: the LL aspect allows the model to incorporate rich prior knowledge, while the NN aspect, according to the “representation learning” paradigm, allows the model to discover novel combinations of characteristics. This is an updated version of the e-print arXiv:1607.02467, in particular now including experiments.
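Concretely, the abstract can be read as follows: the output distribution has the log-linear form p(w | h_t) ∝ exp(Σ_k λ_k(h_t) f_k(w)), where the f_k are a priori specified features of the output symbol w (e.g. morphological attributes) and the weights λ_k(h_t) are produced by the recurrent network from its hidden state. Below is a minimal numpy sketch of such an output layer; the names (log_linear_output, W_lambda, F) and the linear map from h_t to λ are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def log_linear_output(hidden, W_lambda, F):
    """Log-linear output layer: the RNN state controls the weights of a
    log-linear distribution over the vocabulary (an assumed reading of
    the abstract, not the paper's exact parameterization).

    hidden:   (d,)   RNN hidden state h_t
    W_lambda: (K, d) maps h_t to K feature weights lambda(h_t)
    F:        (V, K) a priori feature matrix, F[w, k] = f_k(w),
              e.g. morphological attributes of word w
    """
    lam = W_lambda @ hidden      # context-dependent feature weights
    scores = F @ lam             # (V,) log-potential of each word
    scores -= scores.max()       # numerical stability
    p = np.exp(scores)
    return p / p.sum()           # p(w | h_t)

# Softmax as a special case: one indicator feature per word (F = I)
# turns the K feature weights into the usual V logits.
V, K, d = 5, 5, 8
rng = np.random.default_rng(0)
p = log_linear_output(rng.normal(size=d), rng.normal(size=(K, d)), np.eye(V))
```

With a shared feature such as "is-plural", a word never observed in a given context can still receive probability mass through attributes it shares with observed words, which is the sparsity argument made above.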
Similar Resources
A Recurrent Neural Network Model for Solving Linear Semidefinite Programming
In this paper we solve a wide range of Semidefinite Programming (SDP) problems by using Recurrent Neural Networks (RNNs). SDP is an important numerical tool for analysis and synthesis in systems and control theory. First we reformulate the problem as a linear programming problem; second, we reformulate it as a first-order system of ordinary differential equations. Then a recurrent neural network...
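As a rough illustration of the pipeline this snippet describes (LP → first-order ODE system → recurrent updates), here is a generic penalty-based gradient flow for a linear program, integrated with Euler steps. The dynamics, names, and constants are assumptions for illustration, not the network proposed in that paper.

```python
import numpy as np

def lp_neurodynamics(A, b, c, mu=100.0, dt=1e-3, steps=20000):
    """Euler-integrate the gradient flow of the penalized objective
    c^T x + (mu/2) * ||A x - b||^2 over x >= 0; each Euler step plays
    the role of one recurrent update of a simple neurodynamic solver."""
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = c + mu * A.T @ (A @ x - b)    # gradient of penalized objective
        x = np.maximum(x - dt * grad, 0.0)   # projected (x >= 0) Euler step
    return x

# Toy LP: min x0 + 2*x1  s.t.  x0 + x1 = 1, x >= 0; the optimum is [1, 0].
A, b, c = np.array([[1.0, 1.0]]), np.array([1.0]), np.array([1.0, 2.0])
print(lp_neurodynamics(A, b, c))  # ~[0.99, 0]: the finite penalty leaves a small bias
```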
On Fast Dropout and its Applicability to Recurrent Networks
Recurrent Neural Networks (RNNs) are rich models for the processing of sequential data. Recent work on advancing the state of the art has focused on the optimization or modelling of RNNs, mostly motivated by addressing the problems of vanishing and exploding gradients. The control of overfitting has seen considerably less attention. This paper contributes to that by analyzing fast dropo...
Architectural Bias in Recurrent Neural Networks - Fractal Analysis
We have recently shown that when initialized with “small” weights, recurrent neural networks (RNNs) with standard sigmoid-type activation functions are inherently biased towards Markov models, i.e. even prior to any training, RNN dynamics can be readily used to extract finite memory machines (Hammer & Tiňo, 2002; Tiňo, Čerňanský & Beňušková, 2002; Tiňo, Čerňanský & Beňušková, 2002a). Following ...
Fault Detection and Location in DC Microgrids by Recurrent Neural Networks and Decision Tree Classifier
Microgrids have played an important role in distribution networks during recent years. DC microgrids are very popular among researchers because of their benefits. Protection is one of the significant challenges standing in the way of microgrids' progress. As a result, in this paper a fault detection and location scheme for DC microgrids is proposed. Due to advances in Artificial Intelligence (AI) and s...
Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems
Statistical spoken dialogue systems have the attractive property of being optimisable from data via interactions with real users. However, in the reinforcement learning paradigm, the dialogue manager (agent) often requires significant time to explore the state-action space and learn to behave in a desirable manner. This is a critical issue when the system is trained on-line with real user...
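The excerpt is cut off before the method details; a common foundation for this line of work is potential-based reward shaping (Ng et al., 1999), where a potential Φ, here hypothetically an RNN's estimate over the dialogue history, densifies the reward without changing the optimal policy. The sketch below shows that identity only; whether the paper uses exactly this form is not visible from the excerpt.

```python
def shaped_reward(r, phi_s, phi_s_next, gamma=0.99):
    """Potential-based shaping (Ng et al., 1999): adding
    gamma * Phi(s') - Phi(s) to the reward preserves the optimal
    policy while giving the agent a denser learning signal."""
    return r + gamma * phi_s_next - phi_s

# Hypothetical usage: Phi would come from an RNN over the dialogue
# history; here it is a fixed table over three abstract dialogue states.
phi = {"start": 0.0, "slot_filled": 0.5, "confirmed": 1.0}
print(shaped_reward(0.0, phi["start"], phi["slot_filled"]))  # 0.495
```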
Journal: CoRR
Volume: abs/1607.02467
Pages: -
Publication date: 2016