Benefits from Variational Regularization in Language Models
Authors
Abstract
Representations from common pre-trained language models have been shown to suffer from the degeneration problem, i.e., they occupy a narrow cone in latent space. This problem can be addressed by enforcing isotropy. In analogy with variational autoencoders, we suggest applying a token-level variational loss to a Transformer architecture and optimizing the standard deviation of the prior distribution as a model parameter to increase isotropy. The resulting latent space is complete and interpretable: any given point is a valid embedding and can be decoded into text again. This allows for manipulations such as paraphrase generation directly in latent space. Surprisingly, features extracted at the sentence level also show competitive results on benchmark classification tasks.
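The token-level variational loss described above can be sketched as follows. This is a minimal, hedged illustration, assuming a diagonal Gaussian posterior per token and a zero-mean Gaussian prior whose standard deviation is treated as a trainable scalar; the function name `kl_to_prior` and the NumPy formulation are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def kl_to_prior(mu, sigma, prior_sigma):
    """Closed-form KL( N(mu, sigma^2) || N(0, prior_sigma^2) ), element-wise,
    summed over the embedding dimension to give one KL value per token.

    mu, sigma   : arrays of shape (seq_len, dim), per-token posterior params
    prior_sigma : scalar; in the paper's setup this would be a trainable
                  model parameter that controls isotropy of the latent space
    """
    kl = (np.log(prior_sigma / sigma)
          + (sigma ** 2 + mu ** 2) / (2.0 * prior_sigma ** 2)
          - 0.5)
    return kl.sum(axis=-1)

# When the posterior matches the prior exactly, the penalty vanishes:
mu = np.zeros((1, 4))
sigma = np.full((1, 4), 2.0)
print(kl_to_prior(mu, sigma, prior_sigma=2.0))  # → [0.]
```

A larger `prior_sigma` spreads the prior mass over a wider region of latent space, which is one way to think about the isotropy-increasing effect the abstract mentions.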
Similar resources
Global Solutions of Variational Models with Convex Regularization
We propose an algorithmic framework for computing global solutions of variational models with convex regularity terms that permit quite arbitrary data terms. While the minimization of variational problems with convex data and regularity terms is straightforward (using, for example, gradient descent), this is no longer trivial for functionals with nonconvex data terms. Using the theoretical fram...
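To illustrate the convex setting this snippet refers to, here is a minimal sketch using a quadratic (Tikhonov) regularity term with a quadratic data term, where plain gradient descent reaches the unique global minimizer; the helper name `tikhonov_gd` and all parameter values are hypothetical, and the cited paper handles far more general (including nonconvex) data terms:

```python
import numpy as np

def tikhonov_gd(f, lam=0.5, step=0.1, iters=200):
    """Gradient descent on the convex variational model
        E(u) = 0.5 * ||u - f||^2 + 0.5 * lam * ||u||^2,
    whose unique global minimizer is u* = f / (1 + lam).
    Because E is strongly convex, any small enough step size converges
    to the global solution -- the easy case the snippet contrasts with
    nonconvex data terms."""
    u = np.zeros_like(f)
    for _ in range(iters):
        grad = (u - f) + lam * u  # dE/du
        u = u - step * grad
    return u

f = np.array([2.0, -4.0, 6.0])
print(tikhonov_gd(f))  # approaches f / 1.5
```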
Implicit Regularization in Variational Bayesian Matrix Factorization
Matrix factorization into the product of lowrank matrices induces non-identifiability, i.e., the mapping between the target matrix and factorized matrices is not one-to-one. In this paper, we theoretically investigate the influence of non-identifiability on Bayesian matrix factorization. More specifically, we show that a variational Bayesian method involves regularization effect even when the p...
Discretization of variational regularization in Banach spaces
Consider a nonlinear ill-posed operator equation F (u) = y where F is defined on a Banach space X. In this paper we analyze finite dimensional variational regularization, which takes into account operator approximations and noisy data. As shown in the literature, depending on the setting, convergence of the regularized solutions of the finite dimensional problems can be with respect to the stro...
Convergence Rates of Convex Variational Regularization
The aim of this paper is to provide quantitative estimates for the minimizers of non-quadratic regularization problems in terms of the regularization parameter respectively the noise level. As usual for illposed inverse problems, these estimates can be obtained only under additional smoothness assumptions on the data, so-called source conditions, which we identify with the existence of Lagrange...
Least Square Variational Bayesian Autoencoder with Regularization
In recent years Variational Autoencoders have become one of the most popular approaches to unsupervised learning of complicated distributions. The Variational Autoencoder (VAE) provides more efficient reconstruction performance than a traditional autoencoder. Variational autoencoders also make better approximations than MCMC. The VAE defines a generative process in terms of ancestral sampling through a cascade of hidde...
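The ancestral-sampling view of a VAE mentioned in this snippet hinges on the reparameterization trick, which makes the sampling step differentiable. A minimal sketch follows; the function name and NumPy formulation are illustrative assumptions, not from the cited paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).
    Writing the sample as a deterministic function of (mu, log_var) plus
    external noise lets gradients flow through the sampling step, which is
    what makes VAE training by backpropagation possible."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

z = reparameterize(np.zeros(4), np.zeros(4))  # one draw from N(0, I)
```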
Journal
Journal title: Machine Learning and Knowledge Extraction
Year: 2022
ISSN: 2504-4990
DOI: https://doi.org/10.3390/make4020025