Benefits from Variational Regularization in Language Models
Authors
Abstract
Representations from common pre-trained language models have been shown to suffer from the degeneration problem, i.e., they occupy a narrow cone in latent space. This problem can be addressed by enforcing isotropy. In analogy with variational autoencoders, we suggest applying a token-level variational loss to a Transformer architecture and optimizing the standard deviation of the prior distribution as a model parameter to increase isotropy. The resulting latent space is complete and interpretable: any given point is a valid embedding and can be decoded into text again. This allows for manipulations such as paraphrase generation directly in latent space. Surprisingly, features extracted at the sentence level also show competitive results on benchmark classification tasks.
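The token-level variational loss described above can be sketched as follows. This is a minimal, hedged illustration, assuming a diagonal Gaussian posterior per token and a zero-mean Gaussian prior whose standard deviation is treated as a trainable scalar; the function name `kl_to_prior` and the NumPy formulation are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def kl_to_prior(mu, sigma, prior_sigma):
    """Closed-form KL( N(mu, sigma^2) || N(0, prior_sigma^2) ), element-wise,
    summed over the embedding dimension to give one KL value per token.

    mu, sigma   : arrays of shape (seq_len, dim), per-token posterior params
    prior_sigma : scalar; in the paper's setup this would be a trainable
                  model parameter that controls isotropy of the latent space
    """
    kl = (np.log(prior_sigma / sigma)
          + (sigma ** 2 + mu ** 2) / (2.0 * prior_sigma ** 2)
          - 0.5)
    return kl.sum(axis=-1)

# When the posterior matches the prior exactly, the penalty vanishes:
mu = np.zeros((1, 4))
sigma = np.full((1, 4), 2.0)
print(kl_to_prior(mu, sigma, prior_sigma=2.0))  # → [0.]
```

A larger `prior_sigma` spreads the prior mass over a wider region of latent space, which is one way to think about the isotropy-increasing effect the abstract mentions.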
Similar resources
Global Solutions of Variational Models with Convex Regularization
We propose an algorithmic framework for computing global solutions of variational models with convex regularity terms that permit quite arbitrary data terms. While the minimization of variational problems with convex data and regularity terms is straightforward (using, for example, gradient descent), this is no longer trivial for functionals with nonconvex data terms. Using the theoretical fram...
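To illustrate the convex setting this snippet refers to, here is a minimal sketch using a quadratic (Tikhonov) regularity term with a quadratic data term, where plain gradient descent reaches the unique global minimizer; the helper name `tikhonov_gd` and all parameter values are hypothetical, and the cited paper handles far more general (including nonconvex) data terms:

```python
import numpy as np

def tikhonov_gd(f, lam=0.5, step=0.1, iters=200):
    """Gradient descent on the convex variational model
        E(u) = 0.5 * ||u - f||^2 + 0.5 * lam * ||u||^2,
    whose unique global minimizer is u* = f / (1 + lam).
    Because E is strongly convex, any small enough step size converges
    to the global solution -- the easy case the snippet contrasts with
    nonconvex data terms."""
    u = np.zeros_like(f)
    for _ in range(iters):
        grad = (u - f) + lam * u  # dE/du
        u = u - step * grad
    return u

f = np.array([2.0, -4.0, 6.0])
print(tikhonov_gd(f))  # approaches f / 1.5
```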
Implicit Regularization in Variational Bayesian Matrix Factorization
Matrix factorization into the product of lowrank matrices induces non-identifiability, i.e., the mapping between the target matrix and factorized matrices is not one-to-one. In this paper, we theoretically investigate the influence of non-identifiability on Bayesian matrix factorization. More specifically, we show that a variational Bayesian method involves regularization effect even when the p...
Discretization of variational regularization in Banach spaces
Consider a nonlinear ill-posed operator equation F (u) = y where F is defined on a Banach space X. In this paper we analyze finite dimensional variational regularization, which takes into account operator approximations and noisy data. As shown in the literature, depending on the setting, convergence of the regularized solutions of the finite dimensional problems can be with respect to the stro...
Convergence Rates of Convex Variational Regularization
The aim of this paper is to provide quantitative estimates for the minimizers of non-quadratic regularization problems in terms of the regularization parameter respectively the noise level. As usual for illposed inverse problems, these estimates can be obtained only under additional smoothness assumptions on the data, so-called source conditions, which we identify with the existence of Lagrange...
Least Square Variational Bayesian Autoencoder with Regularization
In recent years Variational Autoencoders have become one of the most popular approaches to unsupervised learning of complicated distributions. The Variational Autoencoder (VAE) provides more efficient reconstruction performance than a traditional autoencoder. Variational autoencoders also make better approximations than MCMC. The VAE defines a generative process in terms of ancestral sampling through a cascade of hidde...
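The ancestral-sampling view of a VAE mentioned in this snippet hinges on the reparameterization trick, which makes the sampling step differentiable. A minimal sketch follows; the function name and NumPy formulation are illustrative assumptions, not from the cited paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).
    Writing the sample as a deterministic function of (mu, log_var) plus
    external noise lets gradients flow through the sampling step, which is
    what makes VAE training by backpropagation possible."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

z = reparameterize(np.zeros(4), np.zeros(4))  # one draw from N(0, I)
```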
Journal
Journal title: Machine Learning and Knowledge Extraction
Year: 2022
ISSN: 2504-4990
DOI: https://doi.org/10.3390/make4020025