Vapnik-Chervonenkis Dimension
Abstract
Valiant’s theorem from the previous lecture is meaningless for infinite hypothesis classes, or even for classes of more than exponential size. In 1968, Vladimir Vapnik and Alexey Chervonenkis wrote a very original and influential paper (in Russian) [5, 6] which allows us to estimate the sample complexity for infinite hypothesis classes too. The idea is that the size of the hypothesis class is a poor measure of how “complex” or how “expressive” the hypothesis class really is. A better measure is defined, called the VC-dimension (VCD) of a function class. Then, a version of Valiant’s theorem is proved with respect to the VCD of H, which can be finite for many commonly used infinite hypothesis classes H. (More technically, Vapnik and Chervonenkis used the VCD to derive bounds on expected loss given empirical loss; more on this point later.) Roughly speaking, the VC-dimension of a function (i.e., hypothesis) class is the maximum number of data points for which, no matter how we label them (with 0/1), there is always a hypothesis in the class which perfectly explains the labeling. This measure is a much better indicator of a model’s capability than the number of parameters used to describe it. Blumer et al. [1] first brought VCD to the attention of the COLT community. The following snippet from J. Hosking, E. Pednault, and M. Sudan (1997) describes the strength of VC theory well:
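To make the shattering definition above concrete, here is a minimal Python sketch (not part of the original lecture notes): it brute-forces the definition for the class of one-dimensional threshold functions h_t(x) = 1[x >= t], with helper names shatters and threshold_hypotheses chosen here for illustration. A single point can be labeled either way by some threshold, but no threshold labels the left of two points 1 and the right 0, so this class shatters one point but not two, i.e. its VC dimension is 1.

from itertools import product

def shatters(hypotheses, points):
    """True if every 0/1 labeling of `points` is realized by some hypothesis."""
    for labeling in product([0, 1], repeat=len(points)):
        if not any(tuple(h(x) for x in points) == labeling for h in hypotheses):
            return False
    return True

def threshold_hypotheses(points):
    # On a finite sample, a threshold h_t(x) = 1[x >= t] is determined by which
    # points lie at or above t, so thresholds at each sample point (plus one
    # beyond the maximum) cover every behavior the class can show on the sample.
    cuts = sorted(set(points)) + [max(points) + 1.0]
    return [lambda x, t=t: int(x >= t) for t in cuts]

print(shatters(threshold_hypotheses([0.0]), [0.0]))            # True: one point is shattered
print(shatters(threshold_hypotheses([0.0, 1.0]), [0.0, 1.0]))  # False: the labeling (1, 0) is unrealizable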
Similar Resources
Quantifying Generalization in Linearly Weighted Neural Networks
Abstract. The Vapnik-Chervonenkis dimension has proven to be of great use in the theoretical study of generalization in artificial neural networks. The "probably approximately correct" learning framework is described and the importance of the Vapnik-Chervonenkis dimension is illustrated. We then investigate the Vapnik-Chervonenkis dimension of certain types of linearly weighted neural networks...
Sign rank versus Vapnik-Chervonenkis dimension
This work studies the maximum possible sign rank of N × N sign matrices with a given Vapnik-Chervonenkis dimension d. For d = 1, this maximum is three. For d = 2, this maximum is Θ̃(N^{1/2}). For d > 2, similar but slightly less accurate statements hold. The lower bounds improve on previous ones by Ben-David et al., and the upper bounds are novel. The lower bounds are obtained by probabilistic constr...
Error Bounds for Real Function Classes Based on Discretized Vapnik-Chervonenkis Dimensions
The Vapnik-Chervonenkis (VC) dimension plays an important role in statistical learning theory. In this paper, we propose the discretized VC dimension obtained by discretizing the range of a real function class. Then, we point out that Sauer’s Lemma is valid for the discretized VC dimension. We group the real function classes having infinite VC dimension into four categories by using the dis...
VC Dimension of Neural Networks
This paper presents a brief introduction to Vapnik-Chervonenkis (VC) dimension, a quantity which characterizes the difficulty of distribution-independent learning. The paper establishes various elementary results, and discusses how to estimate the VC dimension in several examples of interest in neural network theory.
Publication date: 2011