Large Margin Deep Networks for Classification
نویسندگان
چکیده
We present a formulation of deep learning that aims at producing a large margin classifier. The notion of margin, minimum distance to a decision boundary, has served as the foundation of several theoretically profound and empirically successful results for both classification and regression tasks. However, most large margin algorithms are applicable only to shallow models with a preset feature representation; and conventional margin methods for neural networks only enforce margin at the output layer. Such methods are therefore not well suited for deep networks. In this work, we propose a novel loss function to impose a margin on any chosen set of layers of a deep network (including input and hidden layers). Our formulation allows choosing any lp norm (p ≥ 1) on the metric measuring the margin. We demonstrate that the decision boundary obtained by our loss has nice properties compared to standard classification loss functions. Specifically, we show improved empirical results on the MNIST, CIFAR-10 and ImageNet datasets on multiple tasks: generalization from small training sets, corrupted labels, and robustness against adversarial perturbations. The resulting loss is general and complementary to existing data augmentation (such as random/adversarial input transform) and regularization techniques (such as weight decay, dropout, and batch norm).
منابع مشابه
Large-Margin Classification in Infinite Neural Networks
We introduce a new family of positive-definite kernels for large margin classification in support vector machines (SVMs). These kernels mimic the computation in large neural networks with one layer of hidden units. We also show how to derive new kernels, by recursive composition, that may be viewed as mapping their inputs through a series of nonlinear feature spaces. These recursively derived k...
متن کاملLarge Margin Deep Neural Networks: Theory and Algorithms
Deep neural networks (DNN) have achieved huge practical success in recent years. However, its theoretical properties (in particular generalization ability) are not yet very clear, since existing error bounds for neural networks cannot be directly used to explain the statistical behaviors of practically adopted DNN models (which are multi-class in their nature and may contain convolutional layer...
متن کاملSoft-Margin Softmax for Deep Classification
In deep classification, the softmax loss (Softmax) is arguably one of the most commonly used components to train deep convolutional neural networks (CNNs). However, such a widely used loss is limited due to its lack of encouraging the discriminability of features. Recently, the large-margin softmax loss (L-Softmax [14]) is proposed to explicitly enhance the feature discrimination, with hard mar...
متن کاملTheory of Deep Learning III: the non-overfitting puzzle
A main puzzle of deep networks revolves around the apparent absence of overfitting intended as robustness of the expected error against overparametrization, despite the large capacity demonstrated by zero training error on randomly labeled data. In this note, we show that the dynamics associated to gradient descent minimization of nonlinear networks is topologically equivalent, near the asympto...
متن کاملPorosity classification from thin sections using image analysis and neural networks including shallow and deep learning in Jahrum formation
The porosity within a reservoir rock is a basic parameter for the reservoir characterization. The present paper introduces two intelligent models for identification of the porosity types using image analysis. For this aim, firstly, thirteen geometrical parameters of pores of each image were extracted using the image analysis techniques. The extracted features and their corresponding pore types ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2018