Balancing Between Bagging and Bumping
Abstract
We compare different methods to combine predictions from neural networks trained on different bootstrap samples of a regression problem. One of these methods, introduced in [6] and here called balancing, is based on a decomposition of the ensemble generalization error into an ambiguity term and a term incorporating the generalization performances of the individual networks. We show how to estimate these individual errors from the residuals on validation patterns. Weighting factors for the different networks then follow from a quadratic programming problem. On a real-world problem concerning the prediction of sales figures and on the well-known Boston housing data set, balancing clearly outperforms other recently proposed alternatives such as bagging [1] and bumping [8].

1 EARLY STOPPING AND BOOTSTRAPPING

Stopped training is a popular strategy to prevent overfitting in neural networks. The complete data set is split into a training set and a validation set. During learning, the weights are adapted to minimize the error on the training data; training is stopped when the error on the validation data starts increasing. The final network depends on the particular subdivision into training and validation sets, and often also on the (usually random) initial weight configuration and the chosen minimization procedure. In other words, early-stopped neural networks are highly unstable: small changes in the data or different initial conditions can produce large changes in the estimate. As argued in [1, 8], with unstable estimators it is advisable to resample, i.e., to apply the same procedure several times using different subdivisions into training and validation sets, perhaps starting from different initial configurations. In the neural network literature resampling is often referred to as training ensembles of neural networks [3, 6].
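The early-stopping procedure described above can be sketched in a few lines. The following is a minimal toy illustration, not the paper's implementation: a polynomial model stands in for the neural network, and all names (`train_early_stopping`, `patience`, the degree and learning rate) are illustrative assumptions. The essential pattern is the one the text describes: monitor the validation error during training and keep the weights at its minimum.

```python
import random

def make_data(n, seed=0):
    """Toy 1-d regression: noisy samples of a smooth target."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [x ** 3 - 0.5 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys

def predict(w, x):
    """Polynomial model w[0] + w[1]*x + w[2]*x^2 + ... as a stand-in
    for a neural network."""
    return sum(wk * x ** k for k, wk in enumerate(w))

def mse(w, xs, ys):
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_early_stopping(train, val, degree=7, lr=0.05,
                         max_epochs=2000, patience=50):
    """Gradient descent on the training set; keep the weights with the
    lowest validation error and stop once it has not improved for
    `patience` consecutive epochs."""
    xs, ys = train
    w = [0.0] * (degree + 1)
    best_w, best_val, since_best = list(w), mse(w, *val), 0
    for _ in range(max_epochs):
        # full-batch gradient of the training MSE
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = predict(w, x) - y
            for k in range(len(w)):
                grad[k] += 2.0 * err * x ** k / len(xs)
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
        v = mse(w, *val)
        if v < best_val:
            best_w, best_val, since_best = list(w), v, 0
        else:
            since_best += 1
            if since_best >= patience:
                break  # validation error stopped improving
    return best_w, best_val
```

Because the returned weights are those with the lowest validation error seen so far, a different split into training and validation sets generally yields a different stopping point and hence a different network, which is exactly the instability that motivates resampling.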
In this paper, we will discuss methods for combining the outputs of networks obtained through such a repetitive procedure. First, however, we have to choose how to generate the subdivisions into training and validation sets. Options are, among others, k-fold cross-validation, subsampling and bootstrapping. In this paper we will consider bootstrapping [2], which is based on the idea that the available data set is nothing but a particular realization of some probability distribution. In principle, one would like to do inference on this "true" yet unknown probability distribution. A natural thing to do is then to define an empirical distribution. With so-called naive bootstrapping the empirical distribution is a sum of delta peaks on the available data points, each with probability content 1/p_data, with p_data the number of patterns. A bootstrap sample is a collection of p_data patterns drawn with replacement from this empirical probability distribution. Some of the data points will occur once, some twice and some even more than twice in this bootstrap sample. The bootstrap sample is taken to be the training set; all patterns that do not occur in a particular bootstrap sample constitute the validation set. For large p_data, the probability that a pattern becomes part of the validation set is (1 - 1/p_data)^{p_data} ≈ 1/e ≈ 0.368. An advantage of bootstrapping over other resampling techniques is that most statistical theory on resampling is nowadays based on the bootstrap. Using naive bootstrapping we generate n_run training and validation sets out of our complete data set of p_data input-output combinations {x^μ, t^μ}. In this paper we will restrict ourselves to regression problems with, for notational convenience, just one output variable. We keep track of a matrix with components q_i^μ indicating whether pattern μ is part of the validation set for run i (q_i^μ = 1) or of the training set (q_i^μ = 0).
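The construction of the q matrix can be sketched as follows. This is a minimal illustration assuming the naive bootstrap described above; the function name `bootstrap_masks` and its parameters are our own, not from the paper. It also lets one check empirically that roughly a fraction 1/e ≈ 0.368 of the patterns ends up in each validation set.

```python
import random

def bootstrap_masks(p_data, n_run, seed=0):
    """Naive bootstrap: for each run, draw p_data patterns with
    replacement; q[i][mu] = 1 if pattern mu never occurs in the
    bootstrap sample for run i (validation), else 0 (training)."""
    rng = random.Random(seed)
    q = []
    for _ in range(n_run):
        sampled = set(rng.randrange(p_data) for _ in range(p_data))
        q.append([0 if mu in sampled else 1 for mu in range(p_data)])
    return q

q = bootstrap_masks(p_data=500, n_run=200, seed=42)
# fraction of patterns landing in the validation set, averaged over runs;
# for large p_data this approaches 1/e ≈ 0.368
frac = sum(sum(row) for row in q) / (200 * 500)
```

Each row of q then defines one training/validation subdivision on which a network is trained and early-stopped.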
On each subdivision we train and stop a neural network with one layer of n_hidden hidden units. The output o_i^μ of network i, with weight vector w(i), on input x^μ then follows from the usual feedforward equations.
Similar Resources
Bagging and Bumping Self Organising Maps
In this paper, we apply bagging, a combination method that has been developed in the context of supervised learning of classifiers and regressors, to the unsupervised artificial neural network known as the Self Organising Map. We show that various initialisation techniques can be used to create maps that humans can compare by eye. We then use a semi-supervised version of the SOM to c...
Investigating the Effect of Underlying Fabric on the Bagging Behaviour of Denim Fabrics (RESEARCH NOTE)
Underlying fabrics can change the appearance, function and quality of a garment, and can also add considerably to its longevity. Nowadays, with the increasing use of various types of fabrics in the garment industry, their resistance to bagging is of great importance for determining the effectiveness of textiles under various forces. The current paper investigated the effect of unde...
Performance of Porous Pavement Containing Different Types of Pozzolans
Bumping Windows between Monitors
Users move from single to multiple monitors so that they can use more screen real estate. This increase enables them to keep a greater number of windows open and visible at the same time. But there is a cost: arranging a window takes far longer, since there is more screen space to traverse and more relationships between windows to take into account. To address this we added 'bumping' to an appl...
A Robust Utility Learning Framework via Inverse Optimization
In many smart infrastructure applications, flexibility in achieving sustainability goals can be gained by engaging end users. However, these users often have heterogeneous preferences that are unknown to the decision maker tasked with improving operational efficiency. Modeling user interaction as a continuous game between noncooperative players, we propose a robust parametric utility learning f...
Publication date: 1996