cross validation error

Model selection for linear classifiers using Bayesian error estimation

Journal: Pattern Recognition, 2015

Heikki Huttunen Jussi Tohka

Regularized linear models are important classification methods for high-dimensional problems, where regularized linear classifiers are often preferred due to their ability to avoid overfitting. The degree of freedom of the model is determined by a regularization parameter, which is typically selected using counting-based approaches, such as K-fold cross-validation. For large data, this can be v...
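
The counting-based selection that the abstract mentions can be sketched as follows. This is only an illustration of K-fold cross-validation over a grid of regularization values, not the Bayesian estimator the paper proposes. The toy 1-D classifier (`fit_threshold`, `predict`) and the names `cv_error` and `select_lambda` are hypothetical and chosen for this example.

```python
import random


def k_fold_splits(n, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


def fit_threshold(xs, ys, lam):
    """Toy 1-D classifier (an assumption for illustration): the midpoint of
    the two class means, shrunk toward 0 by the regularization parameter lam.
    Assumes both classes (+1 and -1) appear in the training data."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    midpoint = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return midpoint / (1.0 + lam)


def predict(threshold, x):
    """Predict +1 above the threshold, -1 otherwise."""
    return 1 if x > threshold else -1


def cv_error(xs, ys, lam, k=5, seed=0):
    """Counting-based estimate: fraction of held-out points misclassified."""
    mistakes = 0
    for train, test in k_fold_splits(len(xs), k, seed):
        t = fit_threshold([xs[i] for i in train], [ys[i] for i in train], lam)
        mistakes += sum(predict(t, xs[i]) != ys[i] for i in test)
    return mistakes / len(xs)


def select_lambda(xs, ys, grid, k=5):
    """Return the grid value with the lowest cross-validation error."""
    return min(grid, key=lambda lam: cv_error(xs, ys, lam, k))
```

Each candidate value requires k model fits, which is the cost that grows with the data size and the grid size.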

An Empirical Study of Univariate and Genetic Algorithm-Based Feature Selection in Binary Classification with Microarray Data

Journal: Cancer Informatics, 2006

Michael Lecocke Kenneth Hess

BACKGROUND We consider both univariate- and multivariate-based feature selection for the problem of binary classification with microarray data. The idea is to determine whether the more sophisticated multivariate approach leads to better misclassification error rates because of the potential to consider jointly significant subsets of genes (but without overfitting the data). METHODS We presen...

Estimating Generalization Error Using Out-of-Bag Estimates

1999

Tom Bylander Dennis Hanzlik

We provide a method for estimating the generalization error of a bag using out-of-bag estimates. In bagging, each predictor (single hypothesis) is learned from a bootstrap sample of the training examples; the output of a bag (a set of predictors) on an example is determined by voting. The out-of-bag estimate is based on recording the votes of each predictor on those training examples omitted fro...
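
A minimal sketch of a plain out-of-bag error estimate, assuming the common formulation in which each training example is scored only by the predictors whose bootstrap sample did not contain it. The abstract is truncated, so this may differ from the authors' exact estimator. The base learner `fit_stump` and the function name `oob_error` are hypothetical, and the stump assumes the +1 class has the larger feature values.

```python
import random


def fit_stump(xs, ys):
    """Toy base learner (an assumption for illustration): a threshold at the
    midpoint of the two class means. Returns None if a class is missing."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    if not pos or not neg:
        return None
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def oob_error(xs, ys, n_predictors=50, seed=0):
    """Estimate the error of the bag from out-of-bag votes only."""
    rng = random.Random(seed)
    n = len(xs)
    votes = [0] * n  # running sum of +1/-1 votes cast while example i was out of bag
    for _ in range(n_predictors):
        # Bootstrap sample: n draws with replacement from the training set.
        sample = [rng.randrange(n) for _ in range(n)]
        t = fit_stump([xs[i] for i in sample], [ys[i] for i in sample])
        if t is None:
            continue  # skip degenerate samples that lack one class
        in_bag = set(sample)
        for i in range(n):
            if i not in in_bag:
                votes[i] += 1 if xs[i] > t else -1
    # Examples with a zero vote sum (tied, or never out of bag) are not scored.
    scored = [i for i in range(n) if votes[i] != 0]
    wrong = sum(1 for i in scored if (1 if votes[i] > 0 else -1) != ys[i])
    return wrong / len(scored) if scored else float("nan")
```

No separate validation set is held out: every example serves as test data for the predictors that never saw it.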

Estimating and Reducing the Error of a Classifier or Predictor

2007

K. Ming Leung

Methods for error estimation, such as holdout, random subsampling, k-fold cross-validation, and bootstrap, are discussed. Also considered are general techniques, such as bagging and boosting, for increasing model accuracy.

A Semi-Analytic Model for Estimating Total Suspended Sediment Concentration in Turbid Coastal Waters of Northern Western Australia Using MODIS-Aqua 250 m Data

Journal: Remote Sensing, 2016

Passang Dorji Peter Fearns Mark Broomhall

Knowledge of the concentration of total suspended sediment (TSS) in coastal waters is important to marine environmental monitoring agencies for determining the turbidity of water, which serves as a proxy for estimating the availability of light at depth for benthic habitats. TSS models applicable to data collected by satellite sensors can be used to determine TSS with reasonable accuracy and of ad...

Detection of schizophrenia patients using convolutional neural networks from brain effective connectivity maps of electroencephalogram signals

Journal: Journal of the Faculty of Medicine, Tehran University of Medical Sciences, 2022

Ahmad Shalbaf, Arash Maghsoudi, Sara Bagherzadeh

Background: Schizophrenia is a mental disorder that severely affects the perception and relations of individuals. Nowadays, this disease is diagnosed by psychiatrists based on psychiatric tests, which is highly dependent on their experience and knowledge. This study aimed to design a fully automated framework for the diagnosis of schizophrenia from electroencephalogram signals using advanced de...

Development and validation of skinfold-thickness prediction equations with a 4-compartment model

2003

Matthew J Peterson Stefan A Czerwinski Roger M Siervogel

Background: Skinfold-thickness measurements are commonly obtained for the indirect assessment of body composition. Objective: We developed new skinfold-thickness equations by using a 4-compartment model as the reference. Additionally, we compared our new equations with the Durnin and Womersley and Jackson and Pollock skinfold-thickness equations to evaluate each equation’s validity and precisio...

Variable data driven bandwidth choice in nonparametric quantile regression

2002

Klaus Abberger

The choice of a smoothing parameter or bandwidth is crucial when applying nonparametric regression estimators. In nonparametric mean regression, various methods for bandwidth selection exist. But in nonparametric quantile regression, bandwidth choice is still an unsolved problem. In this paper a selection procedure for local varying bandwidths based on the asymptotic mean squared error (MSE) of ...

Subspace Information Criterion for Sparse Regressors

2001

Koji Tsuda Masashi Sugiyama

Non-quadratic regularizers, in particular the ℓ1-norm regularizer, can yield sparse solutions that generalize well. In this work we propose the Generalized Subspace Information Criterion (GSIC), which allows the generalization error to be predicted for this useful family of regularizers. We show that under some technical assumptions GSIC is an asymptotically unbiased estimator of the generalization er...

An open-set detection evaluation methodology applied to language and emotion recognition

2007

David A. van Leeuwen Khiet P. Truong

This paper introduces a detection methodology for recognition technologies in speech for which it is difficult to obtain an abundance of non-target classes. An example is language recognition, where we would like to be able to measure the detection capability of a single target language without confounding with the modeling capability of non-target languages. The evaluation framework is based o...
