Validation of Two-Sample Bootstrap in ROC Analysis on Large Datasets Using AURC

Authors

  • Jin Chu Wu
  • Alvin F. Martin
  • Raghu N. Kacker
Abstract

Sampling variability results in uncertainties of measures. The nonparametric two-sample bootstrap method has been used to compute uncertainties of measures in receiver operating characteristic (ROC) analysis on large datasets, for example the standard error (SE) of the equal error rate in biometrics and the SE of a detection cost function in speaker recognition evaluation. In particular, the SE of the area under the ROC curve (AURC) can be computed analytically using the Mann-Whitney statistic. It can also be calculated using the nonparametric two-sample bootstrap method. The analytical result can therefore be treated as ground truth. The relative errors of the bootstrap results with respect to the analytical results were examined for different matching algorithms and were quite small. Hence, this validates the nonparametric two-sample bootstrap method applied in ROC analysis on large datasets.

Index Terms: ROC analysis, bootstrap, area under ROC curve, uncertainty, standard error, biometrics, speaker recognition.

* Tel: + 301-975-6996; fax: + 301-975-5287.
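The comparison described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`pairwise_kernel`, `aurc_and_analytic_se`, `bootstrap_se`) are hypothetical, the scores are synthetic normal draws rather than real matcher output, and the analytic SE uses the DeLong-style structural-component variance of the Mann-Whitney statistic, which may differ from the exact formula used in the paper.

```python
import numpy as np

def pairwise_kernel(genuine, impostor):
    """Mann-Whitney kernel: 1 if genuine > impostor, 0.5 if tied, else 0."""
    g = np.asarray(genuine, dtype=float)[:, None]
    x = np.asarray(impostor, dtype=float)[None, :]
    return (g > x) + 0.5 * (g == x)

def aurc_and_analytic_se(genuine, impostor):
    """AURC as the Mann-Whitney statistic, with a DeLong-style analytic SE."""
    k = pairwise_kernel(genuine, impostor)
    m, n = k.shape
    v10 = k.mean(axis=1)  # one component per genuine score
    v01 = k.mean(axis=0)  # one component per impostor score
    var = v10.var(ddof=1) / m + v01.var(ddof=1) / n
    return k.mean(), np.sqrt(var)

def bootstrap_se(genuine, impostor, n_boot=500, seed=0):
    """Two-sample bootstrap: resample each score set independently."""
    rng = np.random.default_rng(seed)
    k = pairwise_kernel(genuine, impostor)
    m, n = k.shape
    reps = np.empty(n_boot)
    for b in range(n_boot):
        gi = rng.integers(0, m, size=m)  # genuine indices, with replacement
        xi = rng.integers(0, n, size=n)  # impostor indices, with replacement
        reps[b] = k[np.ix_(gi, xi)].mean()
    return reps.std(ddof=1)

# Synthetic scores (an assumption for illustration only).
demo_rng = np.random.default_rng(42)
demo_genuine = demo_rng.normal(1.0, 1.0, size=200)
demo_impostor = demo_rng.normal(0.0, 1.0, size=300)

demo_auc, demo_se_analytic = aurc_and_analytic_se(demo_genuine, demo_impostor)
demo_se_boot = bootstrap_se(demo_genuine, demo_impostor)
demo_rel_err = (demo_se_boot - demo_se_analytic) / demo_se_analytic
print(f"AURC={demo_auc:.4f}  SE(analytic)={demo_se_analytic:.5f}  "
      f"SE(bootstrap)={demo_se_boot:.5f}  relative error={demo_rel_err:+.2%}")
```

The sketch builds the full genuine-by-impostor comparison matrix, which is simple but uses memory proportional to the product of the two sample sizes. On datasets of the size the abstract refers to, a rank-based computation of the Mann-Whitney statistic would likely be needed instead.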

Related articles

Further Studies of Bootstrap Variability for ROC Analysis on Large Datasets

The nonparametric two-sample bootstrap is successfully applied to computing the measurement uncertainties in receiver operating characteristic (ROC) analysis on large datasets in areas such as biometrics, speaker recognition system, etc. To determine the number of bootstrap replications in our applications, the bootstrap variability related to standard error and two bounds of 95 % confidence in...

Measures, Uncertainties, and Significance Test in Operational ROC Analysis

In receiver operating characteristic (ROC) analysis, the sampling variability can result in uncertainties of performance measures. Thus, while evaluating and comparing the performances of algorithms, the measurement uncertainties must be taken into account. The key issue is how to calculate the uncertainties of performance measures in ROC analysis. Our ultimate goal is to perform the significan...

The Effect of Observation Data Sampling Methods on Infiltration Areas by Maximum Entropy Model

Statistical modeling methods are based on multivariate regression methods and require the presence and absence location of data for the construction of the model. In most cases, there is no trustworthy absence data. Therefore, other methods that are based only on the presence of the phenomenon are used. Considering the importance of modeling - saving time and cost and the probable prediction of...

Studies of Operational Measurement of ROC Curve on Large Fingerprint Data Sets Using Two-Sample Bootstrap

From the operational perspective, on large fingerprint data sets, a receiver operating characteristic (ROC) curve is usually measured by the true accept rate (TAR) of the genuine scores given a specified false accept rate (FAR) of the impostor scores. The ties of genuine and/or impostor scores at a threshold can often occur on large fingerprint data sets, and how to determine the TAR at an oper...

Hypothesis Test of Fingerprint-Image Matching Algorithms in Operational ROC Analysis

To evaluate the performance of fingerprint-image matching algorithms on large datasets, a receiver operating characteristic (ROC) curve is applied. From the operational perspective, the true accept rate (TAR) of the genuine scores at a specified false accept rate (FAR) of the impostor scores is usually employed. And the equal error rate (EER) can also be used. The accuracies of the measurement ...

Publication date: 2011