Asymptotic properties of the misclassification rates for Euclidean Distance Discriminant rule in high-dimensional data
Authors
Abstract
Performance accuracy of the Euclidean Distance Discriminant rule (EDDR) is studied in a high-dimensional asymptotic framework that allows the dimensionality to exceed the sample size. Under mild assumptions on the traces of the covariance matrix, our new results provide the asymptotic distribution of the conditional misclassification error and an explicit expression for a consistent and asymptotically unbiased estimator of the expected misclassification error. To obtain these properties, new results on the asymptotic normality of quadratic forms and of traces of higher powers of the Wishart matrix are established. Using our asymptotic results, we further develop two generic methods for determining a cut-off point for EDDR to adjust the misclassification errors. Finally, we numerically confirm the high accuracy of our asymptotic findings, along with the cut-off determination methods, in finite-sample applications, including both large-sample and high-dimensional scenarios.
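As a plain illustration of the classifier the abstract refers to (not the paper's estimators or asymptotics), the Euclidean Distance Discriminant rule assigns an observation to the class whose sample mean is nearest in squared Euclidean distance, optionally shifted by a cut-off point. The function name, arguments, and the convention that the cut-off shifts the distance difference are assumptions made here for illustration:

```python
import numpy as np

def eddr_classify(x, mean1, mean2, cutoff=0.0):
    """Euclidean Distance Discriminant rule (sketch).

    Assigns x to class 1 when the squared Euclidean distance to the
    class-1 sample mean, minus that to the class-2 mean, falls below
    the cut-off; with cutoff=0 this is the nearest-mean rule.
    """
    d = np.sum((x - mean1) ** 2) - np.sum((x - mean2) ** 2)
    return 1 if d < cutoff else 2
```

With `cutoff=0.0` this reduces to the ordinary nearest-centroid rule; the paper's contribution concerns how to choose the cut-off so as to control the two misclassification errors, which this sketch does not attempt.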
Similar articles
A Variable Selection Criterion for Linear Discriminant Rule and its Optimality in High Dimensional Setting
In this paper, we suggest a new variable selection procedure, called MEC, for the linear discriminant rule in the high-dimensional setup. MEC is derived as a second-order unbiased estimator of the misclassification error probability of the linear discriminant rule. It is shown that MEC not only decomposes into 'fitting' and 'penalty' terms like AIC and Mallows' Cp, but also possesses an asymptotic...
A ROAD to Classification in High Dimensional Space.
For high-dimensional classification, it is well known that naively performing the Fisher discriminant rule leads to poor results due to diverging spectra and noise accumulation. Therefore, researchers proposed independence rules to circumvent the diverging spectra, and sparse independence rules to mitigate the issue of noise accumulation. However, in biological applications, there are often a g...
2D Dimensionality Reduction Methods without Loss
In this paper, several two-dimensional extensions of principal component analysis (PCA) and linear discriminant analysis (LDA) techniques have been applied in a lossless dimensionality reduction framework for a face recognition application. In this framework, the benefits of dimensionality reduction were used to improve the performance of its predictive model, which was a support vector machine (...
Sparse Quadratic Discriminant Analysis For High Dimensional Data
Many contemporary studies involve the classification of a subject into two classes based on n observations of the p variables associated with the subject. Under the assumption that the variables are normally distributed, the well-known linear discriminant analysis (LDA) assumes a common covariance matrix over the two classes while the quadratic discriminant analysis (QDA) allows different covar...
Analysis of the consistency of a mixed integer programming-based multi-category constrained discriminant model
Classification is concerned with the development of rules for the allocation of observations to groups, and is a fundamental problem in machine learning. Much of previous work on classification models investigates two-group discrimination. Multi-category classification is less-often considered due to the tendency of generalizations of two-group models to produce misclassification rates that are...
Journal:
J. Multivariate Analysis
Volume 140, Issue -
Pages -
Publication date: 2015