SMOTE for high-dimensional class-imbalanced data
Similar resources
Class-imbalanced classifiers for high-dimensional data
A class-imbalanced classifier is a decision rule to predict the class membership of new samples from an available data set where the class sizes differ considerably. When the class sizes are very different, most standard classification algorithms may favor the larger (majority) class, resulting in poor accuracy in the minority class prediction. A class-imbalanced classifier typically modifies a ...
Software Defect Prediction for High-Dimensional and Class-Imbalanced Data
Software quality and reliability can be improved using various techniques during the software development process. One effective method is to utilize software metrics and defect data collected during the software development life cycle and build defect predictors using data mining techniques to estimate the quality of target program modules. Such a strategy allows practitioners to intelligently...
Conversion of Imbalanced Data Into A Stream Using SMOTE Algorithm
Machine learning approaches become especially important when the distribution of the data is unknown. Classifying data from a data set is problematic when the distribution of the data is unknown. Characterization of raw data relates to whether the data can take on only discrete values or whether the data is continuous. In real-world applications, data drawn from a non-stationary distribution causes the pr...
Feature selection for high-dimensional class-imbalanced data sets using Support Vector Machines
Feature selection and classification of imbalanced data sets are two of the most interesting machine learning challenges, attracting growing attention from both industry and academia. Feature selection addresses the dimensionality reduction problem by determining a subset of available features to build a good model for classification or prediction, while the class-imbalance problem arises wh...
Possible explanation on the effect of variable selection on PAM used with SMOTE
In our simulation studies with high-dimensional class-imbalanced data, we observed that under the null case SMOTE had hardly any effect on classification with PAM when all p = 1000 simulated variables were considered. On the other hand, if only a subset of the variables was used (G = 40), SMOTE seemed beneficial in reducing the class-imbalance problem of PAM, decreasing the number of sampl...
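For readers unfamiliar with the oversampling step these entries refer to, the following is a minimal pure-Python sketch of SMOTE-style interpolation. It is a simplified variant, not the code used in any of the listed papers: it picks base samples at random instead of generating a fixed number of synthetic points per minority sample, and the function name `smote`, its parameters, and the toy data are illustrative assumptions.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples.

    minority: list of equal-length lists of floats (minority class only).
    k: number of nearest minority neighbors considered per base sample.
    Hypothetical helper for illustration; simplified from the usual scheme.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a base minority sample at random.
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by Euclidean distance to x.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        # Choose one of the k nearest neighbors at random.
        neighbor = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment x -> neighbor.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbor)])
    return synthetic


# Toy minority class: the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_synthetic=6, k=2)
```

Every synthetic point lies on a segment joining two existing minority samples, so the sketch adds no information outside the span of the observed minority data.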
Journal
Journal title: BMC Bioinformatics
Year: 2013
ISSN: 1471-2105
DOI: 10.1186/1471-2105-14-106