SMOTE vs. Random Undersampling for Imbalanced Data - Car Ownership Demand Model
Abstract
Because the number of cars reflects each person's travel behavior in a specific location, the car ownership demand model plays a dominant role in analysis aimed at understanding an area's individual and household behaviors. However, the data from the study for the Khon Kaen expressway master plan project were imbalanced; namely, the majority class and the minority class were not equal in size. Before developing the machine learning model, this study suggested balancing the data with oversampling and undersampling techniques. The data improved with SMOTE (Synthetic Minority Oversampling Technique) and combined with kNN (k-nearest neighbors, k = 5) performed better than the other algorithms studied. The TPR (true positive rate) for rural and suburban areas, two types of regions with very different imbalance ratios, was 46.9% and 46.4%, respectively, before balancing. After balancing, the values were 63.5% and 54.4%, respectively.
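The two balancing strategies and the TPR metric named in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the toy 2-D points, and the use of k as SMOTE's neighbor count (in the abstract, k = 5 may instead refer to the kNN classifier). It is not the authors' implementation and does not reproduce their data or results.

```python
import math
import random


def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic minority points by interpolating between neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = rng or random.Random(0)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def random_undersample(majority, n_keep, rng=None):
    """Keep a random subset of the majority class, without replacement."""
    rng = rng or random.Random(0)
    return rng.sample(majority, n_keep)


def true_positive_rate(y_true, y_pred, positive=1):
    """TPR = TP / (TP + FN), computed for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if tp + fn else 0.0


# Hypothetical toy data: 40 majority points and 6 minority points.
majority = [(float(i), float(i % 7)) for i in range(40)]
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (0.2, 0.8)]

# Oversampling grows the minority class up to the majority size.
oversampled = minority + smote(minority, len(majority) - len(minority))
# Undersampling shrinks the majority class down to the minority size.
undersampled = random_undersample(majority, len(minority))
```

Oversampling keeps every original observation and adds interpolated ones, while undersampling discards majority observations; comparing TPR before and after either step is one way to check whether the minority class is recognized more often.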
Similar resources
Addressing data complexity for imbalanced data sets: analysis of SMOTE-based oversampling and evolutionary undersampling
In the classification framework there are problems in which the number of examples per class is not equitably distributed, formerly known as imbalanced data sets. This situation is a handicap when trying to identify the minority classes, as the learning algorithms are not usually adapted to such characteristics. An usual approach to deal with the problem of imbalanced data sets is the use of a ...
Conversion of Imbalanced Data Into A Stream Using SMOTE Algorithm
Machine learning approach has got major importance when distribution of data is unknown. Classification of data from the data set causes some problem when distribution of data is unknown. Characterization of raw data relates to whether the data can take on only discrete values or whether the data is continuous. In real world application data drawn from non-stationary distribution, causes the pr...
Computer-Aided Lung Nodule Recognition by SVM Classifier Based on Combination of Random Undersampling and SMOTE
In lung cancer computer-aided detection/diagnosis (CAD) systems, classification of regions of interest (ROI) is often used to detect/diagnose lung nodule accurately. However, problems of unbalanced datasets often have detrimental effects on the performance of classification. In this paper, both minority and majority classes are resampled to increase the generalization ability. We propose a nove...
Oversampling for Imbalanced Learning Based on K-Means and SMOTE
Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to tackle this problem, methods which generate artificial data to achieve a balanced class distribution are more versatile than modifications to the classification a...
Uncalibrated Distortions vs Undersampling
In a recent paper of ours [Hess & Field (1993). Vision Research, 33, 2663-2670], we claim that there was a predictable relationship between position errors and contrast errors for an undersampled system. In this paper we re-state our main points. We feel that the response to that paper by Levi and Klein in the accompanying article does not require us to produce changes in our original position....
Journal
Journal title: Komunikácie
Year: 2022
ISSN: 1335-4205
DOI: https://doi.org/10.26552/com.c.2022.3.d105-d115