Search results for: smote

Number of results: 650

2014
William A. Rivera, Amit Goel, Peter Kincaid

Real-world data sets often contain disproportionate sample sizes of observed groups, making the task of prediction algorithms very difficult. One of the many ways to combat inherent bias from class-imbalanced data is to perform re-sampling. In this paper we discuss two popular re-sampling approaches proposed in the literature, Synthetic Minority Over-sampling Technique (SMOTE) and Propensity Score Mat...

2007
Cristiane Neri Nobre, J. Miguel Ortega, Antônio de Pádua Braga

An important task in the area of gene discovery is the correct prediction of the translation initiation site (TIS). The TIS can correspond to the first AUG, but this is not always the case. This task can be modeled as a classification problem between positive (TIS) and negative patterns. Here we have used Support Vector Machine working with data processed by the class balancing method called Sm...

2013
Luís Torgo, Rita P. Ribeiro, Bernhard Pfahringer, Paula Branco

Several real-world prediction problems involve forecasting rare values of a target variable. When this variable is nominal we have a problem of class imbalance that has already been studied thoroughly within machine learning. For regression tasks, where the target variable is continuous, few works exist addressing this type of problem. Still, important application areas involve forecasting rare extr...

Journal: Neurocomputing 2021

One of the main goals of Big Data research is to find new data mining methods that are able to process large amounts of data in acceptable times. In Big Data classification, as in traditional classification, class imbalance is a common problem that must be addressed, in this case also looking for a solution that can be applied in an acceptable execution time. In this paper we present Approx-SMOTE, a parallel implementation of the SMOTE algorithm for the Apache Spark framework. The key differen...

2013

In our simulation studies with high-dimensional class-imbalanced data we observed that under the null case SMOTE had hardly any effect on classification with PAM when all the p = 1000 simulated variables were considered. On the other hand, if only a subset of the variables was used (G = 40), SMOTE seemed beneficial in reducing the class-imbalance problem of PAM, decreasing the number of sampl...

2013
Kung-Jeng Wang, Bunjira Makond, Kung-Min Wang

BACKGROUND: Breast cancer is one of the most critical cancers and is a major cause of cancer death among women. It is essential to know the survivability of patients in order to ease the decision-making process regarding medical treatment and financial preparation. Recently, the breast cancer data sets have been imbalanced (i.e., the number of surviving patients outnumbers the number of non-s...

Journal: Artif. Intell. Research 2017
Chun Gui

Class-imbalanced datasets are common in the mobile Internet industry. We tested three feature selection techniques: Random Forest (RF), Relative Weight (RW) and Standardized Regression Coefficients (SRC); three balancing methods: over-sampling (OS), under-sampling (US) and synthetic minority over-sampling (SMOTE); and a widely used classification method, RF. The combined model...

Journal: IJDATS 2008
Dudyala Anil Kumar, Vadlamani Ravi

In this paper, we address customer credit card churn prediction via data mining. We developed an ensemble system incorporating majority voting and involving Multilayer Perceptron (MLP), Logistic Regression (LR), decision trees (J48), Random Forest (RF), Radial Basis Function (RBF) network and Support Vector Machine (SVM) as the constituents. The dataset was taken from the Business Intelligenc...

2018
Sima Sharifirad, Azra Nazari, Mehdi Ghatee

SMOTE is an oversampling technique for balancing datasets and is treated as a pre-processing step for learning algorithms. In this paper, four new enhanced SMOTE variants are proposed that include an improved version of KNN in which the attribute weights are first defined by mutual information and then replaced by maximum entropy, Renyi entropy and Tsallis entropy. These fou...

2017
Rafet Sifa, Julian Runge, Christian Bauckhage, Daniel Klapper

In non-contractual freemium and sharing economy settings, a small share of users often drives the largest part of revenue for firms and co-finances the free provision of the product or service to a large number of users. Successfully retaining and upselling such high-value users can be crucial to firms' survival. Predictions of customers' Lifetime Value (LTV) are a much-used tool to identify hi...
