ADASYN Implementation for Data Imbalance on the UNSW-NB15 Dataset (Implementasi Adasyn Untuk Imbalance Data Pada Dataset UNSW-NB15)

Abstract

In the current era of machine learning, researchers are working hard to develop algorithms that increase the likelihood of correct predictions with better accuracy. Imbalanced data arises when the sample size of one class is far larger than that of another; the minority class can then be treated as noise during classification, which leads to unsatisfactory classification results. This study uses the UNSW-NB15 dataset. After the train and test data are merged, the labels are imbalanced: 164,673 samples carry label 1 and 93,000 carry label 0. The aim is to address this binary-class imbalance with the ADASYN technique and to detect malware attacks in UNSW-NB15 by applying a Random Forest model that achieves sufficiently good performance. In tests of binary-class imbalance handling with ADASYN and Random Forest, together with Optuna hyperparameter tuning for anomaly detection, several train/test splits were evaluated; the 90/10 split gave the highest score, 99.86%. In terms of time, the 60/40 split was fastest at 1.85 seconds.
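The abstract does not say which ADASYN implementation was used, so the following is only a minimal pure-Python sketch of the general ADASYN idea, not the authors' code. The function name `adasyn`, its parameters `k`, `beta` and `seed`, and the uniform-weight fallback are assumptions made for illustration. With the label counts reported above, full balancing (`beta=1.0`) would call for about 164,673 - 93,000 = 71,673 synthetic label-0 samples.

```python
import math
import random


def adasyn(minority, majority, k=5, beta=1.0, seed=0):
    """Return synthetic minority points (lists of floats), ADASYN-style.

    Illustrative sketch only; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    # Number of synthetic points needed to reach the requested balance level.
    total = int((len(majority) - len(minority)) * beta)
    if total <= 0 or len(minority) < 2:
        return []

    # Tag every point with its class: 0 = minority, 1 = majority.
    labelled = [(p, 0) for p in minority] + [(p, 1) for p in majority]

    # Step 1: for each minority point, measure how "hard" it is, as the
    # share of majority points among its k nearest neighbours.
    ratios = []
    for p in minority:
        others = [(math.dist(p, q), lab) for q, lab in labelled if q is not p]
        others.sort(key=lambda pair: pair[0])
        neighbours = others[:k]
        ratios.append(sum(lab for _, lab in neighbours) / len(neighbours))

    weight_sum = sum(ratios)
    if weight_sum == 0:
        # Assumption: if no minority point borders the majority class,
        # spread the synthetic samples uniformly.
        weights = [1 / len(minority)] * len(minority)
    else:
        weights = [r / weight_sum for r in ratios]

    # Step 2: harder points receive more synthetic samples. Each new sample
    # lies on the segment between a minority point and one of its nearest
    # minority neighbours. Rounding means the final count can differ
    # slightly from `total`.
    synthetic = []
    for p, w in zip(minority, weights):
        n_new = round(w * total)
        near = sorted(
            (q for q in minority if q is not p),
            key=lambda q: math.dist(p, q),
        )[:k]
        for _ in range(n_new):
            q = rng.choice(near)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(p, q)])
    return synthetic
```

In a pipeline like the one the abstract describes, the synthetic points would be appended to the training portion only, after the train/test split, so that the test set still reflects the original class distribution; whether the study did this is not stated in the abstract.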

Similar resources

A Two-Stage Classifier Approach using RepTree Algorithm for Network Intrusion Detection

In this paper, we present a two-stage classifier based on the RepTree algorithm and protocol subsets for network intrusion detection systems. To evaluate the performance of our approach, we used the UNSW-NB15 data set and the NSL-KDD data set. In the first phase, our approach divides the incoming network traffic into three types of protocols (TCP, UDP or Other), then classifies it as normal or anomalous. In s...

A Survey on Methods to Handle Imbalance Dataset

Imbalanced data sets, a problem often found in real-world applications, can have a seriously negative effect on the classification performance of machine learning algorithms. There have been many attempts at dealing with the classification of imbalanced data sets. One way to handle the problem is to rebalance the data artificially by oversampling and/or undersampling.

UNSW at GeoCLEF 2006

This paper describes our participation in the GeoCLEF monolingual English task of the Cross Language Evaluation Forum 2006. Our retrieval system consists of four modules: the geographic knowledge base; the indexing module; the document retrieval module and the ranking module. The geographic knowledge base provides information about important geographic entities around the world and relationship...

Handling class imbalance problem in miRNA dataset associated with cancer

MiRNAs are small (~22 nt long) non-coding RNA sequences that bind to complementary target sites in the 3' Untranslated Region (UTR) of mRNA sequences, although binding is not restricted to that region and also occurs in other mRNA regions, viz. the 5' UTR and coding sequences (CDS). Complementary binding of miRNA to mRNA target sites either results in complete degradation of the mRNA itself or it may regulate the mRNA as an oncogene or as a tumo...

Aquarius Firmware for UNSW Namuru GPS Receivers

UNSW is well known for its work in the development of FPGA based GPS receivers. However, a hardware platform without suitable firmware severely limits the application of that hardware and for the Namuru, the only available firmware has been the GPS Architect and the GPL-GPS. Unfortunately, both of these firmware suites have their limitations and it is for this reason that the Aquarius firmware ...

Journal

Journal title: Jurnal Computer Science and Information Technology

Year: 2022

ISSN: 2723-567X, 2723-5661

DOI: https://doi.org/10.37859/coscitech.v3i3.4339