Phishing website detection using weighted feature line embedding

Abstract:

The aim of phishing is to capture users' private information without their permission by designing a new website that mimics a trusted one. Information technology specialists do not agree on a single definition of the discriminative features that characterize phishing websites. Therefore, the number of reliable training samples in phishing detection problems is limited. Moreover, the available training samples include abnormal samples that cause classification errors. For instance, some phishing samples may have features similar to those of legitimate samples, and vice versa. A supervised feature extraction method, called weighted feature line embedding, is proposed in this paper to solve these problems. The proposed method virtually generates training samples by utilizing the feature line metric, and can therefore address the small sample size problem. Moreover, by assigning appropriate weights to each pair of feature points, it corrects the undesirable effect of abnormal samples. The features extracted by the method improve the performance of phishing website detection, especially when small training sets are used.
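To illustrate how a feature line metric can generate virtual samples, the following minimal sketch uses the standard nearest-feature-line definition: a query point is projected onto the line through two training points of the same class, and the projection serves as a virtual sample. The function names and the toy data are assumptions made for illustration. The weighting scheme of the proposed method is not specified in the abstract, so it is not shown here.

```python
import math

def project_onto_feature_line(x, x1, x2):
    """Project x onto the line through x1 and x2 (the feature line).

    Hypothetical helper: the returned projection point plays the role
    of a virtual training sample lying between/beyond x1 and x2.
    """
    direction = [b - a for a, b in zip(x1, x2)]
    norm_sq = sum(d * d for d in direction)
    if norm_sq == 0.0:
        # Degenerate pair: both points coincide, so the line is a single point.
        return list(x1)
    # Position parameter of the projection along the line x1 + t * (x2 - x1).
    t = sum((xi - a) * d for xi, a, d in zip(x, x1, direction)) / norm_sq
    return [a + t * d for a, d in zip(x1, direction)]

def feature_line_distance(x, x1, x2):
    """Euclidean distance from x to its projection on the feature line."""
    p = project_onto_feature_line(x, x1, x2)
    return math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, p)))

# Toy two-dimensional feature vectors (made-up values, not real phishing data).
query = [1.0, 1.0]
sample_a = [0.0, 0.0]
sample_b = [2.0, 0.0]

virtual_sample = project_onto_feature_line(query, sample_a, sample_b)
distance = feature_line_distance(query, sample_a, sample_b)
print(virtual_sample, distance)  # [1.0, 0.0] 1.0
```

Because every pair of same-class training points defines its own line, even a small training set yields many candidate virtual samples, which is the intuition behind using this metric when labelled data is scarce.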

Similar resources

An Image-based Feature Extraction Approach for Phishing Website Detection

Phishing website creators and anti-phishing defenders are in an arms race. Cloning a website is fairly easy and can be automated by any junior programmer. Attempting to recognize numerous phishing links posted in the wild e.g. on social media sites or in email is a constant game of escalation. Automated phishing website detection systems need both speed and accuracy to win. We present a new met...

Iterative Construction of Hierarchical Classifiers for Phishing Website Detection

This article is devoted to a new iterative construction of hierarchical classifiers in SimpleCLI for the detection of phishing websites. Our new construction of hierarchical systems creates ensembles of ensembles in SimpleCLI by iteratively linking a top-level ensemble to another middle-level ensemble instead of a base classifier so that the top-level ensemble can generate a large multilevel sy...

Feature Selection for Improved Phishing Detection

Phishing – a hotbed of multibillion dollar underground economy – has become an important cybersecurity problem. The centralized blacklist approach used by most web browsers usually fails to detect zero-day attacks, leaving the ordinary users vulnerable to new phishing schemes; therefore, learning machine based approaches have been implemented for phishing detection. Many existing techniques in ...

Nearest feature line embedding for face hallucination

A new manifold learning method, called nearest feature line (NFL) embedding, for face hallucination is proposed. While many manifold learning based face hallucination algorithms have been proposed in recent years, most of them apply the conventional nearest neighbour metric to derive the subspace and may not effectively characterise the geometrical information of the samples, especially when th...

Phishing Detection Using Neural Network

The goal of this project is to apply multilayer feedforward neural networks to phishing email detection and evaluate the effectiveness of this approach. We design the feature set, process the phishing dataset, and implement the neural network (NN) systems. We then use cross validation to evaluate the performance of NNs with different numbers of hidden units and activation functions. We also com...

Associative Classification Mining for Website Phishing Classification

Website phishing is one of the crucial research topics for the internet community due to the massive number of online daily transactions. The process of predicting the phishing activity for a website is a typical classification problem in data mining where different website’s features such as URL length, prefix and suffix, IP address, etc., are used to discover concealed correlations (knowledg...

Volume 9, Issue 2

Pages 49-61

Publication date: 2017-07-31
