An Efficient Two-phase Spam Filtering Method Based on E-mails Categorization
Author
Abstract
The header section of an e-mail usually contains important attributes, such as the e-mail title, the sender's name, the sender's e-mail address, and the sending date, which are helpful for classifying e-mails. In this paper, we apply the decision tree data mining technique to the header's basic attributes to analyze the association rules of spam e-mails, and we propose an efficient spam filtering method that accurately identifies spam and legitimate e-mails. In an experiment applying our spam filtering method to numerous Chinese e-mails, we obtain the following results: the Accuracy is 96.5%, the Precision is 96.67%, and the Recall is 96.3%. Thus, the method proposed in this paper can efficiently identify spam e-mails by checking only the header sections, which reduces the computational cost.
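The two ingredients named in the abstract, reading basic header attributes and scoring a filter by Accuracy, Precision, and Recall, can be sketched as follows. This is a minimal illustration using Python's standard `email` package, not the authors' implementation. The function names `header_features` and `evaluate` and the derived `sent_hour` field are assumptions made for this example. The decision tree itself is not reproduced, since the abstract does not give its rules.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def header_features(raw_email):
    """Extract the basic header attributes named in the abstract.

    Hypothetical helper: the paper's exact feature set is not given here.
    """
    msg = message_from_string(raw_email)
    sender_name, sender_address = parseaddr(msg.get("From", ""))
    try:
        sent_hour = parsedate_to_datetime(msg.get("Date", "")).hour
    except (TypeError, ValueError):
        sent_hour = None  # missing or malformed Date header
    return {
        "title": msg.get("Subject", ""),
        "sender_name": sender_name,
        "sender_address": sender_address,
        "sent_hour": sent_hour,
    }


def evaluate(predicted, actual):
    """Return (accuracy, precision, recall), treating True as spam."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)          # spam flagged as spam
    fp = sum(p and not a for p, a in pairs)      # legitimate flagged as spam
    fn = sum(not p and a for p, a in pairs)      # spam that slipped through
    tn = sum(not p and not a for p, a in pairs)  # legitimate passed
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```

A classifier trained on dictionaries like the one returned by `header_features` would produce the `predicted` labels passed to `evaluate`. Only the header is parsed, which is the source of the cost saving the abstract describes.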
Similar resources
Spam Filtering Based On The Analysis Of Text Information Embedded Into Images
In recent years anti-spam filters have become necessary tools for Internet service providers to face up to the continuously growing spam phenomenon. Current server-side anti-spam filters are made up of several modules aimed at detecting different features of spam e-mails. In particular, text categorisation techniques have been investigated by researchers for the design of modules for the analys...
Image spam filtering using textual and visual information
In this paper we focus on the so-called image spam, which consists in embedding the spam message into images attached to e-mails to circumvent statistical techniques based on the analysis of body text of e-mails (like the “bayesian filters”), and in applying content obscuring techniques to such images to make them unreadable by standard OCR systems without compromising human readability. We arg...
Filtering Spam by Using Factors Hyperbolic Trees
Most current anti-spam techniques, like the Bayesian anti-spam algorithm, primarily use lexical matching to filter unsolicited bulk e-mails (UBE) and unsolicited commercial e-mails (UCE). However, the precision of spam filtering is usually low when lexical matching algorithms are used in real dynamic environments. For example, an E-mail of refrigerator advertisements is useful for most f...
Using cellular automata for improving knn based spam filtering
With the rapid growth of the Internet, electronic mail (e-mail) has become a popular communication tool. However, junk mail, also known as spam, has increasingly become a part of life for users as well as Internet service providers. To address this problem, many solutions have been proposed in the last decade. Currently, content-based anti-spam filtering methods are an important issue; the...
Personalized E-mail Filtering System Based on Usage Control
To cope with the problem of soaring spam, a personalized e-mail filtering method based on UCON is proposed. E-mails from different senders were classified as junk e-mail, suspicious e-mail, and normal e-mail by a trusted third party according to the maintained blacklist and embedded machine learning technology online. Suspicious e-mails will be classified further from users’ point of view ma...
Journal title:
- I. J. Network Security
Volume 9, Issue
Pages -
Publication date: 2009