An Efficient Two-phase Spam Filtering Method Based on E-mails Categorization

Author

  • Jyh-Jian Sheu
Abstract

An e-mail’s header section usually contains important attributes, such as the e-mail title, the sender’s name, the sender’s e-mail address, and the sending date, which are helpful for classifying e-mails. In this paper, we apply a decision tree data mining technique to the header’s basic attributes to analyze the association rules of spam e-mails, and we propose an efficient spam filtering method to accurately identify spam and legitimate e-mails. In an experiment applying our spam filtering method to a large number of Chinese e-mails, we obtained the following results: an Accuracy of 96.5%, a Precision of 96.67%, and a Recall of 96.3%. Thus, the method proposed in this paper can efficiently identify spam e-mails by checking only the header sections, which reduces the computational cost.
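As a rough illustration only, and not the paper's implementation, the sketch below shows how the header attributes named in the abstract could be read with Python's standard `email` module, and how Accuracy, Precision, and Recall are conventionally computed from confusion-matrix counts. The function names, the sample message, the choice of spam as the positive class, and the toy counts are all assumptions made for this example; none of them come from the paper.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def header_features(raw_message: str) -> dict:
    """Read only the header attributes listed in the abstract (hypothetical helper)."""
    msg = message_from_string(raw_message)
    sender_name, sender_address = parseaddr(msg.get("From", ""))
    date_header = msg.get("Date")
    sent = parsedate_to_datetime(date_header) if date_header else None
    return {
        "title": msg.get("Subject", ""),
        "sender_name": sender_name,
        "sender_address": sender_address,
        "sending_hour": sent.hour if sent else None,
    }


def evaluate(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard metric definitions, assuming spam is the positive class."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all e-mails labelled correctly
        "precision": tp / (tp + fp),     # share of flagged e-mails that are spam
        "recall": tp / (tp + fn),        # share of spam e-mails that were flagged
    }


# Made-up message used only to exercise the helper.
SAMPLE = (
    "From: Alice Example <alice@example.com>\n"
    "Subject: Limited offer\n"
    "Date: Mon, 05 Jan 2009 03:15:00 +0800\n"
    "\n"
    "Body text is never inspected.\n"
)

features = header_features(SAMPLE)
metrics = evaluate(tp=8, fp=2, tn=9, fn=1)  # toy counts, not the paper's data
print(features)
print(metrics)
```

Features of this kind would then be fed to a decision tree learner; the body of the message is never parsed, which is where the reduced computational cost described in the abstract would come from.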

Similar resources

Spam Filtering Based On The Analysis Of Text Information Embedded Into Images

In recent years anti-spam filters have become necessary tools for Internet service providers to face up to the continuously growing spam phenomenon. Current server-side anti-spam filters are made up of several modules aimed at detecting different features of spam e-mails. In particular, text categorisation techniques have been investigated by researchers for the design of modules for the analys...

Image spam filtering using textual and visual information

In this paper we focus on so-called image spam, which consists in embedding the spam message into images attached to e-mails to circumvent statistical techniques based on the analysis of the body text of e-mails (like the “Bayesian filters”), and in applying content-obscuring techniques to such images to make them unreadable by standard OCR systems without compromising human readability. We arg...

Filtering Spam by Using Factors Hyperbolic Trees

Most current anti-spam techniques, like the Bayesian anti-spam algorithm, primarily use lexical matching to filter unsolicited bulk e-mails (UBE) and unsolicited commercial e-mails (UCE). However, the precision of spam filtering is usually low when lexical matching algorithms are used in real dynamic environments. For example, an e-mail of refrigerator advertisements is useful for most f...

Using cellular automata for improving knn based spam filtering

With the rapid growth of the Internet, electronic mail (e-mail) has become a popular communication tool. However, junk mail, also known as spam, has increasingly become a part of life for users as well as Internet service providers. To address this problem, many solutions have been proposed in the last decade. Currently, content-based anti-spam filtering methods are an important issue; the...

Personalized E-mail Filtering System Based on Usage Control

To cope with the sharp rise in spam, a personalized e-mail filtering method based on UCON is proposed. E-mails from different senders are classified as junk e-mail, suspicious e-mail, or normal e-mail by a trusted third party according to the maintained blacklist and online embedded machine learning technology. Suspicious e-mails will be classified further from users’ point of view ma...

Journal title:
  • I. J. Network Security

Volume 9  Issue

Pages  -

Publication date 2009