An Effective Metric Measuring the Degree of Web Page Changes by Reliance Evaluation

Authors

  • Shin Young Kwon
  • Sung Jin Kim
  • Sang Ho Lee

Abstract

A number of document similarity metrics have been used to measure the degree of web page changes. When a web page changes, the metrics often represent the change differently. In this paper, we first define criteria for evaluating the effectiveness of the metrics in terms of six important types of web page changes. The criteria satisfy ISO/IEC 15408 (CC). Second, we propose a new document similarity metric appropriate for measuring the degree of web page changes. In our experiment, we evaluate five existing metrics (byte-wise comparison, TF·IDF cosine distance, word distance, edit distance, and shingling) and ours under the proposed criteria. The experimental results show that our metric represents the changes more effectively than the other metrics. This work was supported by a Korea Research Foundation Grant (KRF-2004-005-D00172).

Similar resources

Metrics for Electronic data Representation into the Audio Space

We discuss the representation of electronic data, especially the content of web pages, in the audio space. We define two metrics usable for comparing two distinct audio representations. The first metric is connected with the perception of data changes by reading users. Changes in a web page that are not important for the reader compared to the previous version are called small changes on the web...

Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, it is not a simple task to download domain-specific web pages. This unfocused approach often shows undesired results. Therefore, several new ideas have been proposed; among them, a key technique is focused crawling, which is able to crawl particular topical...

AHP Techniques for Trust Evaluation in Semantic Web

The increasing reliance on information gathered from the web and other internet technologies raises the issue of trust. With the development of the semantic web, one major difficulty is that, by its very nature, the semantic web is a large, uncensored system to which anyone may contribute. This raises the question of how much credence to give each resource. Each user knows the trustworthiness of ...

Anomaly Detection on the Web by Creating Access Usage Profiles

Due to the increase in cyber-attacks, the need for web server attack detection techniques has drawn attention. Unfortunately, many available security solutions are inefficient in identifying web-based attacks. The main aim of this study is to detect abnormal web navigations based on web usage profiles. In this paper, comparing the scrolling behavior of a normal user with that of an attacker, and simu...

Publication date: 2005