Mining WWW Access Sequence by Matrix Clustering

Authors

  • Shigeru Oyanagi
  • Kazuto Kubota
  • Akihiko Nakase
Abstract

Sequence pattern mining is one of the most important methods for mining WWW access logs. The Apriori algorithm is well known as a typical algorithm for sequence pattern mining. However, it has inherent difficulty finding long sequential patterns and extracting interesting patterns from a huge number of results. This article proposes a new method for finding generalized sequence patterns by matrix clustering. The method decomposes a sequence into a set of sequence elements, each of which corresponds to an ordered pair of items. Matrix clustering is then applied to extract a cluster of similar sequences, and the resulting sequence elements are composed into a generalized sequence. The method is evaluated on a real WWW access log; the evaluation shows that it is practically useful for finding long sequences and for presenting the generalized sequence as a graph.
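The three steps named in the abstract (decompose, cluster, compose) can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details from the article: sequence elements are taken to be pairs of consecutive pages, the clustering step is a naive alternating row/column refinement used only as a stand-in for the authors' matrix clustering algorithm, and all function names and the `min_share` parameter are hypothetical.

```python
def decompose(sequence):
    """Split one access sequence into sequence elements.

    Assumption: an element is an ordered pair of consecutive pages.
    """
    return set(zip(sequence, sequence[1:]))


def dense_cluster(rows, seed_row, min_share=0.6, max_iter=20):
    """Naive stand-in for matrix clustering (not the authors' algorithm).

    `rows` is a sparse binary matrix: one set of sequence elements per
    session. Starting from one seed session, alternate between keeping the
    columns shared by most selected rows and the rows covering most
    selected columns, until the selection stops changing.
    """
    selected_rows = {seed_row}
    selected_cols = set(rows[seed_row])
    for _ in range(max_iter):
        counts = {}
        for r in selected_rows:
            for c in rows[r]:
                counts[c] = counts.get(c, 0) + 1
        new_cols = {c for c, n in counts.items()
                    if n >= min_share * len(selected_rows)}
        new_rows = {i for i, r in enumerate(rows)
                    if new_cols and len(r & new_cols) >= min_share * len(new_cols)}
        if new_rows == selected_rows and new_cols == selected_cols:
            break
        selected_rows, selected_cols = new_rows, new_cols
    return selected_rows, selected_cols


def compose(elements):
    """Compose sequence elements into a graph (page -> list of next pages)."""
    graph = {}
    for a, b in sorted(elements):
        graph.setdefault(a, []).append(b)
    return graph


# Toy sessions (made-up page names, for illustration only).
sessions = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "D", "E"],
    ["A", "B", "C"],
    ["X", "Y"],
]
rows = [decompose(s) for s in sessions]
cluster_rows, cluster_cols = dense_cluster(rows, seed_row=0)
generalized = compose(cluster_cols)
print(sorted(cluster_rows), generalized)
```

Representing each matrix row as a set of elements keeps the sparse binary matrix implicit, which is convenient for a sketch; a real access log would likely need a proper sparse-matrix representation and the article's own clustering procedure.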

Similar resources

Application of Matrix Clustering to Web Log Analysis and Access Prediction

Matrix clustering is a new data mining method which extracts a dense sub-matrix from a large sparse binary matrix. We propose an efficient algorithm named the ping-pong algorithm which enables real-time mining of a large sparse matrix. This article describes the application of matrix clustering to Web usage mining. Matrix clustering can be applied to Web access log analysis by representing relati...

Similarity Matrix Based Session Clustering by Sequence Alignment Using Dynamic Programming

With the rapidly increasing popularity of the WWW, Web sites are playing a crucial role in conveying knowledge to end users. Every request to a Web site, or transaction on the server, is stored in a file called the server log file. Providing Web administrators with meaningful information about user access behavior (also called click stream data) has become a necessity to improve the quality of Web infor...

Web Page Access Prediction Using Fuzzy Clustering by Local Approximation Memberships (flame) Algorithm

Web page prediction is a web usage mining technique used to predict the next set of web pages that a user may visit, based on knowledge of previously visited web pages. The World Wide Web (WWW) is a popular and interactive medium for publishing information. While browsing the web, users visit many unwanted pages instead of the targeted page. The web usage mining techniques are used...

SOM Improved Neural Network Approach for Next Page Prediction

The increasing usage of the web results in heavy communication and slow responses from the web. Because of this, approaches are required to optimize the usage of web resources. One such approach is caching, which can be used within an organization to optimize access to frequently used web pages. Caching aims to predict the next web access of a user and load it in...

Effective Image Mining by Representing Color Histograms as Time Series

Due to the wide spread of digital libraries, digital cameras, and the increased access to the WWW by individuals, the number of digital images that exist poses a great challenge. Easy access to such collections requires an index structure to facilitate random access to individual images and ease navigation of these images. As these images are not annotated or associated with descriptions, existing sy...

Publication date: 2002