Decomposing the Web Graph into Parameterized Connected Components

Authors

  • Tomonari Masada
  • Atsuhiro Takasu
  • Jun Adachi

Abstract

We propose a novel method for Web page grouping based only on hyperlink information. Because of the explosive growth of the World Wide Web, page grouping is expected to provide a general grasp of the Web for effective information search and netsurfing. The Web can be regarded as a gigantic digraph where pages are vertices and links are arcs. Our grouping method is a generalization of decomposition into strongly connected components, in which each group is constructed as a subset of a strongly connected component. Moreover, group sizes can be controlled by adjusting a parameter, called the threshold parameter. We call the resulting groups parameterized connected components (PCCs). The algorithm is simple and admits parallelization. Notably, we apply Dijkstra’s shortest path algorithm in our grouping method. This paper also includes experimental results for 15 million Web pages, which show the contribution of our method to efficient Web surfer navigation.

Key words: link analysis, Web mining, strongly connected component
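
For orientation, the baseline that the abstract says is being generalized, decomposition of a digraph into strongly connected components, can be sketched as below. This is only an illustration under stated assumptions: it uses Kosaraju's two-pass algorithm, the function name `strongly_connected_components` and the toy page names are hypothetical, and it does not implement the threshold parameter or the Dijkstra-based step, whose details the abstract does not give.

```python
from collections import defaultdict


def strongly_connected_components(arcs):
    """Return the SCCs of a digraph given as (source, target) arcs.

    Plain Kosaraju decomposition; NOT the paper's PCC method.
    """
    graph = defaultdict(list)    # page -> pages it links to
    reverse = defaultdict(list)  # page -> pages linking to it
    vertices = set()
    for u, v in arcs:
        graph[u].append(v)
        reverse[v].append(u)
        vertices.update((u, v))

    # Pass 1: iterative DFS, recording vertices in order of completion.
    visited = set()
    order = []
    for start in vertices:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All outgoing arcs explored: the vertex is finished.
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing finish order;
    # each sweep collects exactly one strongly connected component.
    assigned = set()
    components = []
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component = {root}
        stack = [root]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    component.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


# Toy Web graph: pages are vertices, hyperlinks are arcs.
links = [("a", "b"), ("b", "c"), ("c", "a"),
         ("c", "d"), ("d", "e"), ("e", "d"), ("e", "f")]
groups = strongly_connected_components(links)
print(sorted(sorted(group) for group in groups))
```

According to the abstract, each PCC is a subset of one such component, so a decomposition like this gives the coarsest grouping that the threshold parameter would then refine.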

Similar resources

GPU-Based Graph Decomposition into Strongly Connected and Maximal End Components

This paper presents parallel algorithms for component decomposition of graph structures on General Purpose Graphics Processing Units (GPUs). In particular, we consider the problem of decomposing sparse graphs into strongly connected components, and decomposing stochastic games (such as Markov decision processes) into maximal end components. These problems are key ingredients of many (probabilis...

Decomposing Infinite 2-Connected Graphs into 3-Connected Components

In the 1960’s, Tutte presented a decomposition of a 2-connected finite graph into 3-connected graphs, cycles and bonds. This decomposition has been used to reduce problems on 2-connected graphs to problems on 3-connected graphs. Motivated by a problem concerning accumulation points of infinite planar graphs, we generalize Tutte’s decomposition to include all infinite 2-connected graphs.

On the existence of a connected component of a graph

We study the reverse mathematics and computability of countable graph theory, obtaining the following results. The principle that every countable graph has a connected component is equivalent to ACA0 over RCA0. The problem of decomposing a countable graph into connected components is strongly Weihrauch equivalent to the problem of finding a single component, and each is equivalent to its infini...

Efficient GPU algorithms for parallel decomposition of graphs into strongly connected and maximal end components

This article presents parallel algorithms for component decomposition of graph structures on general purpose graphics processing units (GPUs). In particular, we consider the problem of decomposing sparse graphs into strongly connected components, and decomposing graphs induced by stochastic games (such as Markov decision processes) into maximal end components. These problems are key ingredients...

Parameterized Complexity and Approximation Issues for the Colorful Components Problems

The quest for colorful components (connected components where each color is associated with at most one vertex) inside a vertex-colored graph has been widely considered in the last ten years. Here we consider two variants, Minimum Colorful Components (MCC) and Maximum Edges in transitive Closure (MEC), introduced in the context of orthology gene identification in bioinformatics. The input of bo...

Journal:
  • IEICE Transactions

Volume 87-D  Issue 

Pages  -

Publication date: 2004