Decomposing the Web Graph into Parameterized Connected Components
Abstract
We propose a novel method for Web page grouping based only on hyperlink information. Because of the explosive growth of the World Wide Web, page grouping is expected to provide a general grasp of the Web for effective information search and netsurfing. The Web can be regarded as a gigantic digraph where pages are vertices and links are arcs. Our grouping method is a generalization of decomposition into strongly connected components, in which each group is constructed as a subset of a strongly connected component. Moreover, group sizes can be controlled by adjusting a parameter, called the threshold parameter. We call the resulting groups parameterized connected components (PCCs). The algorithm is simple and admits parallelization. Notably, we apply Dijkstra’s shortest path algorithm in our grouping method. This paper also includes experimental results for 15 million Web pages, which show the contribution of our method to efficient Web surfer navigation.
Key words: link analysis, Web mining, strongly connected component
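For orientation, the baseline that the abstract says is being generalized, decomposition of a digraph into strongly connected components, can be sketched as follows. This is only an illustration of that baseline using Kosaraju's two-pass algorithm; it is not the PCC method itself, which the abstract does not specify beyond its threshold parameter and its use of Dijkstra's algorithm. The function name `strongly_connected_components` and the toy link graph `web` are assumptions made for the example.

```python
from collections import defaultdict


def strongly_connected_components(graph):
    """Return the SCCs of a digraph given as {page: [linked pages]}.

    Baseline only (Kosaraju's algorithm); not the paper's PCC method.
    """
    # Collect every vertex, including pages that appear only as link targets.
    vertices = set(graph)
    for targets in graph.values():
        vertices.update(targets)

    # Pass 1: iterative DFS, recording vertices in order of completion.
    order, seen = [], set()
    for root in sorted(vertices):
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                # All outgoing links explored: the vertex is finished.
                order.append(node)
                stack.pop()

    # Reverse every arc.
    reverse = defaultdict(list)
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].append(src)

    # Pass 2: sweep the reversed graph in decreasing finish time.
    components, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, stack = [], [root]
        while stack:
            node = stack.pop()
            component.append(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(sorted(component))
    return components


# Hypothetical six-page link graph: pages are vertices, links are arcs.
web = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["d"], "f": ["a"]}
groups = strongly_connected_components(web)
```

According to the abstract, each PCC is a subset of one such component, so a decomposition like this gives the coarsest grouping that the threshold parameter would then refine.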
Similar resources
GPU-Based Graph Decomposition into Strongly Connected and Maximal End Components
This paper presents parallel algorithms for component decomposition of graph structures on General Purpose Graphics Processing Units (GPUs). In particular, we consider the problem of decomposing sparse graphs into strongly connected components, and decomposing stochastic games (such as Markov decision processes) into maximal end components. These problems are key ingredients of many (probabilis...
Decomposing Infinite 2-Connected Graphs into 3-Connected Components
In the 1960s, Tutte presented a decomposition of a 2-connected finite graph into 3-connected graphs, cycles and bonds. This decomposition has been used to reduce problems on 2-connected graphs to problems on 3-connected graphs. Motivated by a problem concerning accumulation points of infinite planar graphs, we generalize Tutte’s decomposition to include all infinite 2-connected graphs.
On the existence of a connected component of a graph
We study the reverse mathematics and computability of countable graph theory, obtaining the following results. The principle that every countable graph has a connected component is equivalent to ACA0 over RCA0. The problem of decomposing a countable graph into connected components is strongly Weihrauch equivalent to the problem of finding a single component, and each is equivalent to its infini...
Efficient GPU algorithms for parallel decomposition of graphs into strongly connected and maximal end components
This article presents parallel algorithms for component decomposition of graph structures on general purpose graphics processing units (GPUs). In particular, we consider the problem of decomposing sparse graphs into strongly connected components, and decomposing graphs induced by stochastic games (such as Markov decision processes) into maximal end components. These problems are key ingredients...
Parameterized Complexity and Approximation Issues for the Colorful Components Problems
The quest for colorful components (connected components where each color is associated with at most one vertex) inside a vertex-colored graph has been widely considered in the last ten years. Here we consider two variants, Minimum Colorful Components (MCC) and Maximum Edges in transitive Closure (MEC), introduced in the context of orthology gene identification in bioinformatics. The input of bo...
Journal title:
- IEICE Transactions
Volume 87-D, issue
Pages -
Publication date 2004