The effectiveness of query-specific hierarchic clustering in information retrieval
Abstract
Hierarchic document clustering has been widely applied to Information Retrieval (IR) on the grounds of its potential improved effectiveness over inverted file search. However, previous research has been inconclusive as to whether clustering does bring improvements. In this paper we take the view that if hierarchic clustering is applied to search results (query-specific clustering), then it has the potential to increase the retrieval effectiveness compared both to that of static clustering and of conventional inverted file search. We conducted a number of experiments using five document collections and four hierarchic clustering methods. Our results show that the effectiveness of query-specific clustering is indeed higher, and suggest that there is scope for its application to IR.
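The two-stage idea in the abstract (retrieve first, then cluster only the retrieved documents) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the raw term-frequency weighting, the cosine similarity, the single-link merging rule, the toy documents, and the function names `retrieve` and `single_link_hierarchy`. The abstract does not say which four clustering methods or which similarity measure the authors used.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (a deliberately crude weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, top_n):
    """First stage: rank every document against the query and keep
    the top_n documents that have a non-zero score."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(d)), i) for i, d in enumerate(docs)]
    scored = [(s, i) for s, i in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by index
    return [i for _, i in scored[:top_n]]


def single_link_hierarchy(doc_ids, docs):
    """Second stage: agglomerative single-link clustering of the retrieved
    documents only. Returns a nested tuple (binary tree) whose leaves are
    document indices, or None if doc_ids is empty."""
    vecs = {i: vectorize(docs[i]) for i in doc_ids}
    clusters = [(i, [i]) for i in doc_ids]  # (subtree, member ids)
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                # Single link: similarity of the closest pair of members.
                sim = max(cosine(vecs[a], vecs[b])
                          for a in clusters[x][1] for b in clusters[y][1])
                if best is None or sim > best[0]:
                    best = (sim, x, y)
        _, x, y = best
        merged = ((clusters[x][0], clusters[y][0]),
                  clusters[x][1] + clusters[y][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0][0] if clusters else None


# Hypothetical toy collection, used only to exercise the functions.
DOCS = [
    "hierarchic clustering of documents",
    "document clustering for retrieval",
    "inverted file search",
    "cooking pasta at home",
    "clustering search results",
]
hits = retrieve("clustering search", DOCS, top_n=4)
tree = single_link_hierarchy(hits, DOCS)
```

The hierarchy is rebuilt for each query from the retrieved subset alone, which is what distinguishes this from static clustering, where a single hierarchy is built once over the whole collection.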
Similar resources
The effectiveness of query-based hierarchic clustering of documents for information retrieval
Comparison of Different Distance Measures on Hierarchical Document Clustering in 2-Pass Retrieval
Hierarchic document clustering has been applied to search results (query-specific clustering) on the grounds of its potential improved effectiveness compared both to that of static clustering and of conventional inverted file search (IFS). In this paper we review and compare the effects of seven different measures of similarity among documents in hierarchic query-specific clustering. We have c...
Improving Quality of Clustering using Cellular Automata for Information retrieval
Clustering has been widely applied to Information Retrieval (IR) on the grounds of its potential improved effectiveness over inverted file search. Clustering is a mostly unsupervised procedure, and the majority of clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. A clustering quality measure is a function that, given a data set and it...
A Hierarchic Architecture for Conceptual Information Retrieval
Conceptual retrieval returns information related to a specific topic but not restricted to a query term. A common approach is to compare the query with all the documents in the database. When the number of documents is large, the searching time becomes significant. In this paper we propose a hierarchic architecture which integrates latent semantic indexing (LSI) and hierarchic agglomerative clustering ...
QEA: A New Systematic and Comprehensive Classification of Query Expansion Approaches
A major problem in information retrieval is the difficulty of defining the user's information needs; in addition, when a user submits a query, there is a vast amount of information to retrieve. Different methods have therefore been suggested for query expansion, which is concerned with reconfiguring the query to increase efficiency and improve the criterion accuracy in the information...
Journal: Inf. Process. Manage.
Volume: 38
Publication date: 2002