Scalable Clustering of Categorical Data and Applications

Periklis Andritsos
Doctor of Philosophy
Graduate Department of Computer Science
University of Toronto
2004

Clustering is widely used to explore and understand large collections of data. In this thesis, we introduce LIMBO, a scalable hierarchical categorical clustering algorithm based on the Information Bottleneck (IB) framework for quanti...