We study algorithms for approximating pairwise similarity matrices that arise in natural language processing. In general, computing a similarity matrix over n data points requires Omega(n^2) similarity computations. This quadratic scaling is a significant bottleneck, especially when similarities are computed via expensive functions, e.g., transformer models. Approximation methods reduce this complexity, often by using small...
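To make the quadratic cost concrete, the following is a minimal sketch of exact pairwise similarity computation; cosine similarity stands in for an expensive model-based similarity function, and the names here are illustrative, not from any particular implementation:

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def similarity_matrix(points):
    # Exact n x n similarity matrix; exploits symmetry but
    # still performs n(n+1)/2 = Theta(n^2) similarity calls.
    n = len(points)
    S = [[0.0] * n for _ in range(n)]
    calls = 0
    for i in range(n):
        for j in range(i, n):
            S[i][j] = S[j][i] = cosine(points[i], points[j])
            calls += 1
    return S, calls

pts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
S, calls = similarity_matrix(pts)
# For n = 4 points, the symmetric matrix still costs 10 similarity calls;
# with a transformer-based similarity, each call is expensive, so the
# Theta(n^2) call count dominates the runtime as n grows.
```

Even with symmetry halving the work, the call count grows quadratically in n, which is the bottleneck the approximation methods above aim to avoid.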