In this work, we study the $k$-means cost function. The (Euclidean) $k$-means problem can be described as follows: given a dataset $X \subseteq \mathbb{R}^d$ and a positive integer $k$, find a set of $k$ centers $C \subseteq \mathbb{R}^d$ such that $\Phi(C, X) \stackrel{\text{def}}{=} \sum_{x \in X} \min_{c \in C} \|x - c\|^2$ is minimized. Let $\Delta_k(X) \stackrel{\text{def}}{=} \min_{C \subseteq \mathbb{R}^d,\, |C| = k} \Phi(C, X)$ denote the cost of the optimal $k$-means solution. It is simple to observe that for any dataset $X$, $\Delta_k(X)$ decreases as ...
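
To make the two definitions concrete, the following is a minimal sketch in plain Python, not an implementation taken from this work. The function names `kmeans_cost` and `optimal_cost_bruteforce` and the toy dataset are assumptions introduced purely for illustration. The brute-force routine relies on the standard fact that, for a fixed cluster, the mean minimizes the sum of squared distances, so the optimal cost can be found by enumerating assignments of points to at most k clusters. This is only feasible for very small inputs.

```python
from itertools import product


def kmeans_cost(centers, points):
    """Phi(C, X): sum over points of the squared distance to the nearest center."""
    return sum(
        min(sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers)
        for x in points
    )


def optimal_cost_bruteforce(points, k):
    """Delta_k(X) by exhaustive search over assignments (tiny inputs only).

    Hypothetical helper: tries every labeling of the points with k labels
    and uses the mean of each non-empty cluster as its center.
    """
    best = float("inf")
    for labels in product(range(k), repeat=len(points)):
        centers = []
        for j in set(labels):
            members = [p for p, l in zip(points, labels) if l == j]
            # Coordinate-wise mean of the cluster members.
            centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
        best = min(best, kmeans_cost(centers, points))
    return best


# Toy dataset (an assumption for illustration): two tight pairs far apart.
X = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
costs = [optimal_cost_bruteforce(X, k) for k in range(1, 5)]
print(costs)
```

The search takes time exponential in the number of points, so it serves only as a way to check the definitions on small examples.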