The most widely studied problem in machine learning is supervised learning. We are given a labeled training set of input-output pairs, D = (xi, yi)i=1, and have to learn a way to predict the output or target ỹ for a novel test input x̃ (i.e, for x̃ 6∈ D). (We use the tilde notation to denote test cases that we have not seen before.) Some examples include: predicting if someone has cancer ỹ ∈ {0, ...