As we saw in the previous lecture, solving this optimization problem yields a linear classifier of the form y = sign(w · h(x) + w0) that balances two goals: keeping the hinge loss small, which penalizes points that are misclassified or fall inside the margin, and maximizing the margin (the distance from the decision boundary to the closest point). The term “support vector” refers to the training points closest to the decision boundary, which are the points that determine where the boundary lies. Note that moving a...
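To make the quantities in this paragraph concrete, here is a minimal NumPy sketch that evaluates a given linear classifier. It is an illustration under stated assumptions, not the solver from the lecture: the feature map h is taken to be the identity, the four data points are made up, and the weights w and w0 are hand-picked rather than obtained by solving the optimization. The function names are hypothetical.

```python
import numpy as np

def decision_scores(w, w0, H):
    """Return w . h(x) + w0 for every row h(x) of H."""
    return H @ w + w0

def predict(w, w0, H):
    """Return labels in {-1, +1}; a score of exactly 0 is assigned +1."""
    return np.where(decision_scores(w, w0, H) >= 0, 1, -1)

def hinge_losses(w, w0, H, y):
    """Per-point hinge loss max(0, 1 - y * (w . h(x) + w0))."""
    return np.maximum(0.0, 1.0 - y * decision_scores(w, w0, H))

def geometric_margin(w, w0, H):
    """Distance from the decision boundary to the closest point."""
    return np.min(np.abs(decision_scores(w, w0, H))) / np.linalg.norm(w)

# Made-up, linearly separable data; h(x) = x is an assumption.
H = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])

# Hand-picked parameters (assumed, not the output of an optimizer).
w = np.array([0.25, 0.25])
w0 = 0.0

labels = predict(w, w0, H)
losses = hinge_losses(w, w0, H, y)
margin = geometric_margin(w, w0, H)

# Points with y * score == 1 lie exactly on the margin; these are the
# closest points to the boundary for this choice of w and w0.
support_idx = np.flatnonzero(np.isclose(y * decision_scores(w, w0, H), 1.0))

print("labels:", labels)
print("hinge losses:", losses)
print("margin:", margin)
print("points on the margin:", support_idx)
```

In this toy setup every point is classified correctly and sits on or outside the margin, so all hinge losses are zero, and the points on the margin are the ones a max-margin solution would depend on.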