subject to $q(f|u) = \prod_i q(f_i|u)$ and $\int df_i\, q(f_i|u) = 1$. Note that KL(a||b) measures the information “lost” when b is used to approximate a. It was argued in [1] that this KL divergence is an appropriate approximation measure, since we seek a sparse representation u, together with its relationship to f, such that q approximates p. The KL divergence above can be expanded a...