Consider a setup in which one wants to apply ASR in the presence of background music, as is often the case with broadcast news or documentary programmes. The speech in these programmes is of good quality, and a human has no problem understanding it. An ASR system, however, although very robust against moderate white noise, fails terribly when the noise is more structured, since music disturbs the frequency spe...