Current speech recognition systems uniformly employ short-time spectral analysis, usually over windows of 10–30 ms, as the basis for their acoustic representations. Any detail below this timescale is lost, and even temporal structure above this level is usually only weakly represented, for example in the form of delta features. We address this limitation by proposing a novel representation of the temporal envelo...