Low-bitwidth integer arithmetic has been widely adopted in hardware implementations of deep neural network inference applications. However, despite the promised energy-efficiency improvements for demanding edge applications, the use of low-bitwidth integer arithmetic for training remains limited. Unlike inference, training demands high dynamic range and numerical accuracy for quality results, making low-bitwidth integer training particularly challenging. To addr...