Integrating model-based machine learning methods into deep neural architectures allows one to leverage both the expressive power of nets and ability incorporate domain-specific knowledge. In particular, many works have employed expectation maximization (EM) algorithm in form an unrolled layer-wise structure that is jointly trained with a backbone network. However, it difficult discriminatively ...