Abstract Learning discriminative representations with deep neural networks often relies on massive labeled data, which is expensive and difficult to obtain in many real scenarios. As an alternative, self-supervised learning that leverages input itself as supervision strongly preferred for its soaring performance visual representation learning. This paper introduces a contrastive framework gener...