Self-supervised approaches for speech representation learning are challenged by three unique problems: (1) there are multiple sound units in each input utterance, (2) there is no lexicon of input sound units during the pre-training phase, and (3) sound units have variable lengths with no explicit segmentation. To deal with these three problems, we propose the Hidden-Unit BERT (HuBERT) approach for self-supervised speech representation learning, which utilizes an offline cluste...