We propose a novel deep speaker representation learning method that considers perceptual similarity among speakers for multi-speaker generative modeling. Following its success in accurate discriminative modeling of speaker individuality, knowledge of deep speaker representation learning (i.e., speaker representation learning using neural networks) has been introduced to multi-speaker generative modeling. However, the conventional algorithm does not necessarily learn embeddings suitable for such modeling, which may resul...