Knowledge distillation has attracted considerable interest in recent years because it compresses a large deep neural network (the teacher DNN) into a smaller one (the student DNN) while preserving much of the teacher's accuracy. Several improvements to knowledge distillation have been proposed recently. One such improvement is the teaching assistant method, which introduces an intermediate "teaching assistant" model between the teacher and the student: the assistant is first distilled from the teacher, and then serves as the teacher for the smaller student.
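To make the underlying mechanism concrete, the sketch below shows the standard distillation loss of Hinton et al. (2015), on which such methods build: the student is trained against a mixture of temperature-softened teacher outputs and the ground-truth labels. This is a minimal illustration assuming a PyTorch setting; the function name `distillation_loss` and the defaults `T=4.0` and `alpha=0.5` are illustrative choices, not values taken from this text.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Standard knowledge-distillation loss (Hinton et al., 2015).

    Mixes KL divergence between temperature-softened teacher and student
    distributions with ordinary cross-entropy on the hard labels.
    T and alpha are illustrative defaults, not prescribed values.
    """
    # Soft targets: KL(student_T || teacher_T). The T^2 factor keeps the
    # gradient magnitude comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: usual cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In the teaching assistant variant, this same loss is applied in two stages: first to distill the teacher into the mid-sized assistant, and then to distill the assistant into the student, which narrows the capacity gap the student must bridge at each step.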