Graph neural networks (GNNs) have shown remarkable performance on diverse graph mining tasks. Although they share the same message passing framework, our study shows that different GNNs learn distinct knowledge from the same graph. This implies potential performance improvement by distilling the complementary knowledge of multiple models. However, knowledge distillation (KD) transfers knowledge from high-capacity teachers to a lightweight student, which deviates...