AI models of code have made significant progress over the last few years. However, many are actually not learning task-relevant source features. Instead, they often fit non-relevant but correlated data, leading to a lack robustness and generalizability, limiting subsequent practical use such models. In this work, we focus on improving model quality through signal awareness , i.e., relevant sign...