Data available in software engineering for many applications contains variability and it is not possible to say which variable helps the process of prediction. Most work present defect prediction focused on selection best techniques. For this purpose, deep learning ensemble models have shown promising results. In contrast, there are very few researches that deals with cleaning training data par...