Abstract In our data-flooded age, an enormous amount of redundant, but also disparate, textual data is collected on a daily basis on a wide variety of topics. Much of this information refers to documents related to the same theme, that is, different versions of the same document, or different documents discussing the same topic. Being aware of such differences turns out to be an important aspect for those who want to perform a comparative task. However, as increa...