Tokenization is the process of segmenting a piece of text into smaller units called tokens. Since Arabic is agglutinative by nature, tokenization becomes a crucial preprocessing step for many Natural Language Processing (NLP) applications such as morphological analysis, parsing, machine translation, and information extraction. In this article, we investigate the word tokenization tas...