The availability of an increasing number of full genome sequences and of high-performance computing provides the basis for analyzing global properties of the genomic text. Methods designed to study linguistic properties of nucleotide sequences have been developed extensively. However, the rules of genome construction remain to be discovered. Chargaff experimentally determined A = T and G = C equim...