The Itanium architecture contains many features to enhance parallel execution, such as an explicitly parallel (EPIC) instruction set, large register files, predication, and support for speculation. It also contains features such as register rotation to support efficient software pipelining of loops. Software-pipelining techniques have been shown to significantly improve the performance of loop-i...