Parallel compositing techniques have traditionally focused on distributed memory architectures, where communication of pixel values is usually the main bottleneck. On shared memory architectures, communication is handled through memory accesses, obviating the need for explicit communication steps. Shared memory architectures with multiple graphics accelerators provide the capability for parall...