Triangles are the basic substructure of networks, and triangle counting (TC) is a fundamental graph computing problem in numerous fields such as social network analysis. Nevertheless, like other graph computing problems, TC has a high memory-to-computation ratio and a random memory access pattern, so it involves a large amount of data transfer and suffers from the bandwidth bottleneck of the traditional von Neumann architecture. To ...