A two-level scheduling method: an effective parallelizing technique for uniform nested loops on a DSP multiprocessor
Authors
Abstract
A digital signal processor (DSP) is a special-purpose microprocessor designed to achieve higher performance on DSP applications. Because most DSP applications contain many nested loops and permit a very high degree of parallelism, the DSP multiprocessor is a suitable architecture for executing them. Unfortunately, conventional scheduling methods used on DSP multiprocessors allocate only one operation to each DSP per time unit, even if the DSP includes several function units that can operate in parallel, so they cannot achieve full function-unit utilization. In this paper, we propose a two-level scheduling method (TSM) to overcome this common failing. TSM comprises two approaches, which integrate unimodular transformations, the loop tiling technique, and conventional methods used on a single DSP. Besides introducing the algorithm, we use an analytic module to analyze its preliminary performance. Based on our analyses, the TSM can achieve shorter execution time and more scalable speedup. In addition, the TSM incurs lower memory access and synchronization overheads, which are usually negligible in the DSP multiprocessor architecture. © 2004 Elsevier Inc. All rights reserved.
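The abstract names loop tiling as one ingredient of TSM. The following is a minimal sketch of rectangular tiling on a two-level uniform-dependence loop nest, not the TSM algorithm itself. The recurrence, the function names (`run_untiled`, `run_tiled`), and the tile sizes are illustrative assumptions that do not come from the paper.

```python
# Hypothetical example: a 2-level loop nest with uniform dependence
# vectors (1, 0) and (0, 1). None of these names come from the paper.

def run_untiled(n, m):
    """Execute the loop nest in its original row-major order."""
    a = [[1] * (m + 1) for _ in range(n + 1)]  # row 0 and column 0 are boundary values
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def run_tiled(n, m, tile_i, tile_j):
    """Execute the same iterations grouped into tile_i x tile_j tiles.

    Rectangular tiling is legal here because every component of both
    dependence vectors is non-negative, so each tile only reads values
    produced by tiles that were executed earlier.
    """
    a = [[1] * (m + 1) for _ in range(n + 1)]
    for ii in range(1, n + 1, tile_i):          # loop over tile origins
        for jj in range(1, m + 1, tile_j):
            for i in range(ii, min(ii + tile_i, n + 1)):      # loop inside one tile
                for j in range(jj, min(jj + tile_j, m + 1)):
                    a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


if __name__ == "__main__":
    # The tiled order must produce the same array as the original order.
    print(run_tiled(6, 5, 4, 3) == run_untiled(6, 5))
```

In this sketch, tiles on the same anti-diagonal of the tile grid do not depend on one another, so a scheduler could in principle assign them to different processors while each processor schedules the iterations inside its own tile.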
Similar resources
A Release Combined Scheduling Scheme for Non-Uniform Dependence Loops
In general, synchronization mechanisms can be used to preserve dependence constraints in any nested loop, and can be combined with a loop scheduling scheme to form a uniform framework to obtain the correct execution order and balance workload distribution. Most current scheduling mechanisms cannot handle non-uniform dependence loops. In this paper, we propose a new combined scheduling scheme ca...
Minimization of Memory Access Overhead for Multi-dimensional DSP Applications via Multi-level Partitioning and Scheduling
Massive uniform nested loops are broadly used in multi-dimensional DSP applications. Due to the large amount of data handled by such applications, the optimization of data accesses by fully utilizing the local memory and minimizing communication overhead is important in order to improve the overall system performance. Most of the traditional partition strategies do not consider the effect of dat...
Using knowledge-based systems for research on parallelizing compilers
The main function of parallelizing compilers is to analyze sequential programs, in particular the loop structure, to detect hidden parallelism and automatically restructure sequential programs into parallel subtasks that are executed on a multiprocessor. This article describes the design and implementation of an efficient parallelizing compiler to parallelize loops and achieve high speedup rate...
Chain Pattern Scheduling for nested loops
It is well known that most time-consuming applications consist of nested DO (FOR) loops. The iterations within a loop nest are either independent iterations or precedence-constrained iterations. Furthermore, the precedence constraints can be uniform (constant) or non-uniform throughout the execution of the program. The index space of a uniform dependence loop, due to the existence of depende...
Tiling and Scheduling of Three-level Perfectly Nested Loops with Dependencies on Heterogeneous Systems
Nested loops are one of the most time-consuming parts and the largest sources of parallelism in many scientific applications. In this paper, we address the problem of 3-dimensional tiling and scheduling of three-level perfectly nested loops with dependencies on heterogeneous systems. To exploit the parallelism, we tile and schedule nested loops with dependencies by awareness of computational po...
Journal title:
- Journal of Systems and Software
Volume 75, Issue
Pages -
Publication date: 2005