Search results for: optimize communication

Number of results: 407575

2014
Khalid Hasanov Jean-Noël Quintin Alexey L. Lastovetsky

There has been significant research on collective communication operations, in particular MPI broadcast, on distributed memory platforms. Most of this research optimizes collective operations for particular architectures by taking into account either their topology or platform parameters. In this work we propose a very simple and at the same time general approach to opt...

2008
Michael M. Wolf Erik G. Boman Bruce A. Hendrickson

Sparse matrix times vector multiplication is an important kernel in scientific computing. We study how to optimize the performance of this operation in parallel by reducing communication. We review existing approaches and present a new partitioning method for symmetric matrices. Our method is simple and can be implemented using existing software for hypergraph partitioning. Experimental results...

2001
Karthik Rajan Jan Harkes

An advanced distributed file system like Coda is an integral part of today’s mobile environment. This is due to the limitations of having large storage capacities on small mobile devices, finite battery life and the ubiquitous nature of wireless networking today. However, systems such as Coda do not perform optimally on low-bandwidth connections such as low-speed, error-prone wireless links and ...

2006
Aaron Becker Abhinav S Bhatele Chao Mei

We have taken leanMD, a molecular dynamics application developed at PPL, and tried to optimize its performance on the Blue Gene architecture. We examined the sequential and parallel performance of leanMD both in terms of execution time and memory usage. We identified the hot spots which restrict sequential performance. To boost parallel performance, we attempted to optimize the calculation of l...

2013
Tan Nguyen Scott B. Baden

We discuss our experience in using Bamboo to automatically optimize a stencil method on an Intel Xeon Phi-based cluster. We describe our solutions to three challenges: tolerating the high cost of inter-node communication, mapping program parallelism to multicore and many-core processors, and balancing workloads on-node across heterogeneous resources. We present results on TACC’s Stampede system...

2005

Often signals and system parameters are most conveniently represented as complex-valued vectors. This occurs, for example, in array processing [1], as well as in communication systems [7] when processing narrowband signals using the equivalent complex baseband representation [2]. Furthermore, in many important applications one attempts to optimize a scalar real-valued measure of performance ove...

2003
Lin Xiao Mikael Johansson Haitham A. Hindi Stephen P. Boyd Andrea J. Goldsmith

We consider a linear system, such as an estimator or a controller, in which several signals are transmitted over wireless communication channels. With the coding and medium access schemes of the communication system fixed, the achievable bit rates are determined by the allocation of communications resources such as transmit powers and bandwidths, to different channels. Assuming conventional uni...

1996
Michèle Dion Cyril Randriamaro Yves Robert

Minimizing communications when mapping affine loop nests onto distributed memory parallel computers has already drawn a lot of attention. This paper focuses on the next step: as it is generally impossible to obtain a communication-free (or local) mapping, how to optimize the residual communications? We explain how to take advantage of macro-communications such as broadcasts, scatters, gathers ...

1994
Jürgen Dorn Roger M. Kerr

A communication procedure for communicating scheduling expert systems based on fuzzy set theory is proposed. Fuzzy sets are used to express and to exchange constraints and their possible relaxations with other scheduling systems that can interpret these constraints. The procedure is intended to optimize the global evaluation among the communicating systems. An example from the steel industry is tak...

2009
Abhinav Bhatelé Laxmikant V. Kalé Nicholas Chen Ralph E. Johnson

Obtaining the best performance from a parallel program involves four important steps: 1. Choosing the appropriate grain size; 2. Balancing computational and communication load across processors; 3. Optimizing communication by minimizing interprocessor communication and overlapping communication with computation; and 4. Minimizing communication traffic on the network by topology-aware mapping. In...
