Optimization and Parallelization Experiences Using Hardware Performance Counters
Abstract
Current hardware for compute-intensive tasks includes a large number of processing facilities, which are sometimes hard to use in an optimized way. High performance computing (HPC) always focuses on solving challenging (or, at least, compute-intensive) problems for which response time is the priority. We have been working on two different but usually complementary research problems: a) updating and parallelizing legacy (HPC/numerical) software, and b) analyzing different problems and approaches to optimization and parallel processing in clusters. We have found that raw hardware event counters do not always directly provide useful information. We have also found some guidelines for evaluating performance using those counters in the context of optimization and parallelization. In this article, we present those guidelines along with the performance evaluation tools that we used to determine objectively which parts of the algorithm offered better chances of improvement.
Related resources
Optimization and Parallelization Experiences Using Hardware Performance Counters
Current hardware for compute-intensive tasks includes a large number of processing facilities, which are sometimes hard to use in an optimized way. High performance computing (HPC) always focuses on solving grand challenge (or, at least, compute-intensive) problems for which response time is the priority. We have been working on two different but usually complementary research problems: ...
Experiences and Lessons Learned with a Portable Interface to Hardware Performance Counters
The PAPI project has defined and implemented a cross-platform interface to the hardware counters available on most modern microprocessors. The interface has gained widespread use and acceptance from hardware vendors, users, and tool developers. This paper reports on experiences with the community-based open-source effort to define the PAPI specification and implement it on a variety of platforms...
Obtaining Hardware Performance Metrics for the BlueGene/L Supercomputer
Hardware performance monitoring is the basis of modern performance analysis tools for application optimization. We are interested in providing such performance analysis tools for the new BlueGene/L supercomputer as early as possible, so that applications can be tuned for that machine. We are faced with two challenges in achieving that goal. First, the machine is still going through its final de...
Exploiting performance counters to predict and improve energy performance of HPC systems
Hardware monitoring through performance counters is available on almost all modern processors. Although these counters were originally designed for performance tuning, they have also been used for evaluating power consumption. We propose two approaches for modelling and understanding the behaviour of high performance computing (HPC) systems relying on hardware monitoring counters. We evaluate th...
Microarchitectural Characterization of Production JVMs and Java Workloads
Understanding and comparing Java Virtual Machine (JVM) performance at a microarchitectural level can identify JVM performance anomalies and potential opportunities for optimization. The two primary tools for microarchitectural performance analysis are hardware performance counters and cycle-accurate simulators. Unfortunately, the nondeterminism, complexity, and size of modern JVMs make these to...
Publication date: 2014