Achieving Order through CHAOS: the LLNL HPC Linux Cluster Experience

Authors

  • Ryan L. Braby
  • Jim E. Garlick
  • Robin J. Goldstone
Abstract

Since fall 2001, Livermore Computing at Lawrence Livermore National Laboratory has deployed 11 Intel IA32-based Linux clusters ranging in size up to 1154 nodes. All provide a common programming model and implement a similar cluster architecture. Hardware components are carefully selected for performance, usability, manageability, and reliability and are then integrated and supported using a strategy that evolved from practical experience. Livermore Computing Linux clusters run a common software environment that is developed and maintained in-house while drawing components and additional support from the open source community and industrial partnerships. The environment is based on Red Hat Linux and adds kernel modifications, cluster system management, monitoring and failure detection, resource management, authentication and access control, a development environment, and a parallel file system. The overall strategy has been successful and demonstrates that world-class high-performance computing resources can be built and maintained using commodity off-the-shelf hardware and open source software.

Similar resources

Paravirtualization for HPC Systems

In this work, we investigate the efficacy of using paravirtualizing software for performance-critical HPC kernels and applications. We present a comprehensive performance evaluation of Xen, a low-overhead, Linux-based, virtual machine monitor, for paravirtualization of HPC cluster systems at LLNL. We investigate subsystem and overall performance using a wide range of benchmarks and applications...

Paravirtualization for HPC Systems (UCSB Computer Science Technical Report Number 2006-10)

Virtualization has become increasingly popular for enabling full system isolation, load balancing, and hardware multiplexing. This widespread use is the result of novel techniques such as paravirtualization that make virtualization systems practical and efficient. Paravirtualizing systems export an interface that is slightly different from the underlying hardware but that significantly streaml...

It’s Magic: SourceMage GNU/Linux as HPC Cluster OS

The goal of the presentation is to give an overview of how to build a commodity PC-based GNU/Linux cluster for High Performance Computing (HPC) in a research environment. Due to the extreme flexibility of the GNU/Linux operating system and the large variety of hardware components, building a cluster for HPC is still a challenge in many cases. At the Division of I...

Obelisk: Summoning Minions on a HPC Cluster

In scientific research, having the ability to perform rigorous calculations in a bearable amount of time is an invaluable asset. Fortunately, the growing popularity of distributed systems at universities makes this a widely accessible resource. However, in order to use such computing resources, one must understand Linux, parallel computing, and distributed systems. Unfortunately, most people do...

Application Performance on the Tri-Lab Linux Capacity Cluster - TLCC

In a recent acquisition by DOE/NNSA, several large capacity computing clusters called TLCC have been installed at the DOE labs: SNL, LANL and LLNL. The TLCC architecture, with ccNUMA, multi-socket, multi-core nodes and an InfiniBand interconnect, is representative of the trend in HPC architectures. This chapter examines application performance on TLCC, contrasting it with Red Storm/Cray XT4. TLCC and ...

Publication date: 2003