Virtualizing Lifemapper Software Infrastructure for Biodiversity Expedition

Authors

  • Nadya Williams
  • Aimee Stewart
  • Philip M. Papadopoulos
Abstract

One of the activities of the Pacific Rim Applications and Grid Middleware Assembly (PRAGMA) is fostering Virtual Biodiversity Expeditions (VBEs) by bringing domain scientists and cyberinfrastructure specialists together as a team. Over the past few years, PRAGMA members have been collaborating on virtualizing the Lifemapper software. Virtualization and cloud computing have introduced great flexibility and efficiency into IT projects. Virtualization provides application scalability, maximizes resource utilization, and creates a more efficient, agile, and automated infrastructure. However, the complexity inherent in these environments has downsides, including the need for special techniques to deploy cluster hosts, dependence on virtual environments, and challenging application installation, management, and configuration. In this paper, we report on the progress of the Lifemapper virtualization framework, which focuses on a reproducible and highly configurable infrastructure capable of fast deployment. Lifemapper is a complex biological software infrastructure developed by the Biodiversity Institute at The University of Kansas. It creates and maintains an archive of species distribution maps calculated from public specimen data, along with a suite of data and tools that biodiversity researchers use to calculate single- and multi-species distribution predictions and macroecological analyses. Our goal is to create a viable virtualization solution that can be easily adopted and reused by scientists at multiple institutions and projects. This solution 1) allows fast deployment of ready-made cluster images; 2) reproduces the complete Lifemapper processing pipeline on demand at multiple sites and in different hosting environments; and 3) enables scientists to perform Lifemapper-facilitated data processing on restricted-use data, very large datasets, or other unique data.
A key contribution of this work is describing the practical experience of taking a complex, clustered, domain-specific data analysis and simulation system and making it operable on a variety of system configurations. Uses of this portability range from whole-cluster replication to teaching and experimentation on a single laptop. System virtualization is used to define, in practice, and make portable the full application stack, including its complex set of supporting software.

Journal:
  • Concurrency and Computation: Practice and Experience

Volume 29

Publication date: 2017