Search results for: hadoop

Number of results: 2553

2011

The performance of three Hadoop applications is reported for several virtual configurations on VMware vSphere 5 and compared to native configurations. A well-balanced seven-node AMAX ClusterMax system was used to show that the average performance difference between native and the simplest virtualized configurations is only 4%. Further, the flexibility enabled by virtualization to create multipl...

2012
Gylfi Þór Gudmundsson Laurent Amsaleg Björn Þór Jónsson

This paper describes an initial study where the open-source Hadoop parallel and distributed run-time environment is used to speed up the construction phase of a large high-dimensional index. This paper first discusses the typical practical problems developers may run into when porting their code to Hadoop. It then presents early experimental results showing that the performance gains are substan...

2014
Indranil Gupta

Today, application schedulers are decoupled from routing level schedulers, leading to sub-optimal throughput for cloud computing platforms. In this thesis, we propose a cross-layer scheduling framework that bridges the application level scheduler with the routing level scheduler (SDN). We realize our framework in a batch-processing framework (Hadoop [1]) and a stream-processing framework (Storm ...

Journal: JCP 2013
Dan Wang Jilan Chen Wenbing Zhao

MapReduce is a software framework for easily writing applications that process vast amounts of data on large clusters of commodity hardware. In order to get better allocation of tasks and load balancing, the MapReduce work mode and task scheduling algorithm of the Hadoop platform are analyzed in this paper. According to this situation that the number of tasks of the smaller weight job is mo...

Journal: CoRR 2015
Martin Junghanns André Petermann Kevin Gómez Erhard Rahm

Many Big Data applications in business and science require the management and analysis of huge amounts of graph data. Previous approaches for graph analytics such as graph databases and parallel graph processing systems (e.g., Pregel) either lack sufficient scalability or flexibility and expressiveness. We are therefore developing a new end-to-end approach for graph data management and analysi...

2013
Alexander Schätzle Martin Przyjaciel-Zablocki Thomas Hornung Georg Lausen

In this paper we discuss PigSPARQL, a competitive yet easy-to-use SPARQL query processing system on MapReduce that allows ad hoc SPARQL query processing on large RDF graphs out of the box. Instead of a direct mapping, PigSPARQL uses the query language of Pig, a data analysis platform on top of Hadoop MapReduce, as an intermediate layer between SPARQL and MapReduce. This additional level of abstr...

2015
Qiuyi Tang Thomas C. Bressoud

With the rapid growth of technology, scientists have realized the challenge of efficiently analyzing large data sets since the beginning of the 21st century. Increases in data volume and data complexity shift scientists’ focus to parallel, distributed algorithms running on clusters. In 2004, Jeffrey Dean and Sanjay Ghemawat from Google introduced a new programming model to store and process large dat...

2014
Benjamin Jakobus Peter McBrien

This article presents benchmarking results of two benchmarking sets (run on small clusters of 6 and 9 nodes) applied to Hive and Pig running on Hadoop 0.14.1. The first set of results was obtained by replicating the Apache Pig benchmark published by the Apache Foundation on 11/07/07 (which served as a baseline to compare major Pig Latin releases). The second results were obtained by applying ...

2013
Myoungjin Kim Yun Cui Seungho Han Hanku Lee

In this paper, we propose a Hadoop-based Distributed Video Transcoding System in a cloud computing environment that transcodes various video codec formats into the MPEG-4 video format. This system provides various types of video content to heterogeneous devices such as smart phones, personal computers, television, and pads. We design and implement the system using the MapReduce framework, which...
