Search results for: Hadoop

Number of results: 2553

The Hadoop MapReduce framework is an important distributed processing model for large-scale, data-intensive applications. The current Hadoop implementation and the rack-aware data placement strategy of the Hadoop Distributed File System assume a homogeneous cluster, in which each node has the same computing capacity and is assigned the same workload. Default Hadoop d...

Journal: PVLDB 2010
Jens Dittrich Jorge-Arnulfo Quiané-Ruiz Alekh Jindal Yagiz Kargin Vinay Setty Jörg Schad

MapReduce is a computing paradigm that has gained a lot of attention in recent years from industry and research. Unlike parallel DBMSs, MapReduce allows non-expert users to run complex analytical tasks over very large data sets on very large clusters and clouds. However, this comes at a price: MapReduce processes tasks in a scan-oriented fashion. Hence, the performance of Hadoop — an open-sourc...

2017
Louai Alarabi Mohamed F. Mokbel Mashaal Musleh

This paper presents ST-Hadoop, the first full-fledged open-source MapReduce framework with native support for spatio-temporal data. ST-Hadoop is a comprehensive extension to Hadoop and SpatialHadoop that injects spatio-temporal data awareness inside each of their layers, mainly the language, indexing, and operations layers. In the language layer, ST-Hadoop provides built-in spatio-temporal data t...

Journal: CoRR 2013
Woo-Cheol Kim Changryong Baek Dongwon Lee

In recent years, much research has focused on how to optimize Hadoop jobs. Their approaches are diverse, ranging from improving HDFS and Hadoop job scheduler to optimizing parameters in Hadoop configurations. Despite their success in improving the performance of Hadoop jobs, however, very little is known about the limit of their optimization performance. That is, how optimal is a given Hadoop o...

2013
Dominique A. Heger

Hadoop is a Java-based distributed computing framework designed to support applications implemented via the MapReduce programming model. In general, workload-dependent Hadoop performance optimization efforts have to focus on three major categories: the system hardware, the system software, and the configuration and tuning of the Hadoop infrastructure components. F...

2017
Guru Prasad Swathi Prabhu

Hadoop, the most popular open-source distributed computing framework, was designed by Doug Cutting and his team. It can use thousands of nodes to process and analyze the huge amounts of data known as Big Data. The major core components of Hadoop are HDFS (Hadoop Distributed File System) and MapReduce. This framework is the most popular and powerful for storing, managing, and processing Big Data appl...

Big data sizes are constantly increasing. Big data analytics applies advanced analytic techniques to big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open source, has not gone unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....

Journal: JNW 2013
Wenhui Lin Jun Liu

Research on Hadoop is an important part of the cloud computing industry, and Hadoop performance is a key research direction. Hadoop performance analysis, as foundational work, can provide an important reference for other performance optimization research. In this paper, based on previous research on server performance analysis, we propose a node performance measurement method on Hadoop. ...

Journal: International Journal of Computer Applications 2015

2013
Jiong Xie Yun Tian Shu Yin Ji Zhang Xiaojun Ruan Xiao Qin

MapReduce has become an important distributed processing model for large-scale data-intensive applications like data mining and web indexing. Hadoop, an open-source implementation of MapReduce, is widely used for short jobs requiring low response time. In this paper, we propose a new preshuffling strategy in Hadoop to reduce the high network loads imposed by shuffle-intensive applications. Designing...
