mapreduce

Search results for: mapreduce

Number of results: 3018

Comparing MapReduce-Based k-NN Similarity Joins on Hadoop for High-Dimensional Data

2017

Premysl Cech Jakub Marousek Jakub Lokoc Yasin N. Silva Jeremy Starks

Similarity joins represent a useful operator for data mining, data analysis and data exploration applications. With the exponential growth of data to be analyzed, distributed approaches like MapReduce are required. So far, the state-of-the-art similarity join approaches based on MapReduce mainly focused on the processing of low-dimensional vector data. In this paper, we revisit and investigate ...


Wireless MapReduce Distributed Computing

Journal: IEEE Transactions on Information Theory, 2019


Mapreduce dalam Layanan Transcoding [MapReduce in Transcoding Services]

Journal: Jurnal Teknologi Informasi dan Ilmu Komputer, 2023

Providing video files at varying bitrates is a primary requirement for Video On Demand services that implement adaptive streaming. This is achieved by transcoding, which produces multi-bitrate video. The multi-bitrate process ...


Accumulative Computation on MapReduce

Journal: IPSJ Online Transactions, 2014


Efficient and Flexible Index Access in MapReduce

2014

Zhao Cao Shimin Chen Dongzhe Ma Jianhua Feng Min Wang

A popular programming paradigm in the cloud, MapReduce is extensively considered and used for “big data” analysis. Unfortunately, a great many “big data” applications require capabilities beyond those originally intended by MapReduce, often burdening developers to write unnatural non-obvious MapReduce programs so as to twist the underlying system to meet the requirements. In this paper, we focu...


A Throughput Driven Task Scheduler for Batch Jobs in Shared MapReduce Environments

2014

Xite Wang Derong Shen Ge Yu Tiezheng Nie Yue Kou

MapReduce is one of the most popular parallel data processing systems, and it has been widely used in many fields. As one of the most important techniques in MapReduce, task scheduling strategy is directly related to the system performance. However, in multi-user shared MapReduce environments, the existing task scheduling algorithms cannot provide high system throughput when processing batch jo...


A New Parallelization Method for K-means

Journal: CoRR, 2016

Shikai Jin Yuxuan Cui Chunli Yu

K-means is a popular clustering method used in the data mining area. To work with large datasets, researchers propose PKMeans, which is a parallel k-means on MapReduce [3]. However, the existing k-means parallelization methods including PKMeans have many limitations. They can’t finish all their iterations in one MapReduce job, so they have to repeat cascading MapReduce jobs in a loop until convergence. On...
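The loop of cascading jobs mentioned in this abstract can be illustrated with a minimal in-memory sketch, in which each k-means iteration is one simulated map, shuffle, and reduce pass. This is not the PKMeans implementation, nor the method proposed in the paper; the function names (`kmeans_map`, `kmeans_reduce`, `run_job`, `kmeans`) and the convergence tolerance are assumptions chosen for illustration.

```python
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_map(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    distances = [squared_distance(point, c) for c in centroids]
    return distances.index(min(distances)), point


def kmeans_reduce(points):
    """Reduce step: average all points assigned to one centroid."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def run_job(points, centroids):
    """One simulated MapReduce job, i.e. one k-means iteration."""
    groups = defaultdict(list)  # stands in for the shuffle phase
    for point in points:
        key, value = kmeans_map(point, centroids)
        groups[key].append(value)
    # A centroid that received no points keeps its previous position.
    return [
        kmeans_reduce(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, max_jobs=100, tol=1e-9):
    """Repeat jobs until no centroid moves by more than tol (hypothetical threshold)."""
    jobs = 0
    for _ in range(max_jobs):
        new_centroids = run_job(points, centroids)
        jobs += 1
        shift = max(
            squared_distance(old, new)
            for old, new in zip(centroids, new_centroids)
        )
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids, jobs


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    final_centroids, jobs_run = kmeans(data, [(0.0, 0.0), (10.0, 0.0)])
    print(final_centroids, jobs_run)
```

In this sketch every pass through `run_job` corresponds to launching a separate job, which is the per-iteration cost the abstract identifies as a limitation.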


Using Realistic Simulation to Identify I/O Bottlenecks in MapReduce Setups

2009

Guanying Wang Ali R. Butt Prashant Pandey Karan Gupta

The exponentially growing data demands of modern enterprise and scientific applications pose critical challenges in sustaining the applications at scale. The MapReduce [1] programming model has served as the key enabler for executing resource-intensive applications over huge datasets. However, its configuration design-space has not been studied in detail. This is a complex problem as a typical...


ReStore: Reusing Results of MapReduce Jobs

Journal: PVLDB, 2012

Iman Elghandour Ashraf Aboulnaga

Analyzing large scale data has emerged as an important activity for many organizations in the past few years. This large scale data analysis is facilitated by the MapReduce programming and execution model and its implementations, most notably Hadoop. Users of MapReduce often have analysis tasks that are too complex to express as individual MapReduce jobs. Instead, they use high-level query lang...


SLO-Driven Right-Sizing and Resource Provisioning of MapReduce Jobs

2011

Abhishek Verma Ludmila Cherkasova Roy H. Campbell

(LADIS'2011), held in conjunction with VLDB'2011, Seattle, Washington, Sept. 2-3, 2011. HP Laboratories, HPL-2011-126. Keywords: MapReduce; Hadoop; performance models; completion time prediction; resource allocation. There is an increasing number of MapReduce applications, e.g., persona...
