Search results for: mapreduce

Number of results: 3018

2017
Premysl Cech Jakub Marousek Jakub Lokoc Yasin N. Silva Jeremy Starks

Similarity joins represent a useful operator for data mining, data analysis and data exploration applications. With the exponential growth of data to be analyzed, distributed approaches like MapReduce are required. So far, the state-of-the-art similarity join approaches based on MapReduce mainly focused on the processing of low-dimensional vector data. In this paper, we revisit and investigate ...

Journal: IEEE Transactions on Information Theory 2019

Journal: Jurnal Teknologi Informasi dan Ilmu Komputer 2023

Providing video files at varying bitrates is a primary requirement for Video On Demand services that implement adaptive streaming. This is achieved through transcoding, which produces multi-bitrate video. The multi-bitrate process ...

Journal: IPSJ Online Transactions 2014

2014
Zhao Cao Shimin Chen Dongzhe Ma Jianhua Feng Min Wang

A popular programming paradigm in the cloud, MapReduce is extensively considered and used for “big data” analysis. Unfortunately, a great many “big data” applications require capabilities beyond those originally intended by MapReduce, often burdening developers to write unnatural non-obvious MapReduce programs so as to twist the underlying system to meet the requirements. In this paper, we focu...

2014
Xite Wang Derong Shen Ge Yu Tiezheng Nie Yue Kou

MapReduce is one of the most popular parallel data processing systems, and it has been widely used in many fields. As one of the most important techniques in MapReduce, task scheduling strategy is directly related to the system performance. However, in multi-user shared MapReduce environments, the existing task scheduling algorithms cannot provide high system throughput when processing batch jo...

Journal: CoRR 2016
Shikai Jin Yuxuan Cui Chunli Yu

K-means is a popular clustering method used in the data mining area. To work with large datasets, researchers propose PKMeans, which is a parallel k-means on MapReduce [3]. However, the existing k-means parallelization methods including PKMeans have many limitations. It can’t finish all its iterations in one MapReduce job, so it has to repeat cascading MapReduce jobs in a loop until convergence. On...

2009
Guanying Wang Ali R. Butt Prashant Pandey Karan Gupta

The exponentially growing data demands of modern enterprise and scientific applications pose critical challenges in sustaining the applications at scale. The MapReduce [1] programming model has served as the key enabler for executing resource-intensive applications over huge datasets. However, its configuration design-space has not been studied in detail. This is a complex problem as a typical...

Journal: PVLDB 2012
Iman Elghandour Ashraf Aboulnaga

Analyzing large scale data has emerged as an important activity for many organizations in the past few years. This large scale data analysis is facilitated by the MapReduce programming and execution model and its implementations, most notably Hadoop. Users of MapReduce often have analysis tasks that are too complex to express as individual MapReduce jobs. Instead, they use high-level query lang...

2011
Abhishek Verma Ludmila Cherkasova Roy H. Campbell

(LADIS'2011), held in conjunction with VLDB'2011, Seattle, Washington, Sept. 2-3, 2011. SLO-Driven Right-Sizing and Resource Provisioning of MapReduce Jobs. Abhishek Verma, Ludmila Cherkasova, Roy H. Campbell. HP Laboratories, HPL-2011-126. Keywords: MapReduce; Hadoop; performance models; completion time prediction; resource allocation. There is an increasing number of MapReduce applications, e.g., persona...
