Search results for: mapreduce
Number of results: 3018
Similarity joins represent a useful operator for data mining, data analysis and data exploration applications. With the exponential growth of data to be analyzed, distributed approaches like MapReduce are required. So far, the state-of-the-art similarity join approaches based on MapReduce mainly focused on the processing of low-dimensional vector data. In this paper, we revisit and investigate ...
Providing video files with varying bitrates is a key requirement for Video On Demand services that implement adaptive streaming. This is done by transcoding to produce multi-bitrate video. The multi-bitrate...
A popular programming paradigm in the cloud, MapReduce is extensively considered and used for “big data” analysis. Unfortunately, a great many “big data” applications require capabilities beyond those originally intended by MapReduce, often burdening developers to write unnatural non-obvious MapReduce programs so as to twist the underlying system to meet the requirements. In this paper, we focu...
MapReduce is one of the most popular parallel data processing systems, and it has been widely used in many fields. As one of the most important techniques in MapReduce, task scheduling strategy is directly related to the system performance. However, in multi-user shared MapReduce environments, the existing task scheduling algorithms cannot provide high system throughput when processing batch jo...
K-means is a popular clustering method used in the data mining area. To work with large datasets, researchers proposed PKMeans, a parallel k-means on MapReduce [3]. However, the existing k-means parallelization methods, including PKMeans, have many limitations. It cannot finish all its iterations in one MapReduce job, so it has to repeat cascading MapReduce jobs in a loop until convergence. On...
The exponentially growing data demands of modern enterprise and scientific applications pose critical challenges in sustaining the applications at scale. The MapReduce [1] programming model has served as the key enabler for executing resource-intensive applications over huge datasets. However, its configuration design-space has not been studied in detail. This is a complex problem as a typical...
Analyzing large scale data has emerged as an important activity for many organizations in the past few years. This large scale data analysis is facilitated by the MapReduce programming and execution model and its implementations, most notably Hadoop. Users of MapReduce often have analysis tasks that are too complex to express as individual MapReduce jobs. Instead, they use high-level query lang...
SLO-Driven Right-Sizing and Resource Provisioning of MapReduce Jobs. Abhishek Verma, Ludmila Cherkasova, Roy H. Campbell. HP Laboratories, HPL-2011-126. (LADIS'2011), held in conjunction with VLDB'2011, Seattle, Washington, Sept. 2-3, 2011. Keywords: MapReduce; Hadoop; performance models; completion time prediction; resource allocation. There is an increasing number of MapReduce applications, e.g., persona...