Search results for: hadoop

Number of results: 2553

Journal: PVLDB 2014
Avrilia Floratou Umar Farooq Minhas Fatma Özcan

SQL query processing for analytics over Hadoop data has recently gained significant traction. Among many systems providing some SQL support over Hadoop, Hive is the first native Hadoop system that uses an underlying framework such as MapReduce or Tez to process SQL-like statements. Impala, on the other hand, represents the new emerging class of SQL-on-Hadoop systems that exploit a shared-nothin...

Journal: PVLDB 2011
Avrilia Floratou Jignesh M. Patel Eugene J. Shekita Sandeep Tata

Users of MapReduce often run into performance problems when they scale up their workloads. Many of the problems they encounter can be overcome by applying techniques learned from over three decades of research on parallel DBMSs. However, translating these techniques to a MapReduce implementation such as Hadoop presents unique challenges that can lead to new design choices. This paper describes ...

2016
Petra Zimmer Frank Reussner

Gaining insight into a company’s mass of data has been a common goal in recent years. But information is growing exponentially, and companies yearn for a data management system that can work with heterogeneous data from different sources. A possible answer is the Hadoop Data Platform. With its diverse components, it makes several ways of data management as a foundation for the analysis....

2009
Jimmy J. Lin Tamer Elsayed Lidan Wang Donald Metzler

This paper describes Ivory, an attempt to build a distributed retrieval system around the open-source Hadoop implementation of MapReduce. We focus on three noteworthy aspects of our work: a retrieval architecture built directly on the Hadoop Distributed File System (HDFS), a scalable MapReduce algorithm for inverted indexing, and webpage classification to enhance retrieval effectiveness.

2009
Vinayak Borkar Michael Carey

We demonstrate Hyrax, a new runtime platform for data-parallel computation under development at UC Irvine under the ASTERIX project. We show the versatility of Hyrax by using it to run XQuery queries from ASTERIX, Hadoop MapReduce jobs using a Hadoop emulation layer, and SQL queries originating from Hive.

Sentiment analysis is the process of analyzing a person’s perception or belief about a particular subject matter. However, finding the correct opinion or interest in multifaceted sentiment data is a tedious task. This paper proposes a method that improves sentiment accuracy by using a categorized dictionary for sentiment classification and analysis. A categorized dictiona...

2013
Jiong Xie FanJun Meng Hailong Wang HongFang Pan JinHong Cheng Xiao Qin

In this paper, we incorporate a prefetching mechanism into the MapReduce model while retaining compatibility with native Hadoop. Given a data-intensive application running on a Hadoop cluster, our approach estimates the execution time of each task and adaptively preloads an amount of data into memory before a new task is assigned to the computing node.

2012
Jin Kyu Kim

The combination of Hadoop and HDFS is becoming a de facto standard for handling big data. HDFS is a distributed file system designed for big data. In HDFS, a file consists of multiple large blocks. The central management of HDFS tries to scatter these blocks across different nodes to maximize I/O throughput. Hadoop is a framework that supports data-intensive parallel a...
