Search results for: hadoop

Number of results: 2553

2014

Nowadays, a large volume of data is produced by various sources such as social media networks, sensor devices, and other information-serving devices. This large collection of unstructured and semi-structured data is called big data. Conventional databases and data warehouses can’t process this data, so new data processing tools are needed. Hadoop addresses this need. Hadoop is an open sourc...

2014
Avrilia Floratou Fatma Özcan Berni Schiefer

Benchmarks are important tools to evaluate systems, as long as their results are transparent, reproducible and they are conducted with due diligence. Today, many SQL-on-Hadoop vendors use the data generators and the queries of existing TPC benchmarks, but fail to adhere to the rules, producing results that are not transparent. As the SQL-on-Hadoop movement continues to gain more traction, it is...

Journal: IJCAC 2011
Zhiwei Xu Bo Yan Yongqiang Zou

As a main subfield of cloud computing applications, internet services require large-scale data computing. Their workloads can be divided into two classes: customer-facing query-processing interactive tasks that serve hundreds of millions of users within a short response time and backend data analysis batch tasks that involve petabytes of data. Hadoop, an open source software suite, is used by m...

Journal: International Journal of Applied Information Systems 2013

2017
Naresh Kumar

Big data is linked with the entirety of composite data sets. In a big data environment, data is unstructured and may contain a number of duplicate copies of the same data. Hadoop is used to manage such complex unstructured data. Hadoop is an open source platform specially designed for big data environments. Hadoop can handle unstructured data very efficiently as compared to t...

2013
Bing Dong

Research and optimization of the Bloom filter algorithm in Hadoop. An increasing number of enterprises need to transfer data from a traditional database to a cloud-computing system. Big data in Teradata (a data warehouse) often needs to be transferred to Hadoop, a distributed system, for further computing and analysis. However, if data stored in Teradata is not synced with Hadoop, e...

Journal: Applied Mathematics and Computer Science 2011
Horacio González-Vélez Maryam Kontagora

This work analyses the performance of Hadoop, an implementation of the MapReduce programming model for distributed parallel computing, executing on a virtualisation environment comprising 1+16 nodes running the VMware Workstation software. A set of experiments using the standard Hadoop benchmarks has been designed in order to determine whether or not significant reductions in the execution ti...

2013
Seung-Tae Hong Young-Sung Shin Dong Hoon Choi Heeseung Jo Jae-Woo Chang

With the evolution of IT technologies, large-scale graph data have recently attracted growing interest. As a result, there has been considerable research on large-scale graph analysis on Hadoop. Graph analysis based on Hadoop provides parallel programming models with data partitioning and contains iterative phases of MapReduce jobs. Therefore, the effectiveness of data partitioning depends on h...

2016
Ibrahim Vazirabad

The Data Warehousing field has been fundamentally changed by the Big Data revolution. Computational and storage methodologies such as Hadoop provide an alternate way of managing and analyzing the torrent of data that is flooding in from all manner of instrumentation. This review article will elucidate the relationship between traditional enterprise data warehousing and one of the primary analyt...

2011
Ariel Rabkin Randy Katz

Hadoop is among today’s most widely deployed “big data” systems. Cloudera is a company offering paid Hadoop services and support. This poster abstract describes lessons from examining a sample of 293 support tickets, from February through July of 2011. We manually labelled the tickets in our sample with the established root cause and the specific system component being worked on. Tickets cover ...
