Search results for: hadoop

Number of results: 2553

2014
Y. K. Patil V. S. Nandedkar

Document clustering is one of the important areas in data mining. Hadoop is used by companies such as Yahoo, Google, Facebook, and Twitter to implement real-time applications. Email, social media blogs, movie review comments, and books are used for document clustering. This paper focuses on document clustering using Hadoop. Hadoop is a new technology used for parallel computing ...
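For orientation, the sketch below shows the kind of Hadoop MapReduce job a document-clustering pipeline typically begins with: counting term occurrences across a corpus. It is a generic Java illustration, not the paper's implementation; the class names and paths are made up.

```java
// Hypothetical sketch: a term-frequency job of the kind a Hadoop-based
// document-clustering pipeline often starts with. Not the cited paper's code.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TermFrequency {

  // Emits (term, 1) for every token in every input document.
  public static class TokenMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text term = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer tokens = new StringTokenizer(value.toString().toLowerCase());
      while (tokens.hasMoreTokens()) {
        term.set(tokens.nextToken());
        context.write(term, ONE);
      }
    }
  }

  // Sums the counts per term; the resulting term vectors feed the clustering step.
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "term frequency");
    job.setJarByClass(TermFrequency.class);
    job.setMapperClass(TokenMapper.class);
    job.setCombinerClass(SumReducer.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input documents
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // term counts
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

A real clustering pipeline would usually follow such a job with TF-IDF weighting and a distributed clustering step (for example k-means), but the map/reduce structure stays the same.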

2012
Harcharan Jit Singh V. P. Singh

In data-intensive computing, Hadoop is widely used by organizations. Hadoop's client applications require high availability and scalability of the system. Most of these applications are online, and their data growth rate is unpredictable. The present Hadoop relies on the secondary NameNode for failover, which slows down the performance of the system. The Hadoop system's scalability depends on the ve...

2012
Mohamed H. Almeer

Image processing algorithms related to remote sensing have been tested and utilized on the Hadoop MapReduce parallel platform, using an experimental 112-core high-performance cloud computing system situated in the Environmental Studies Center at the University of Qatar. Although there has been considerable research utilizing the Hadoop platform for image processing rather than for its ...

2014
K. H Naz Anisha Rodrigues

The increasing use of computing resources in our daily lives leads to data generation at an astonishing rate. The computing industry is repeatedly questioned about its ability to accommodate this unpredictable growth rate of data, which has encouraged new developments. Hadoop, which consists of Hadoop MapReduce and the Hadoop Distributed File System (HDFS), is a platform for large-scale data storage and processing. ...
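To make the HDFS side mentioned here concrete, the minimal Java sketch below writes and reads a file through Hadoop's FileSystem API. The NameNode URI and paths are placeholders, and the example is illustrative only, not code from the cited work.

```java
// Hypothetical sketch of basic HDFS access via Hadoop's FileSystem API.
// The cluster URI and paths are placeholders, not taken from the cited paper.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder NameNode address; a real deployment reads this from core-site.xml.
    FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);

    Path file = new Path("/user/demo/hello.txt");

    // Write a small file into HDFS.
    try (FSDataOutputStream out = fs.create(file, true)) {
      out.write("hello from HDFS\n".getBytes(StandardCharsets.UTF_8));
    }

    // Read it back.
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
      System.out.println(in.readLine());
    }

    fs.close();
  }
}
```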

2012
Diana Maria Moise Frédéric DESPREZ Gabriel ANTONIU

Abstractions were developed based on MapReduce, with the goal of providing a simple-to-use interface for expressing database-like queries [64, 6]. Bioinformatics is one of the numerous research domains that employ MapReduce to model their algorithms [69, 58, 56]. As an example, CloudBurst [69] is a MapReduce-based algorithm for mapping next-generation sequence data to the human genome and other referenc...

Journal: Electronic Proceedings in Theoretical Computer Science, 2021

The Hadoop scheduler is a centerpiece of Hadoop, the leading processing framework for data-intensive applications in the cloud. Given the impact of failures on the performance of running applications, testing and verifying the scheduler is critical. Existing approaches such as simulation and analytical modeling are inadequate because they are not able to ascertain a complete verification of the scheduler. This is due to the wide range of constraints and aspects involved in Hadoo...

2015
Hamza Zafar Farrukh Aftab Khan Bryan Carpenter Aamir Shafi Asad Waqar Malik

Many organizations, including academic, research, and commercial institutions, have invested heavily in setting up High Performance Computing (HPC) facilities for running computational science applications. On the other hand, the Apache Hadoop software, after emerging in 2005, has become a popular, reliable, and scalable open-source framework for processing large-scale data (Big Data). Realizing the i...

2011

The overall goal of this project is to gain hands-on experience working on a large, open-ended, research-oriented project using the Hadoop framework. Hadoop is an open-source implementation of MapReduce and the Google File System, and it is currently enjoying wide popularity. Students will modify the task scheduler of Hadoop, conduct several experimental studies, and analyze performance and netwo...

2015

Enter Hadoop, the de facto open-source standard that is increasingly being used by many companies in large data migration projects. Hadoop is an open-source framework that allows for the distributed processing of large data sets. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. As data from different sources flows into Hadoop,...

2010
Min Luo Haruo Yokota

Hadoop has been widely used in various clusters to build scalable and high-performance distributed file systems. However, the Hadoop Distributed File System (HDFS) is designed for managing large files. In the case of small-file applications, metadata requests flood the network and consume most of the memory in the NameNode, which sharply hinders performance. Therefore, many web applications ...
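The small-files problem described here is commonly mitigated by packing many small files into a single container file, so the NameNode tracks metadata for one large file instead of thousands of tiny ones. The sketch below does this with a Hadoop SequenceFile; it is a generic illustration of that mitigation, not the approach proposed in this paper, and the paths are hypothetical.

```java
// Hypothetical sketch: packing many small local files into one HDFS SequenceFile
// (filename -> contents), a common mitigation for the NameNode small-files problem.
// Not the approach of the cited paper; all paths are placeholders.
import java.io.File;
import java.nio.file.Files;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class PackSmallFiles {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path packed = new Path("hdfs://namenode:8020/user/demo/small-files.seq");

    File[] smallFiles = new File("/tmp/small-files").listFiles();
    if (smallFiles == null) {
      return; // nothing to pack
    }

    try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(packed),
        SequenceFile.Writer.keyClass(Text.class),
        SequenceFile.Writer.valueClass(BytesWritable.class))) {

      // Each small file becomes one (name, bytes) record instead of one HDFS file,
      // so the NameNode stores metadata for a single large file only.
      for (File f : smallFiles) {
        byte[] bytes = Files.readAllBytes(f.toPath());
        writer.append(new Text(f.getName()), new BytesWritable(bytes));
      }
    }
  }
}
```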
