Comparing Hadoop and Fat-Btree Based Access Method for Small File I/O Applications

Authors

Min Luo

Haruo Yokota

Abstract

Hadoop has been widely used in various clusters to build scalable, high-performance distributed file systems. However, the Hadoop Distributed File System (HDFS) is designed for managing large files. In small-file applications, metadata requests flood the network and consume most of the memory in the NameNode, which sharply degrades its performance. As a result, many web applications do not benefit from clusters with a centralized metadata node, such as Hadoop. In this paper, we compare Hadoop with our Fat-Btree based data access method, which eliminates the central node from the cluster, and we show how their performance differs across file I/O applications.

Similar resources

Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

The Hadoop MapReduce framework is an important distributed processing model for large-scale, data-intensive applications. The rack-aware data placement strategy that current Hadoop and the Hadoop distributed file system use for MapReduce assumes a homogeneous cluster, in which each node has the same computing capacity and is assigned the same workload. Default Hadoop d...

Hmfs: Efficient Support of Small Files Processing over HDFS

Storing and accessing massive numbers of small files is one of the challenges in the design of distributed file systems. The Hadoop distributed file system (HDFS) is primarily designed for reliable storage and fast access of very large files, and it suffers a performance penalty as the number of small files increases. A middleware called Hmfs is proposed in this paper to improve the efficiency of storing an...

Enhancing throughput of the Hadoop Distributed File System for interaction-intensive tasks

The Hadoop Distributed File System (HDFS) is designed to run on commodity hardware and can be used as a stand-alone general purpose distributed file system (Hdfs user guide, 2008). It provides the ability to access bulk data with high I/O throughput. As a result, this system is suitable for applications that have large I/O data sets. However, the performance of HDFS decreases dramatically when ha...

GFS-Btree: A Scalable Peer-to-Peer Overlay Network for Lookup Service

A fundamental problem that confronts peer-to-peer applications is to efficiently locate the node that stores a particular data item. We propose a new scalable peer-to-peer overlay network, GFS-Btree, which resembles a Btree network. By adding linkages to the Btree, GFS-Btree can relieve congestion at the root and the other branch nodes. The events of a node joining and leaving th...

An Efficient Approach to Optimize the Performance of Massive Small Files in Hadoop MapReduce Framework

Hadoop, the most popular open source distributed computing framework, was designed by Doug Cutting and his team. It uses thousands of nodes to process and analyze huge amounts of data, known as Big Data. The major core components of Hadoop are HDFS (Hadoop Distributed File System) and MapReduce. This framework is the most popular and powerful for storing, managing and processing Big Data appl...

Publication date: 2010