Search results for: hadoop

Number of results: 2553

2014
Wei-Chun Chung Chien-Chih Chen Jan-Ming Ho Chung-Yen Lin Wen-Lian Hsu Yu-Chun Wang D. T. Lee Feipei Lai Chih-Wei Huang Yu-Jung Chang

BACKGROUND Explosive growth of next-generation sequencing data has resulted in ultra-large-scale data sets and ensuing computational problems. Cloud computing provides an on-demand and scalable environment for large-scale data analysis. Using a MapReduce framework, data and workload can be distributed via a network to computers in the cloud to substantially reduce computational latency. Hadoop/...

2014
K. Ashwin Kumar Jonathan Gluck Amol Deshpande Jimmy J. Lin

The underlying assumption behind Hadoop and, more generally, the need for distributed processing is that the data to be analyzed cannot be held in memory on a single machine. Today, this assumption needs to be re-evaluated. Although petabyte-scale datastores are increasingly common, it is unclear whether “typical” analytics tasks require more than a single high-end server. Additionally, we are ...

2016
Vivek Badhe Shweta Verma

Computing technology has changed the way we work, study, and live. Distributed data processing technology is one of the popular topics in the IT field. It provides a simple and centralized computing platform by reducing the cost of the hardware. The characteristics of distributed data processing technology have changed the whole industry. Hadoop, as the open so...

2015
Jiaqi Tan Xinghao Pan Soila Kavulya Rajeev Gandhi Priya Narasimhan

Mochi, a new visual, log-analysis-based debugging tool, correlates Hadoop’s behavior in space, time, and volume, and extracts a causal, unified control- and data-flow model of Hadoop across the nodes of a cluster. Mochi’s analysis produces visualizations of Hadoop’s behavior with which users can reason about and debug performance issues. We provide examples of Mochi’s value in revealing a Hadoop j...

2014
Amogh Pramod Kulkarni Mahesh Khandewal

Big Data, the analysis of large quantities of data to gain new insight, has become a ubiquitous phrase in recent years. Day by day, data is growing at a staggering rate. One of the efficient technologies that deals with Big Data is Hadoop, which will be discussed in this paper. For processing large-data-volume jobs, Hadoop uses the MapReduce programming model. Hadoop makes use of different sch...
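The MapReduce programming model mentioned in the abstract above can be sketched as a toy in-memory simulation, in plain Python rather than Hadoop's actual Java API; the function names (`map_phase`, `shuffle`, `reduce_phase`) are illustrative, not Hadoop identifiers:

```python
from collections import defaultdict

# Toy word count in the MapReduce style: map emits (key, value) pairs,
# the shuffle phase groups values by key, and reduce aggregates each group.
# In real Hadoop, map tasks run on many nodes and the framework shuffles
# pairs over the network; here everything runs in one process.

def map_phase(line):
    for word in line.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle(pairs))

print(word_count(["hadoop uses mapreduce", "mapreduce scales"]))
# → {'hadoop': 1, 'uses': 1, 'mapreduce': 2, 'scales': 1}
```

The key property the model relies on is that reduce only ever sees all values for one key together, which is what lets Hadoop parallelize the map and reduce phases independently.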

2010

Intel is a major contributor to open source initiatives, such as Linux*, Apache*, and Xen*, and has also devoted resources to Hadoop analysis, testing, and performance characterizations, both internally and with fellow travelers such as HP and Cloudera. Through these technical efforts, Intel has observed many practical trade-offs in hardware, software, and system settings that have real-world i...

2011
Jinquan Dai Jie Huang Shengsheng Huang Bo Huang Yan Liu

Although Big Data Cloud systems (e.g., MapReduce, Hadoop, and Dryad) make it easy to develop and run highly scalable applications, efficient provisioning and fine-tuning of these massively distributed systems remain a major challenge. In this paper, we describe a general approach to help address this challenge, based on distributed instrumentations and dataflow-driven performance analysis. Based on this...

2010
T Auntin Jose

Hadoop, an open source Java framework, deals with big data. It has HDFS (Hadoop Distributed File System) and MapReduce. HDFS is designed to handle large files across clusters and suffers a performance penalty when dealing with a large number of small files. These large numbers of small files place a heavy burden on the NameNode of HDFS and increase execution time for MapReduce. Secondly, ...
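The small-file burden described above comes down to NameNode memory: HDFS keeps an in-memory metadata object per file and per block, so many tiny files cost far more metadata than the same bytes packed into one file. A back-of-envelope sketch (the ~150-bytes-per-object figure is the commonly cited HDFS rule of thumb, and `namenode_objects` is a hypothetical helper, not a Hadoop API):

```python
BYTES_PER_OBJECT = 150  # rough rule of thumb for NameNode memory per file/block object

def namenode_objects(num_files, file_size, block_size=128 * 1024 * 1024):
    # Each file costs one file object plus one object per HDFS block it occupies.
    blocks_per_file = max(1, -(-file_size // block_size))  # ceiling division
    return num_files * (1 + blocks_per_file)

# ~1 GB stored as 10,000 small files of 100 KB each...
small = namenode_objects(10_000, 100 * 1024)
# ...versus the same bytes packed into a single large file.
packed = namenode_objects(1, 10_000 * 100 * 1024)

print(small * BYTES_PER_OBJECT, packed * BYTES_PER_OBJECT)
# → 3000000 1350
```

The roughly 2000x difference in estimated NameNode memory is why techniques such as Hadoop archives and sequence files pack small files into larger containers.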

Journal: OJBD 2015
Ujjal Marjit Kumar Sharma Puspendu Mandal

Hadoop is an open source framework for processing large amounts of data in a distributed computing environment. It plays an important role in processing and analyzing Big Data. This framework is used for storing data on large clusters of commodity hardware. Data input and output to and from Hadoop is an indispensable action for any data processing job. At present, many tools have evolved...

2017
Sumukhi Chandrashekar Lihao Xu

The Hadoop platform is widely used for managing, analyzing, and transforming large data sets in various systems. Two basic components of Hadoop are: 1) a distributed file system (HDFS), and 2) a computation framework (MapReduce). HDFS stores data on simple commodity machines that run DataNode processes (DataNodes). A commodity machine running the NameNode process (NameNode) maintains metadata informat...
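The split of responsibilities described above, where the NameNode holds only metadata while DataNodes hold the bytes, can be illustrated with a toy in-memory model (class and node names are hypothetical; real HDFS placement is rack-aware, not round-robin):

```python
import itertools

class ToyNameNode:
    """Keeps metadata only: which blocks make up a file, and where each block lives."""

    def __init__(self, datanodes, replication=3):
        self.datanodes = datanodes
        self.replication = replication
        self.file_blocks = {}      # filename -> [block_id, ...]
        self.block_locations = {}  # block_id -> [datanode, ...]
        self._ids = itertools.count()

    def add_block(self, filename):
        block_id = next(self._ids)
        self.file_blocks.setdefault(filename, []).append(block_id)
        # Simple round-robin placement stands in for HDFS's rack-aware policy.
        start = block_id % len(self.datanodes)
        replicas = [self.datanodes[(start + i) % len(self.datanodes)]
                    for i in range(self.replication)]
        self.block_locations[block_id] = replicas
        return block_id

nn = ToyNameNode(datanodes=["dn1", "dn2", "dn3", "dn4"])
nn.add_block("logs.txt")
nn.add_block("logs.txt")
print(nn.file_blocks["logs.txt"], nn.block_locations[0])
# → [0, 1] ['dn1', 'dn2', 'dn3']
```

Because the NameNode holds only these small maps, clients ask it where blocks live and then stream data directly from the DataNodes, keeping the metadata server off the data path.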
