Redblock: a tool for online deduplication on large datasets

Similar resources

Generating Realistic Datasets for Deduplication Analysis

Deduplication is a popular component of modern storage systems, with a wide variety of approaches. Unlike traditional storage systems, deduplication performance depends on data content as well as access patterns and meta-data characteristics. Most datasets that have been used to evaluate deduplication systems are either unrepresentative, or unavailable due to privacy issues, preventing easy com...

ECplot: an online tool for making standardized plots from large datasets for bioinformatics publications

MOTIVATION AND RESULTS We have implemented ECplot, an online tool for plotting charts from large datasets. This tool supports a variety of chart types commonly used in bioinformatics publications. In our benchmarking, it was able to create a Box-and-Whisker plot with about 67 000 data points and 8 MB total file size within several seconds. The design of the tool makes common formatting operatio...

Online Deduplication for Distributed Databases

The rate of data growth outpaces the decline of hardware costs, and there has been an ever-increasing demand for reducing the storage and network overhead of online database management systems (DBMSs). The most widely used approach for data reduction in DBMSs is block-level compression. Although this method is simple and effective, it fails to address redundancy across blocks and therefore leave...

Online Projective Nonnegative Matrix Factorization for Large Datasets

Projective Nonnegative Matrix Factorization (PNMF) is one of the recent methods for computing low-rank approximations to data matrices. It is advantageous in many practical application domains such as clustering, graph partitioning, or sparse feature extraction. However, up to now a scalable implementation of PNMF for large-scale machine learning problems has been lacking. Here we provide an on...

Online survey software as a data collection tool for medical education: A case study on lesson plan assessment

Background: There are no general strategies or tools to evaluate daily lesson plans; however, assessments conducted using traditional methods usually include course plans. This study aimed to evaluate the strengths and weaknesses of online survey software in collecting data on education in medical fields and the application of such software to evaluate students' views and modification of lesso...

Journal

Journal title: Revista Brasileira de Computação Aplicada

Year: 2017

ISSN: 2176-6649

DOI: 10.5335/rbca.v9i2.7143