Redblock: a tool for online deduplication on large datasets
Similar resources
Generating Realistic Datasets for Deduplication Analysis
Deduplication is a popular component of modern storage systems, with a wide variety of approaches. Unlike traditional storage systems, deduplication performance depends on data content as well as access patterns and meta-data characteristics. Most datasets that have been used to evaluate deduplication systems are either unrepresentative, or unavailable due to privacy issues, preventing easy com...
ECplot: an online tool for making standardized plots from large datasets for bioinformatics publications
MOTIVATION AND RESULTS: We have implemented ECplot, an online tool for plotting charts from large datasets. This tool supports a variety of chart types commonly used in bioinformatics publications. In our benchmarking, it was able to create a Box-and-Whisker plot with about 67 000 data points and 8 MB total file size within several seconds. The design of the tool makes common formatting operatio...
Online Deduplication for Distributed Databases
The rate of data growth outpaces the decline of hardware costs, and there has been an ever-increasing demand for reducing the storage and network overhead of online database management systems (DBMSs). The most widely used approach for data reduction in DBMSs is block-level compression. Although this method is simple and effective, it fails to address redundancy across blocks and therefore leave...
Online Projective Nonnegative Matrix Factorization for Large Datasets
Projective Nonnegative Matrix Factorization (PNMF) is one of the recent methods for computing low-rank approximations to data matrices. It is advantageous in many practical application domains such as clustering, graph partitioning, or sparse feature extraction. However, up to now a scalable implementation of PNMF for large-scale machine learning problems has been lacking. Here we provide an on...
Online survey software as a data collection tool for medical education: A case study on lesson plan assessment
Background: There are no general strategies or tools to evaluate daily lesson plans; however, assessments conducted using traditional methods usually include course plans. This study aimed to evaluate the strengths and weaknesses of online survey software in collecting data on education in medical fields and the application of such software to evaluate students' views and modification of lesso...
Journal
Journal title: Revista Brasileira de Computação Aplicada
Year: 2017
ISSN: 2176-6649
DOI: 10.5335/rbca.v9i2.7143