Guidelines for Using Compare-by-hash
Authors
Abstract
Recently, a new technique called compare-by-hash has become popular. Compare-by-hash is a method of content-based addressing in which data is identified only by the cryptographic hash of its contents. Hash collisions are ignored, with the justification that they occur less often than many kinds of hardware errors. Compare-by-hash is a powerful, versatile tool in the software architect’s bag of tricks, but it is also poorly understood and frequently misused. The consequences of misuse range from significant performance degradation to permanent, unrecoverable data corruption or loss. The proper use of compare-by-hash is a subject of debate[10, 29], but recent results in the field of cryptographic hash function analysis, including the breaking of MD5[28] and SHA-0[12] and the weakening of SHA-1[3], have clarified when compare-by-hash is appropriate. In short, compare-by-hash is appropriate when it provides some benefit (performance, code simplicity, etc.), when the system can survive intentionally generated hash collisions, and when hashes can be thrown away and regenerated at any time. In this paper, we propose and explain some simple guidelines to help software architects decide when to use compare-by-hash.
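As a rough illustration of the idea described in the abstract, the sketch below stores blocks under the cryptographic hash of their contents and treats two blocks with equal hashes as identical. It is not code from the paper: the `ContentStore` class, its `verify` flag, and the `rehash` method are hypothetical names chosen for this example, and SHA-256 and SHA-512 are assumed only because Python's `hashlib` always provides them. The `rehash` method reflects the guideline that hashes should be disposable and regenerable at any time, and the optional `verify` flag shows one way a system might notice a collision instead of silently ignoring it.

```python
import hashlib


class ContentStore:
    """Toy content-addressed block store (hypothetical, for illustration only)."""

    def __init__(self, hash_name="sha256", verify=False):
        self.hash_name = hash_name  # assumed default; any hashlib algorithm name works
        self.verify = verify        # if True, compare bytes when a key already exists
        self.blocks = {}            # hex digest -> block contents

    def _digest(self, data: bytes) -> str:
        return hashlib.new(self.hash_name, data).hexdigest()

    def put(self, data: bytes) -> str:
        """Store a block and return its key; equal hashes are treated as equal data."""
        key = self._digest(data)
        existing = self.blocks.get(key)
        if existing is None:
            self.blocks[key] = data
        elif self.verify and existing != data:
            # Pure compare-by-hash skips this check and would keep the old block.
            raise ValueError("hash collision detected for key " + key)
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]

    def rehash(self, hash_name: str) -> dict:
        """Discard all keys and regenerate them with another hash function.

        Returns a mapping from old keys to new keys.
        """
        self.hash_name = hash_name
        mapping = {}
        new_blocks = {}
        for old_key, data in self.blocks.items():
            new_key = self._digest(data)
            new_blocks[new_key] = data
            mapping[old_key] = new_key
        self.blocks = new_blocks
        return mapping


store = ContentStore()
first = store.put(b"same block")
second = store.put(b"same block")    # deduplicated: same key, stored once
migrated = store.rehash("sha512")    # keys regenerated from the data itself
```

Because the keys are derived entirely from the stored data, a design like this can switch hash functions if the current one is weakened; a system that keeps only the hashes and not the data could not do so.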
Similar resources
An Improved Hash Function Based on the Tillich-Zémor Hash Function
Using the idea behind the Tillich-Zémor hash function, we propose a new hash function. Our hash function is parallelizable, and its collision resistance is implied by a hardness assumption on a mathematical problem. It is also secure against the known attacks. It is the most secure variant of the Tillich-Zémor hash function to date.
Compressed Image Hashing using Minimum Magnitude CSLBP
Image hashing allows compression, enhancement, or other signal processing operations on digital images, which are usually acceptable manipulations, whereas cryptographic hash functions are very sensitive to even single-bit changes in an image. Image hashing is a sum of important quality features in quantized form. In this paper, we propose a novel image hashing algorithm for authentication which i...
Plagiarism checker for Persian (PCP) texts using hash-based tree representative fingerprinting
With due respect to authors’ rights, plagiarism detection is one of the critical problems in the field of text mining and one that interests many researchers. The issue is considered serious in higher academic institutions. There exist language-free tools, but they do not yield reliable results because they ignore the special features of each language. Considering the paucit...
An Analysis of Compare-by-hash
Recent research has produced a new and perhaps dangerous technique for uniquely identifying blocks that I will call compare-by-hash. Using this technique, we decide whether two blocks are identical to each other by comparing their hash values, using a collision-resistant hash such as SHA-1[5]. If the hash values match, we assume the blocks are identical without further ado. Users of compare-by-...
Hash challenges: Stretching the limits of compare-by-hash in distributed data deduplication
We propose a technique for reducing communication overheads when sending data across a network. Our technique, called hash challenges, leverages existing deduplication solutions based on compare-by-hash and can identify redundant data chunks while exchanging substantially less metadata. Hash challenges can be used directly on any existing compare-by-hash protocol, with no relevant addit...
Publication date: 2004