Enhanced Extraction from Huffman Encoded Files
Authors
Abstract
Given a file T and the Huffman encoding of its elements, we suggest using a pruning technique for Wavelet trees that enables direct access to the i-th element of T by reordering the bits of the compressed file and using some additional space. Compared with a traditional Wavelet tree for Huffman codes, our reordering of the bits usually requires less additional storage, because it reduces the need for auxiliary rank structures, while also improving the processing time for extracting the i-th element of T.
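For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of a traditional Huffman-shaped Wavelet tree, not the pruned variant proposed in the paper, whose details the abstract does not give. The names `huffman_codes`, `WaveletNode`, and `access` are hypothetical, and a plain prefix-sum array stands in for a succinct rank structure.

```python
import heapq
from itertools import count


def huffman_codes(text):
    """Return {symbol: bitstring} for a Huffman code built from symbol counts."""
    freq = {}
    for s in text:
        freq[s] = freq.get(s, 0) + 1
    if len(freq) == 1:
        # Degenerate alphabet: give the only symbol a one-bit codeword.
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(f, next(tie), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)
        f1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (f0 + f1, next(tie), merged))
    return heap[0][2]


class WaveletNode:
    """One node of a Huffman-shaped Wavelet tree over a sequence of symbols."""

    def __init__(self, seq, codes, depth=0):
        self.symbol = None
        if len(codes[seq[0]]) == depth:
            # The codeword is exhausted, so this node is a leaf.
            self.symbol = seq[0]
            return
        # Store the depth-th codeword bit of every element reaching this node.
        self.bits = [int(codes[s][depth]) for s in seq]
        # ones[k] = number of 1 bits in bits[:k] (naive rank support).
        self.ones = [0]
        for b in self.bits:
            self.ones.append(self.ones[-1] + b)
        parts = (
            [s for s in seq if codes[s][depth] == "0"],
            [s for s in seq if codes[s][depth] == "1"],
        )
        self.child = [WaveletNode(p, codes, depth + 1) if p else None for p in parts]

    def access(self, i):
        """Return the i-th element (0-based) of the original sequence."""
        node = self
        while node.symbol is None:
            b = node.bits[i]
            # Map i to its position in the child: rank of bit b before i.
            i = node.ones[i] if b else i - node.ones[i]
            node = node.child[b]
        return node.symbol


text = "abracadabra"
root = WaveletNode(list(text), huffman_codes(text))
print("".join(root.access(i) for i in range(len(text))))
```

Each `access` call reads one bit and performs one rank query per level, so its cost is proportional to the codeword length of the extracted symbol. The rank structures needed at every node are the auxiliary overhead that the abstract says its reordering reduces.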
Similar resources
Parallel Huffman Decoding with Applications to JPEG Files
A simple parallel algorithm for decoding a Huffman encoded file is presented, exploiting the tendency of Huffman codes to resynchronize quickly, i.e. recovering after possible decoding errors, in most cases. The average number of bits that have to be processed until synchronization is analyzed and shows good agreement with empirical data. As Huffman coding is also a part of the JPEG image compr...
Better Huffman Coding via Genetic Algorithm
We present an approach to compress arbitrary files using a Huffman-like prefix-free code generated through the use of a genetic algorithm, thus requiring no prior knowledge of substring frequencies in the original file. This approach also enables multiple-character substrings to be encoded. We demonstrate, through testing on various different formats of real-world data, that in some domains, th...
Speeding Up String Pattern Matching by Text Compression: The Dawn of a New Era
This paper describes our recent studies on string pattern matching in compressed texts mainly from practical viewpoints. The aim is to speed up the string pattern matching task, in comparison with an ordinary search over the original texts. We have successfully developed (1) an AC type algorithm for searching in Huffman encoded files, and (2) a KMP type algorithm and (3) a BM type algorithm for...
A Lossless re-Encoding of MPEG-2 Coded file by Integrating Four Motion Vectors
Re-encoding of once compressed files is one of the difficult challenges in measuring the efficiency of coding methods. Variable length coding with a variable source delimiting scheme is a promising method for improving re-encoding efficiency. Analyses of coded files with fixed length delimiting and with variable length delimiting are reviewed. Motion vector codes of MPEG-2 encoded files are mod...
Identification and Recovery of JPEG Files with Missing Fragments
Recovery of fragmented files proves to be a challenging task for encoded files like JPEG. In this paper, we consider techniques for addressing two issues related to fragmented JPEG file recovery. The first issue concerns more efficient identification of the next fragment of a file undergoing recovery. The second issue concerns the recovery of file fragments which cannot be linked to an existing image h...