There is a class of entropy-coding methods which do not substitute symbols by code words (such as Huffman coding), but operate on intervals or ranges and thus allow a better approximation of the data entropy. This class includes three prominent members: conventional arithmetic coding, range coding, and coding based on asymmetric numeral systems. To determine the correct symbol in the decoder, each of these methods requires the comparison of a state variable with subinterval boundaries.
In adaptive operation, considering varying symbol statistics, an array of interval boundaries must additionally be kept up to date. The larger the symbol alphabet, the more time-consuming both the search for the correct subinterval and the updating of interval borders become. These entropy coding methods play an important role in all data transmission and storage applications, and optimising speed can be crucial.
Based on detailed pseudo-code, different known and proposed approaches are discussed to speed up the symbol search in the decoder and the adaptation of the array of interval borders, both depending on the chosen alphabet size. It is shown that reducing the big O complexity in practical implementations does not necessarily lead to an acceleration, especially if the alphabet size is too small. For example, the symbol determination at the decoder shows an expected low cpu-clock ratio (O(logn) algorithm versus O(n) algorithm) of about 0.62 for an alphabet with 256 symbols. However, for an alphabet with only 4 symbols, this ratio is 1.05, that means the algorithm with lower theoretical complexity executes slightly faster here. In adaptive compression mode, the binary indexing (BI) method proves to be superior when considering the overall processing time. Although the symbol search (in the decoder) takes longer than using other algorithms (e.g. cpu-clock ratio BI/O(logn) is 1.57), the faster updating of the array of interval borders more than compensates for this disadvantage (total ratio BI/O(logn) is 0.72). A variant of the binary indexing method is proposed, which is more flexible and has a partially lower complexity than the original approach. Specifically, the rescaling of cumulative counts can be reduced in its complexity from O(4n+[log2(n)−2]·n/2) to O(3n).
mehrTitel | Investigations on algorithm selection for interval-based coding methods |
---|---|
Medien | Multimedia Tools and Applications |
Verfasser | Prof. Dr.-Ing. habil. Tilo Strutz, Nico Schreiber |
Veröffentlichungsdatum | 03.07.2025 |
Projekttitel | SCDKompri / SCF |
Zitation | Strutz, Tilo; Schreiber, Nico (2025): Investigations on algorithm selection for interval-based coding methods. Multimedia Tools and Applications. DOI: 10.1007/s11042-025-20971-3 |