The Precision Issue

Rescaling Scheme (Encoding)

Example

Without scaling

Initial interval → $[0.0, 1.0)$

After encoding A1 → $[0.0, 0.8)$

After encoding A3 → $[0.656, 0.8)$

After encoding A2 → $[0.7712, 0.77408)$

After encoding A2 → $[0.773504, 0.7735616)$

With rescaling:

The ratio between final intervals is: \(\frac{0.2359296}{0.0000576} = 4096 = 2^{12}\)

i.e., the final interval has been enlarged by $2^{\text{number of rescaling applications}}.$ The bits that we have sent during the rescaling process represent the tag itself.


Rescaling Scheme (Decoding)

Note that:

Example 3:

To decode, follow these steps:

  1. Initialize all cumulative distribution function (CDF) values:
    • $F_X(0) = 0.0$
    • $F_X(1) = 0.8$
    • $F_X(2) = 0.82$
    • $F_X(3) = 1.0$
  2. Read the first $n$ bits of the compressed file to initialize the tag.
  3. Initialize the lower and upper limits to 0 and 1, respectively:
    • $\text{lower}^{(0)} = 0$
    • $\text{upper}^{(0)} = 1$
  4. Repeat the following steps until decoding all required symbols:
    • Decode a symbol using: \(F_x(A^{(k)}-1) \leq \frac{\text{tag} - \text{lower}^{(k-1)}}{\text{upper}^{(k-1)} - \text{lower}^{(k-1)}} < F_x(A^{(k)})\)

    • Find the value of $A(k)$ that satisfies the above equation.
    • Adjust the limits using: \(\text{lower}^{(k)} = \text{lower}^{(k-1)} + [\text{upper}^{(k-1)} - \text{lower}^{(k-1)}] \times F_X(A^{(k-1)})\) \(\text{upper}^{(k)} = \text{lower}^{(k-1)} + [\text{upper}^{(k-1)} - \text{lower}^{(k-1)}] \times F_X(A^{(k)})\)
    • If possible, apply the rescaling procedures and adjust the limits and the tag.

So far, we addressed the cases when the interval is entirely confined to:

For the last case, i.e., when the interval is containing the midpoint of the unit interval and the interval is contained in the interval $[1/4, 3/4)$:

Applying the rescaling scheme:

Similar to the floating-point algorithm, but:

Instead of updating the intervals as:

\[\text{lower}^{(k)} = \text{lower}^{(k-1)} + [\text{upper}^{(k-1)} - \text{lower}^{(k-1)}] \times F_X(A^{(k-1)})\] \[\text{upper}^{(k)} = \text{lower}^{(k-1)} + [\text{upper}^{(k-1)} - \text{lower}^{(k-1)}] \times F_X(A^{(k)})\]

Intervals are updated as follows:

\[\text{new_} L = \text{current_} L + \left(\frac{\text{current_} U - \text{current_} L + 1 \times \text{cumulative_count}(\text{symbol} - 1)}{\text{Total_count}}\right)\] \[\text{new_} U = \text{current_} L + \left(\frac{\text{current_} U - \text{current_} L + 1 \times \text{cumulative_count}(\text{symbol})}{\text{Total_count}} - 1\right)\]

The decoded symbol is k, where k is the smallest value that satisfy the following condition:

$ while \left( \left\lfloor \frac{(tag - currentL + 1) \times TotalCount - 1}{currentU - currentL + 1} \right\rfloor \geq cumulativeCount(k) \right) $

E1 during decoding:

E2 during decoding:

E3 during decoding:

Integer Implementation (Encoding)

Integer Implementation (Decoding)

Arithmetic Encoding vs Huffman Encoding