Site last updated:  09/04/05
This page last updated:  09/25/04


Introduction

The algorithm that the JPEGLib uses for downscaling to a fourth during decoding is based on the following matrix-vector product, which is derived from the full IDCT of Loeffler, Ligtenberg and Moschytz:
\begin{align}
\frac{1}{4}
\begin{pmatrix}
f(0)+f(1)+f(2)+f(3)\\
f(4)+f(5)+f(6)+f(7)
\end{pmatrix}
&= \frac{1}{4\sqrt{2}} \cdot
\begin{pmatrix}
1 & 1\\
1 & -1
\end{pmatrix} \tag{1}\\
&\quad \cdot
\begin{pmatrix}
2 & 0 & 0 & 0 & 0\\
0 & C_7-C_5 & C_7-C_3 & C_1-C_5 & C_1+C_3
\end{pmatrix}
\cdot
\begin{pmatrix}
\tilde{F}(0)\\
\tilde{F}(7)\\
\tilde{F}(3)\\
\tilde{F}(5)\\
\tilde{F}(1)
\end{pmatrix} \tag{2}
\end{align}

In this matrix-vector product, $\tilde{F}(n)$ is the DCT coefficient at index $n$ and $f(n)$ is the pixel value in the spatial domain at position $n$. The constants $C_k$ that are used are defined as follows:

$$C_k = \cos\frac{k\pi}{16}, \qquad k = 0, \ldots, 7 \tag{3}$$

The structure of the algorithm is easier to see as a flowgraph, as in figure 1.

Figure 1: Flowgraph for the IDCT resizing to a fourth of the original size (image: llm2size)

What is surprising about this structure is that every multiplicative constant scales a DCT coefficient directly, much like the scaling coefficients of the Arai-Agui-Nakajima DCT. The JPEGLib accounts for the Arai-Agui-Nakajima DCT's coefficients by absorbing them into the dequantization coefficients. This is what makes the Arai-Agui-Nakajima DCT so fast in comparison to other schemes. Unfortunately, the JPEGLib does not use this approach for the algorithm that scales to a fourth. If it did, scaling to a fourth would in theory consist only of additions. The author implemented this inside the JPEGLib, and decoding did turn out to be much faster this way, since only additions and shift operations are involved. Unfortunately, it also turned out that there are dependencies on the bitness of the platform being used. Taken over both dimensions and written as real numbers, the scaling constants look like the following:

4.000000, 3.624510, 2.000000, 1.272759, 2.000000, 0.850430, 2.000000, 0.720960,
3.624510, 3.284268, 1.812255, 1.153281, 1.812255, 0.770598, 1.812255, 0.653281,
2.000000, 1.812255, 1.000000, 0.636379, 1.000000, 0.425215, 1.000000, 0.360480,
1.272759, 1.153281, 0.636379, 0.404979, 0.636379, 0.270598, 0.636379, 0.229402,
2.000000, 1.812255, 1.000000, 0.636379, 1.000000, 0.425215, 1.000000, 0.360480,
0.850430, 0.770598, 0.425215, 0.270598, 0.425215, 0.180808, 0.425215, 0.153281,
2.000000, 1.812255, 1.000000, 0.636379, 1.000000, 0.425215, 1.000000, 0.360480,
0.720960, 0.653281, 0.360480, 0.229402, 0.360480, 0.153281, 0.360480, 0.129946
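The printed values are consistent with an outer product of a single 1-D vector of constants with itself, where the 1-D constants are the magnitudes of the entries in equation (2). This is an observation inferred from the numbers above, not a statement taken from the implementation. A minimal sketch that regenerates the table under this assumption, with hypothetical names:

```c
#include <math.h>

#define PI 3.14159265358979323846

/* Fill table[i][j] = s[i] * s[j], where s holds the magnitudes of the
   constants from equation (2). The entries at indices 2, 4 and 6 are 1
   because those coefficients are not used by the algorithm. */
static void build_scale_table(double table[8][8])
{
    const double c1 = cos(1 * PI / 16.0), c3 = cos(3 * PI / 16.0);
    const double c5 = cos(5 * PI / 16.0), c7 = cos(7 * PI / 16.0);
    const double s[8] = {2.0, c1 + c3, 1.0, c3 - c7,
                         1.0, c1 - c5, 1.0, c5 - c7};

    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 8; j++)
            table[i][j] = s[i] * s[j];
}
```

The signs of the constants are not part of the table. In this reading they are carried by the additions and subtractions of the two passes.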
 

With this table, all dequantization table values must be scaled first. Afterwards, two passes are made over the dequantized DCT coefficients, and in theory both consist only of additions. The first pass runs the algorithm over the columns 0, 1, 3, 5 and 7, each time performing the additions shown in the flowgraph. This yields two values for each of these 5 columns and thus two rows with 5 columns of interest. The second pass performs the same additions, but this time over the two rows that hold the result of the first pass.

Implementing this table in fixed-point arithmetic requires careful attention to potential overflow, both when scaling the quantization table and when adding and subtracting. This leads to six different tables in the new implementation. They arise from dependencies on the platform's bitness and on the size of a sample (8 bits or 12 bits), and they also allow a speed versus accuracy tradeoff. To enable the functionality, the new macro USE_FASTER_2x2_IDCT was introduced, which simply needs to be defined in jconfig.h or jmorecfg.h like this:

#define USE_FASTER_2x2_IDCT
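The two add-only passes described above can be sketched in floating point as follows. This is an illustration under stated assumptions, not the author's fixed-point code: the sign pattern of the additions is read off equation (2), the final factor 1/32 is the prefactor of equation (1) applied once per dimension, and the level shift and range limiting that a decoder applies afterwards are left out. The function name and the array layout (row index = vertical frequency) are hypothetical.

```c
#include <math.h>

#define PI 3.14159265358979323846

/* Reduce an 8x8 block of dequantized DCT coefficients to 2x2 averages. */
static void idct_2x2_sketch(double coef[8][8], double out[2][2])
{
    static const int used[5] = {0, 1, 3, 5, 7};
    const double c1 = cos(1 * PI / 16.0), c3 = cos(3 * PI / 16.0);
    const double c5 = cos(5 * PI / 16.0), c7 = cos(7 * PI / 16.0);
    /* Magnitudes of the constants in equation (2). */
    const double s[8] = {2.0, c1 + c3, 1.0, c3 - c7,
                         1.0, c1 - c5, 1.0, c5 - c7};
    double scaled[8][8];
    double ws[2][8] = {{0.0}};

    /* Scaling step. In the scheme described in the text this multiplication
       is folded into the dequantization table instead. */
    for (int v = 0; v < 8; v++)
        for (int u = 0; u < 8; u++)
            scaled[v][u] = coef[v][u] * s[v] * s[u];

    /* Pass 1: columns 0, 1, 3, 5 and 7, additions and subtractions only. */
    for (int i = 0; i < 5; i++) {
        int u = used[i];
        double odd = scaled[1][u] - scaled[3][u] + scaled[5][u] - scaled[7][u];
        ws[0][u] = scaled[0][u] + odd;
        ws[1][u] = scaled[0][u] - odd;
    }

    /* Pass 2: the same additions over the two rows produced by pass 1. */
    for (int r = 0; r < 2; r++) {
        double odd = ws[r][1] - ws[r][3] + ws[r][5] - ws[r][7];
        /* 1/32 = (1/(4*sqrt(2)))^2 */
        out[r][0] = (ws[r][0] + odd) / 32.0;
        out[r][1] = (ws[r][0] - odd) / 32.0;
    }
}
```

For a block that contains only a DC coefficient of 8, all four outputs are 1, which matches the average of a constant block under the usual JPEG normalization. In a fixed-point version the division by 32 would become a right shift.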

If USE_FASTER_2x2_IDCT is not defined, the standard functionality from JPEGLib version 6b is used. If it is defined, additional code in jddctmgr.c multiplies the dequantization constants by a scaled variant of the table above, and an alternative implementation of jpeg_idct_2x2 gets compiled in jidctred.c. The dependence on the bitness of the platform is resolved automatically by examining the value of the constant INT_MAX from limits.h: if INT_MAX evaluates to 32767, the platform is treated as a 16-bit platform, otherwise a 32-bit platform or higher is expected.
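The INT_MAX test and the folding of a scaling constant into a dequantization value might look roughly like this. Only INT_MAX, limits.h and the value 32767 come from the text. The names IDCT2X2_INT_BITS, SCALE_BITS, idct2x2_int_bits and fold_scale are hypothetical, and the precision of 8 fractional bits is an arbitrary assumption, not the scaling that the actual tables use.

```c
#include <limits.h>

/* Compile-time bitness check as described in the text. */
#if INT_MAX == 32767
#define IDCT2X2_INT_BITS 16
#else
#define IDCT2X2_INT_BITS 32  /* 32 bits or more */
#endif

#define SCALE_BITS 8  /* assumed fixed-point precision for this sketch */

static int idct2x2_int_bits(void) { return IDCT2X2_INT_BITS; }

/* Fold one scaling constant into one quantization step, rounding to the
   nearest fixed-point value. Done once per table, not once per block. */
static long fold_scale(unsigned int quant_step, double scale)
{
    return (long)(quant_step * scale * (double)(1L << SCALE_BITS) + 0.5);
}
```

Because the product of quantization step and scaling constant is computed once per quantization table, the per-block work stays at one multiplication per coefficient for dequantization, and the IDCT itself needs none.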

To use the fastest possible mode (with the least accuracy), the macro USE_INACCURATE_IDCT can additionally be defined in jconfig.h or jmorecfg.h like this:

#define USE_INACCURATE_IDCT

Depending on the platform's bitness, this macro has different functionality:

  • 16-bit: If USE_INACCURATE_IDCT is defined, all multiplications and additions yield results that are less than $2^{15}$, so only the (16-bit) int data type is used. Nevertheless, in this "mode" a right shift operation is required after the first pass in order to avoid a potential overflow in the second pass.
  • 32-bit: If USE_INACCURATE_IDCT is defined, the table is chosen in such a way that no right shift operation is required after the first pass. The table used in this "mode" must not lead to an overflow in the second pass and is therefore less accurate than the one used with USE_INACCURATE_IDCT undefined.
If USE_INACCURATE_IDCT is undefined, the tables are chosen in such a way that no overflow happens during the scaling of the dequantization tables or during the additions in both passes of the algorithm, as will be shown later.
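Why a right shift between the passes can matter on a 16-bit platform can be illustrated with a crude worst-case bound. The sketch assumes that each pass output is a sum or difference of five inputs, as equation (2) suggests, so its magnitude can be up to five times the largest input magnitude. The helper names are hypothetical, and the bounds actually used for the tables are the ones derived later in the article, not these.

```c
/* Worst-case magnitude after one add-only pass over five inputs whose
   magnitudes are at most `bound`. */
static long pass_bound(long bound) { return 5L * bound; }

/* Returns 1 if both passes stay below 2^15 when the results of the first
   pass are shifted right by `shift` bits before the second pass. */
static int fits_in_16_bits(long coef_bound, int shift)
{
    long after_first = pass_bound(coef_bound);
    long after_second = pass_bound(after_first >> shift);
    return after_first < 32768L && after_second < 32768L;
}
```

With scaled coefficients bounded by 2000, for example, the first pass stays at or below 10000, but a second pass without a shift could reach 50000. Shifting right by one bit between the passes brings the bound back to 25000, at the cost of one bit of accuracy.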
 






Copyright © Stefan Kuhr 1999-2004. All rights reserved.