JPEG - Idea and Practice/Appendix 2: Programs for calculating code lengths from the actual picture

From Wikibooks, open books for an open world

We assume that we have a number (nhv) of (Huffman) values (non-negative integers) which are assigned frequencies (having sum 1), and we order the values according to decreasing frequency. To prevent any code from consisting only of 1's, we provisionally add a value whose frequency is (for instance) half the frequency of the last and least frequent value. We call the new number (nhv + 1) of Huffman values nhv, and replace nhv by nhv - 1 when we finally remove a code from the codes of the largest length. We have thus put the values into a one-to-one correspondence with the natural numbers 1, 2, ..., nhv, and we have an array a[i], for i = 1 to nhv, of decreasing frequencies. We let this array of frequencies be the first in an array of arrays of frequencies: a[1, i] = a[i] for i = 1 to nhv. The next array of frequencies a[2, i], constructed from a[1, i] as explained in the section The Huffman coding, is still decreasing and is one element shorter than a[1, i]. The last array a[nhv, i] has only one element, namely the frequency 1: a[nhv, 1] = 1.

The values (identified with the natural numbers) 1, 2, ..., nhv are the first nodes of the Huffman tree, and we identify each newly constructed node with the succeeding natural numbers nhv+1, nhv+2, .... The node for the frequency a[j, i] is denoted node[j, i], so that node[1, i] = i for i = 1, ..., nhv. Let next[k] (k = 1, ..., 256) be an array (of non-negative integers) initially set to 0, to be constructed so that next[k] is the end node of the line from the node k. The program that calculates the two double arrays a[j, i] and node[j, i] (of frequencies and nodes, respectively) and (from node[j, i]) the array next[k] (of next nodes) can look like this:

n = nhv
m = n
for i = 1 to n do
  node[1, i] = i
i = 1
repeat
  { join the two least frequent nodes of row i in a new node m }
  m = m + 1
  next[node[i, n - 1]] = m
  next[node[i, n]] = m
  e = a[i, n - 1] + a[i, n]
  { find the position j of the new frequency e in the next row }
  j = 1
  while (j <= n) and (e <= a[i, j]) do
    j = j + 1
  i = i + 1
  n = n - 1
  { copy the entries before position j unchanged }
  if j > 1 then
    for k = 1 to j - 1 do
    begin
      a[i, k] = a[i - 1, k]
      node[i, k] = node[i - 1, k]
    end
  a[i, j] = e
  node[i, j] = m
  { copy the remaining entries, moved one place to the right }
  if j < n then
    for k = 1 to n - j do
    begin
      a[i, j + k] = a[i - 1, j - 1 + k]
      node[i, j + k] = node[i - 1, j - 1 + k]
    end
until n = 1
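
The same construction can be sketched in Python. This is only an illustration under our own assumptions: the function name build_next is hypothetical, each row is kept as a list of (frequency, node) pairs instead of the double arrays a[j, i] and node[j, i], and index 0 of the returned list is unused so that the nodes keep the numbers 1, 2, ... of the text.

```python
def build_next(freqs):
    """freqs: frequencies of the values 1..nhv in decreasing order.

    Returns a list next_node where next_node[k] is the end node of the
    line from node k, and 0 for the top node. Index 0 is unused.
    """
    nhv = len(freqs)
    # Current row: (frequency, node number), kept in decreasing order.
    row = [(f, i + 1) for i, f in enumerate(freqs)]
    next_node = [0] * (2 * nhv)  # nodes are numbered 1 .. 2*nhv - 1
    m = nhv
    while len(row) > 1:
        # Join the two least frequent nodes in a new node m.
        (f1, n1), (f2, n2) = row[-2], row[-1]
        m += 1
        next_node[n1] = m
        next_node[n2] = m
        e = f1 + f2
        row = row[:-2]
        # Insert e after all entries that are >= e, so the row stays decreasing.
        j = 0
        while j < len(row) and e <= row[j][0]:
            j += 1
        row.insert(j, (e, m))
    return next_node


example_next = build_next([0.5, 0.25, 0.125, 0.125])
```

For the four frequencies above the values 3 and 4 are joined in node 5, then 2 and 5 in node 6, and finally 1 and 6 in the top node 7.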

The array codesize[k], which for each value k (k = 1, ..., nhv) states the code length (= the number of lines from k to the end node having frequency 1), can be calculated (from next[k]) by this program:

for k = 1 to nhv do
begin
  j = 0
  i = k
  while i > 0 do
  begin
    i = next[i]
    j = j + 1
  end
  codesize[k] = j - 1
end
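
A minimal Python sketch of the same walk up the tree, assuming the next nodes are given as a list whose index 0 is unused and whose top node has the entry 0. The name code_sizes and the small example list are our own and only serve as illustration.

```python
def code_sizes(next_node, nhv):
    """Return sizes where sizes[k] is the code length of value k (k = 1..nhv)."""
    sizes = [0] * (nhv + 1)  # index 0 is unused
    for k in range(1, nhv + 1):
        length = 0
        i = next_node[k]
        # Count the lines from k up to the top node.
        while i > 0:
            length += 1
            i = next_node[i]
        sizes[k] = length
    return sizes


# Tree for four values: 3 and 4 meet in node 5, 2 and 5 in node 6,
# 1 and 6 in the top node 7.
example_sizes = code_sizes([0, 7, 6, 5, 5, 6, 7, 0], 4)
```

Here the values 1, 2, 3, 4 get the code lengths 1, 2, 3, 3.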

We can assume that no (Huffman) value has a frequency so small that its code length is greater than 32. The array bits[i], stating for each number i from 1 to 32 the number of values k having codesize[k] = i, can be calculated by this program:

{ codesize[j] is assumed to be 0 for j > nhv }
for i = 1 to 32 do
begin
  bits[i] = 0
  for j = 1 to 255 do
    if codesize[j] = i then
      bits[i] = bits[i] + 1
end
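
In Python the counting can be sketched in a single pass over the code lengths. The name count_bits is hypothetical, and index 0 of both lists is unused, as an assumption made to keep the numbering of the text.

```python
def count_bits(codesize, max_len=32):
    """Return bits where bits[i] is the number of values with code length i."""
    bits = [0] * (max_len + 1)  # index 0 is unused
    for size in codesize[1:]:
        if size > 0:
            bits[size] += 1
    return bits


example_bits = count_bits([0, 1, 2, 3, 3])
```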

As no code length may exceed 16, the array bits[i] may have to be revised. This can be done by this procedure (explained in the section The Huffman coding):

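i = 32
while i > 16 do
  if bits[i] > 0 then
  begin
    { find the largest length j <= i - 2 that is in use }
    j = i - 2
    while bits[j] = 0 do
      j = j - 1
    { two codes of length i are replaced by one of length i - 1, }
    { and one code of length j is replaced by two of length j + 1 }
    bits[i] = bits[i] - 2
    bits[i - 1] = bits[i - 1] + 1
    bits[j + 1] = bits[j + 1] + 2
    bits[j] = bits[j] - 1
  end
  else
    i = i - 1
{ remove the provisional code from the largest length in use }
while bits[i] = 0 do
  i = i - 1
bits[i] = bits[i] - 1
nhv = nhv - 1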
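
The following Python sketch mirrors the procedure. It assumes that bits is a list with indices 0 to 32 (index 0 unused) and that it comes from a complete Huffman tree, so that the inner search for j always finds a length in use. The name limit_code_lengths is our own.

```python
def limit_code_lengths(bits, max_len=16):
    """Return a revised copy of bits with no code length above max_len.

    The provisional code (one code of the largest remaining length)
    is removed at the end, as in the text.
    """
    bits = list(bits)
    i = len(bits) - 1
    while i > max_len:
        if bits[i] > 0:
            # Largest length j <= i - 2 that is in use.
            j = i - 2
            while bits[j] == 0:
                j -= 1
            bits[i] -= 2      # remove a pair of codes of length i
            bits[i - 1] += 1  # their common prefix serves one value
            bits[j + 1] += 2  # a code of length j is split in two
            bits[j] -= 1
        else:
            i -= 1
    # Remove the provisional code consisting only of 1's.
    while bits[i] == 0:
        i -= 1
    bits[i] -= 1
    return bits


# 18 codes: one of each length 1..16 and two of length 17.
start = [0] + [1] * 16 + [2] + [0] * 15
revised = limit_code_lengths(start)
```

In this example the two codes of length 17 disappear, length 15 is given up, and after the removal of the provisional code three codes of length 16 remain.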

The operations bits[i] = bits[i] - 1 and nhv = nhv - 1 remove the provisional code consisting only of 1's. The resulting array bits[i] (i = 1, ..., 16) is the list BITS, and we get the list HUFFVAL by dividing the set {1, 2, ..., nhv} up according to bits[i]: if i1 is the first i such that bits[i] > 0, the first part is the first bits[i1] numbers of {1, 2, ..., nhv}; if i2 is the next i such that bits[i] > 0, the next part is the next bits[i2] numbers of {1, 2, ..., nhv}; and so on. The array HUFFVAL[k] (k = 1, ..., nhv) is the sequence of values which we have put into a one-to-one correspondence with 1, 2, ..., nhv.
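
The division of HUFFVAL according to BITS can be sketched in Python as follows. In addition to the code lengths the sketch writes out code words, assuming the usual rule that the codes of each length are consecutive binary numbers and that the counter is doubled when the length increases by one. That rule is not spelled out in this appendix, so it should be read as an assumption. The name assign_codes is hypothetical, and huffval is here an ordinary 0-based Python list.

```python
def assign_codes(bits, huffval):
    """bits[i] = number of codes of length i (index 0 unused).

    huffval lists the values in order of decreasing frequency.
    Returns a dict mapping each value to its code word as a string.
    """
    codes = {}
    code = 0
    pos = 0
    for length in range(1, len(bits)):
        # The next bits[length] values of huffval get codes of this length.
        for _ in range(bits[length]):
            codes[huffval[pos]] = format(code, "0{}b".format(length))
            pos += 1
            code += 1
        code <<= 1  # append a binary 0 when passing to the next length
    return codes


# Two values: one code of length 1 and one of length 2.
example_codes = assign_codes([0, 1, 1] + [0] * 14, [0, 8])
```

With one code of length 1 and one of length 2, the value 0 gets the code word "0" and the value 8 the code word "10".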

For a colour picture we must have four sets of Huffman values with associated frequencies: for the DC and for the AC numbers of the Y component, and for the DC and for the AC numbers of the colour components. We get these four sets by performing a pre-scanning of the picture: we let an 8x8-square run through the picture, and for the DC numbers of the Y component, for instance, we register the numbers size(diff) that appear and calculate for each of these its frequency. In this case the possible Huffman values are the numbers 0, 1, ..., 11, and if these appear respectively n0, n1, ..., n11 times, and the number of 8x8-squares is N, then the frequencies are the numbers n0/N, n1/N, ..., n11/N.
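
As an illustration of the pre-scanning for the DC numbers, here is a minimal Python sketch. It assumes that the DC numbers of the successive 8x8-squares are already available in a list, that diff is the difference from the previous DC number (starting from 0), and that size(diff) is the number of binary digits of the absolute value of diff. The name dc_size_frequencies is our own.

```python
from collections import Counter


def dc_size_frequencies(dc_numbers):
    """Return a dict mapping each occurring size(diff) to its frequency."""
    counts = Counter()
    previous = 0
    for dc in dc_numbers:
        diff = dc - previous
        counts[abs(diff).bit_length()] += 1  # size(diff)
        previous = dc
    total = len(dc_numbers)
    return {size: n / total for size, n in sorted(counts.items())}


# 625 identical squares with DC number -132: only the first diff is non-zero.
example_freqs = dc_size_frequencies([-132] * 625)
```

Here the value 8 gets the frequency 1/625 = 0.0016 and the value 0 the frequency 624/625.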

Finally we show the program which orders a sequence of (Huffman) values with attached frequencies according to decreasing frequency and counts those of non-zero frequency (that is, finds the number nhv of Huffman values). The maximum possible value is called max (it is 11 for the DC values and 250 for the AC values). The original and the new functions are called freq0[val] and freq[val], respectively (they are arrays of reals from 0 to max). per[i] is an array of integers from 0 to max which performs the permutation of the values:

for i = 0 to max do
  per[i] = -1
m = 0
while m <= max do
begin
  { find the most frequent value k not yet placed in per }
  e = 0
  for i = 0 to max do
  begin
    { z = 1 if the value i has already been placed }
    z = 0
    j = 0
    while (j <= max) and (z = 0) do
    begin
      if i = per[j] then
        z = 1
      j = j + 1
    end
    if (z = 0) and (freq0[i] >= e) then
    begin
      k = i
      e = freq0[i]
    end
  end
  per[m] = k
  m = m + 1
end
{ keep only the values of non-zero frequency }
j = 0
for i = 0 to max do
  if freq0[per[i]] > 0 then
  begin
    j = j + 1
    huffval[j] = per[i]
    freq[j] = freq0[per[i]]
  end
nhv = j
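
With a built-in sort the same ordering takes only a few lines. The Python sketch below is an illustration with assumptions of its own: the name order_by_frequency is hypothetical, the returned lists are 0-based, and values of equal frequency are ordered by decreasing value, which is what the test freq0[i] >= e in the program above leads to.

```python
def order_by_frequency(freq0):
    """freq0[val] is the frequency of the value val (val = 0..max).

    Returns (huffval, freq): the values of non-zero frequency in order
    of decreasing frequency, and their frequencies. nhv is len(huffval).
    """
    per = sorted(range(len(freq0)), key=lambda v: (freq0[v], v), reverse=True)
    huffval = [v for v in per if freq0[v] > 0]
    freq = [freq0[v] for v in huffval]
    return huffval, freq


# DC values with max = 11, three of them occurring.
freq0 = [0.0] * 12
freq0[0], freq0[6], freq0[7] = 0.8816, 0.08, 0.0384
huffval, freq = order_by_frequency(freq0)
```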

We have made a version (CJPEGg_huf) of our program (CJPEGg) which can produce a grey scale file and in which we perform a pre-scanning that calculates frequencies from which we construct Huffman tables. For the DC values we have an array freqc[val] of integers (val = size(diff)) and an integer lc, both starting at 0; for each new value val we meet, both freqc[val] and lc are increased by 1. When the pre-scanning is finished, the frequency of val is freqc[val]/lc. The same applies to the AC values (val = m*16 + k or 240 or 0).

We will find the Huffman values for three simple grey scale pictures of 200x200 pixels:

The first is of only one colour, namely the middle grey value 128, corresponding to the signed byte 0. There is only one DC Huffman value and one AC Huffman value, namely 0 having frequency 1. The picture is divided into 625 8x8-squares, and for each of these the encoded data takes up 2 bits. In total 1250 bits = 157 bytes after padding with 6 bits. The header takes up 156 bytes and the file ends with the two bytes of the EOI marker, therefore the file takes up 156 + 2 + 157 = 315 bytes.
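
The count for the first picture can be checked with a few lines of Python. The helper file_size is a hypothetical name; it simply rounds the number of data bits up to whole bytes and adds the header and the two final bytes.

```python
def file_size(data_bits, header_bytes, trailer_bytes=2):
    """Size in bytes: header + data padded to whole bytes + end marker."""
    data_bytes = (data_bits + 7) // 8  # round up to a whole number of bytes
    return header_bytes + data_bytes + trailer_bytes


squares = (200 // 8) ** 2          # number of 8x8-squares in 200x200 pixels
data_bits = squares * 2            # 2 bits for each square
padding = -data_bits % 8           # bits needed to fill the last byte
size = file_size(data_bits, 156)
```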

The second picture (the left below) is of two colours. There are three DC Huffman values: 0 with frequency 0.8816, 6 with frequency 0.08 and 7 with frequency 0.0384. There are five AC Huffman values: the first 0 with frequency 0.86..., the second 194 with frequency 0.03.... The reason for the non-zero AC values is that the vertical division line lies inside some of the 8x8-squares. The file takes up 485 bytes.

The third picture (the right below) is also of two colours. The division is coincident with the division in 8x8-squares, so that there are 625 of these small pictures. We have in this case set all the quantization numbers to 1 (quality = 100 per cent). As all the 8x8-squares are identical, there are only two DC Huffman values: 0 and a value used only for the first square, and thus having frequency 1/625 = 0.0016. The two colours are black and white, having colour values (as signed bytes) -128 and 127, respectively, and the average value is -16.5 (because there is a little more black than white). Therefore the first DC number is 8*(-16.5) = -132, having size 8, which is the non-zero DC Huffman value. The Huffman value 0 is assigned code word "0" and the Huffman value 8 is assigned code word "10", therefore the DC part of the encoded data for the first 8x8-square takes up 2+8 = 10 bits, and for each of the others 1 bit. All the AC parts of the encoded data for the 8x8-squares are identical and take up 386 bits. In total the encoded data should take up 1*(10 + 386) + 624*(1 + 386) = 241884 bits = 30236 bytes after padding with 4 bits. The header takes up 172 bytes and the file ends with the two bytes of the EOI marker, therefore the file should take up 30236 + 172 + 2 = 30410 bytes. But in reality it takes up 31192 bytes, that is, 782 bytes more. The reason for the difference is that the byte 255 (eight 1's) has appeared 782 times in the running conversion of blocks of 8 bits into bytes, and each time it has been followed by the zero byte.
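
The insertion of zero bytes and the count above can be sketched in Python. The function name stuff_bytes is our own, and the short byte sequence used to try it out is invented for illustration; only the numbers in the arithmetic are taken from the text.

```python
def stuff_bytes(data):
    """Return data with a zero byte inserted after every byte 255."""
    out = bytearray()
    for b in data:
        out.append(b)
        if b == 0xFF:
            out.append(0x00)
    return bytes(out)


# Expected size of the third file without the inserted zero bytes.
data_bits = 1 * (10 + 386) + 624 * (1 + 386)
data_bytes = (data_bits + 7) // 8        # padded to whole bytes
expected = data_bytes + 172 + 2          # header and the two final bytes
extra = 31192 - expected                 # zero bytes inserted after 255

sample = stuff_bytes(bytes([0x12, 0xFF, 0x34]))
```

The difference between the actual and the expected size equals the number of times the byte 255 occurs in the encoded data.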

The condition that no code may consist only of 1's does not seem to be strictly necessary: if we omit it, some image programs accept the file (Paint and Internet Explorer, for instance), but some do not (the Windows image viewer and Adobe Photoshop, for instance).

The procedure which limits the length of the code words to 16 can of course only come into play for the AC values, and it presupposes that the picture has a certain size and variation of colours, but it is not rare for it to be needed: the examples of Difficult pictures in part one (of only 400 pixels) activate the procedure.