JPEG - Idea and Practice/Program for making a grey scale file

From Wikibooks, open books for an open world
Jump to navigation Jump to search

Now for the program that can produce a grey scale JPEG file. We assume that the width and the height of the picture are divisible by 8, and set wid8 = width div 8 and hei8 = height div 8. And we assume that the colour values are given in form of a memory-block pb of a bitmap, so that the colour value (byte) of the point having screen coordinate set (i, j) (i = 0, ..., width-1, j = 0, ..., height-1), is pb[(height-1 - j) * width + i]. More precisely: we assume that the picture is given as a colour picture in BMP format, and we use only the part of it lying within the largest domain that can be regularily divided up in 8x8-squares, and we construct pb by taking the average of the RGB values.

We have written the markers (and their segments) in this order: SOI, APP, DQT, DQT, SOF, DHT, DHT, SOS (there are two of the DQT and DHT segments, because there are two quantization tables and two Huffman codings). The last segment - SOS - marks the beginning of the stream of the encoded data, and after this the file closes with the marker EOI. We have for the DC and for the AC numbers calculated the arrays EHUFSI[val] and EHUFCO[val][i] of the size of the code assigned to the Huffman value val and the code itself. In the program these arrays are called ehufsid[val] and ehufcod[val] for the DC numbers, and ehufsia[val] and ehufcoa[val] for the AC numbers.

For the 8x8-square having coordinate set (i0, j0) (i0 = 0, ..., wid8-1, j0 = 0, ..., hei8-1) and for the point in the square having coordinate set (i, j) (i, j = 0, ..., 7), the screen coordinate set is (i0 * 8 + i, j0 * 8 + j). For each 8x8-square we have an 8x8-matrix f of colour values (signed bytes - we have subtracted 128 from the original colour values (level shift) in order to get smaller numerical values), and by discrete cosine transform and quantization and round off, we get an 8x8-matrix g(u, v) of integers. This procedure (or rather, function) is called costrans(f): g = costrans(f). The inverse of the zigzag transform (iz: (i, j) → [1, ..., 64]) is composed of two arrays zx[l] and zy[l] from 1 to 64 (so that the zigzag transform of (zx[l], zy[l]) is l), and by this g is converted to a 64-array w (from 1 to 64). w[1] is the DC number, from this we subtract the preceding DC number (stored in the variable dc) getting the difference diff. We get the binary digit expression of an integer n by our function digit(n), and this array (from 1 to size(n)) is inserted in the variable c array (from 1 to 10). The procedure that writes the bit (which is of the form c[j], where c is either a code word or a digit expression) into the file (called fu) is denoted wbit(bit) - the (global) variables b0, b and q are used in this procedure. The programs for costrans and wbit are shown after the program for the scanning procedure:

b0 = 0
b = 0
q = 256
dc = 0
for j0 = 0 to hei8 - 1 do
for i0 = 0 to wid8 - 1 do
begin
for j = 0 to 7 do
for i = 0 to 7 do
f[i, j] = pb[(height - 1 - (j0 * 8 + j)) * width + (i0 * 8 + i)] - 128
g = costrans(f)
for l = 1 to 64 do
w[l] = g[zx[l], zy[l]]
diff = w[1] - dc
dc = w[1]
val = size(diff)
e = ehufsid[val]
c = ehufcod[val]
for j = 1 to e do
wbit(c[j])
if diff <> 0 then
begin
c = digit(diff)
for j = 1 to val do
wbit(c[j])
end
r = 64
while (r > 1) and (w[r] = 0) do
r = r - 1
if r > 1 then
begin
l = 1
m = 0
while l < r do
begin
l = l + 1
n = w[l]
if n = 0 then
begin
m = m + 1
if m = 16 then
begin
e = ehufsia[240]
c = ehufcoa[240]
for j = 1 to e do
wbit(c[j])
m = 0
end
end
else
begin
k = size(n)
val = m * 16 + k
e = ehufsia[val]
c = ehufcoa[val]
for j = 1 to e do
wbit(c[j])
c = digit(n)
for j = 1 to k do
wbit(c[j])
m = 0
end
end
end
if r < 64 then
begin
e = ehufsia[0]
c = ehufcoa[0]
for j = 1 to e do
wbit(c[j])
end
end

The program for the function costrans(f), which cosine transform and quantize the 8x8-matrix f[i, j] (of signed bytes) giving the 8x8-matrix g[u, v] (of integers), is divided up in four cases: u = 0 and v = 0, u = 0 and v > 0, u > 0 and v = 0 and u > 0 and v > 0. If the 64-array of the quantization table is called quant[k] and the zigzag function is called iz(i, j), we have beforehand calculated the matrix cq[i, j] = 4 * quant[iz(i, j)] (i, j = 0, 1, ..., 7) (of integers) and the matrix cs[i, j] = cos((2 * i + 1) * j * pi / 16) (i, j = 0, 1, ..., 7) (of reals). The programs for the four cases of g[u, v] can look like this:

s = 0
for i = 0 to 7 do
for j = 0 to 7 do
s = s + f[i, j]
g[0, 0] = round(s / (2 * cq[0, 0]))
for v = 1 to 7 do
begin
s = 0
for j = 0 to 7 do
begin
t = 0
for i = 0 to 7 do
t = t + f[i, j]
s = s + cs[j, v] * t
end
g[0, v] = round(s / (sqrt(2) * cq[0, v]))
end
for u = 1 to 7 do
begin
s = 0
for i = 0 to 7 do
begin
t = 0
for j = 0 to 7 do
t = t + f[i, j]
s = s + cs[i, u] * t
end
g[u, 0] = round(s / (sqrt(2) * cq[u, 0]))
end
for u = 1 to 7 do
for v = 1 to 7 do
begin
s = 0
for i = 0 to 7 do
begin
t = 0
for j = 0 to 7 do
t = t + cs[j, v] * f[i, j]
s = s + cs[i, u] * t
end
g[u, v] = round(s / cq[u, v])
end

Finally the procedure wbit(bit) that writes the bit "bit" (defined as a byte, since a program does not deal with bits) into the file fu. We get the bits from code words or from the digits of numbers, and before the insertion in the file these are collected in 8-blocks which are converted to bytes. We call the current byte b (initially set to 0), and if we have an integer q which starts with 256 and which before each insertion of the bit in b is divided by 2, then the addition of the (new) bit means that b must be increased with bit * q: b = b + bit * q. When q = 1, b is written into the file and q is again set to 256. If b = 255 (8 figures 1), the writing must be followed by the writing of the zero byte b0 (8 figures 0)(byte stuffing), so that 255 (during the decoding) is not mistaken for the beginning of a marker. The writing procedure wbit could look like this:

procedure wbit(bit: byte)
begin
q = q div 2
b = b + bit * q
if q = 1 then
begin
write(fu, b)
if b = 255 then
write(fu, b0)
b = 0
q = 256
end
end

The program ends with this procedure that writes the last byte b if q is not set to 256 (indicating that b is not yet written), setting the rest of the bits of b to 1 (bit padding):

e = size(q) - 1
p = 1
for i = 1 to e do
begin
b = b + p
p = 2 * p
end
write(fu, b)

If the last byte b is 255, it must be followed by the zero byte b0. At the very end we write the marker EOI = (255, 217) (end of image) and close the file.