I want to make a JPEG where, for each 8x8 block of the image, the 3 components (Y, Cb, Cr) are encoded one after another before moving on to the next 8x8 block.
Example: a 16x16 image. First, write the header. Is there anything special I need to mark there to make this format work? (I opened a known JPEG to confirm that I am writing the quantization tables and Huffman tables correctly.) Also, I DON'T want subsampling. I want a 1:1 ratio. From my understanding, this means each 8x8 block of pixels becomes one 8x8 block that goes through the steps I am about to name, correct? How do I mark that in the header? With 0x11?
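To make the header part of the question concrete, here is a minimal sketch of a baseline SOF0 (frame header) segment in which every component carries the sampling byte 0x11 (horizontal factor 1, vertical factor 1). The function name `sof0_segment`, the component ids 1, 2, 3 and the quantization table ids 0 and 1 are assumptions for illustration, not taken from the code in question. For the blocks to be interleaved as described, the scan header (SOS) would also need to list all 3 components in one scan.

```python
import struct

def sof0_segment(width, height):
    """Build a baseline SOF0 segment for 3 components, all sampled 1x1."""
    # (component id, sampling byte, quantization table id); ids are assumed.
    # 0x11 = horizontal factor 1 (high nibble), vertical factor 1 (low nibble).
    components = [(1, 0x11, 0), (2, 0x11, 1), (3, 0x11, 1)]
    # Sample precision (8 bits), number of lines, samples per line, component count.
    body = struct.pack(">BHHB", 8, height, width, len(components))
    for comp_id, sampling, tq in components:
        body += bytes([comp_id, sampling, tq])
    # The length field counts itself (2 bytes) plus the body, but not the marker.
    return b"\xFF\xC0" + struct.pack(">H", len(body) + 2) + body

seg = sof0_segment(16, 16)
print(seg.hex(" "))
```

Comparing these bytes against the SOF0 segment of a known-good JPEG is a quick way to check the sampling factors.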
Steps:
Grab the first 8x8 (top left) of this image.
For Y: DCTII -> quant -> RLE -> Huffman encode
then, for Cb: DCTII -> quant -> RLE -> Huffman encode
then, for Cr: DCTII -> quant -> RLE -> Huffman encode
Repeat for the top-right -> bottom-left -> bottom-right 8x8 pixel blocks in the image.
write end of image tag, done.
In the data stream it should go: DC-Y -> AC-Y -> DC-Cb -> AC-Cb -> DC-Cr -> AC-Cr, and so forth, yes? Is there any tag I need to insert between components, between DC/AC changes, or between 8x8 pixel blocks? I assume an EOB Huffman code is present between components (that's what I have currently).
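The block order described above can be sketched as follows, assuming all three components use 1x1 sampling, a single interleaved scan, and image dimensions that are multiples of 8. The generator name `block_order` is hypothetical. As far as I understand the baseline format, the entropy-coded blocks simply follow one another in this order with no marker between them unless restart intervals are enabled.

```python
def block_order(width, height):
    """Yield (block_row, block_col, component) in the order blocks enter the scan."""
    for by in range(height // 8):            # block rows, top to bottom
        for bx in range(width // 8):         # blocks left to right within a row
            for comp in ("Y", "Cb", "Cr"):   # one 8x8 block per component
                yield by, bx, comp

order = list(block_order(16, 16))
for entry in order:
    print(entry)
```

For the 16x16 example this gives 12 blocks: top left (Y, Cb, Cr), top right, bottom left, bottom right.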
Negative numbers: what format are they in? Two's complement? -3, for example, would be 101 in two's complement (3-bit size), but in JPEG you would call this a 2-bit size and only encode the 01 portion, not the "sign" (MSB) bit, right? 3 would be 011 in 3-bit two's complement, but by the same logic it's just 11 (2-bit size), encoded without the sign bit (MSB) in JPEG, right? Anything I am missing?
DC values: 3 components mean you keep track of 3 different previous DC values, right? For example, Y-DC-prev is initialized to 0. Then the first Y-DC value is, let's say, 25. 25 - 0 = 25, so we encode 25. We then remember 25 for the Y component's next DC (not for the Cb or Cr component, right? They have their own "memories"?). Then the next DC-Y is, let's say, 40. Diff = 40 - 25 = 15, so we encode 15 and remember 40 (not 15, right?). And so forth?
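For comparison, here is a minimal sketch of the size and amplitude-bits rule as I read it in ITU-T T.81 (section F.1.2): the size is the bit length of the absolute value, positive values are written as-is, and negative values are written as the low `size` bits of `value - 1`, which is the same as adding `2^size - 1`. The helper name `size_and_bits` is hypothetical. If this reading is right, the result for negative values differs from truncated two's complement.

```python
def size_and_bits(v):
    """Return (size, amplitude bit string) for a DC difference or AC coefficient."""
    if v == 0:
        return 0, ""                  # size 0 carries no amplitude bits
    size = abs(v).bit_length()        # number of bits needed for |v|
    if v < 0:
        v += (1 << size) - 1          # negatives: low `size` bits of (v - 1)
    return size, format(v, f"0{size}b")

for value in (3, -3, -26, 1, -1):
    print(value, size_and_bits(value))
```

Under this rule the leading amplitude bit is 1 for positive values and 0 for negative ones, so 3 maps to 11 while -3 maps to 00.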
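The per-component DC prediction described above can be sketched like this. The function name `dc_differences` is hypothetical, and the Cb and Cr values in the usage lines are made up for illustration; only the Y values 25 and 40 come from the example in the text.

```python
def dc_differences(blocks):
    """blocks: iterable of (component, dc_value) pairs in scan order."""
    pred = {"Y": 0, "Cb": 0, "Cr": 0}       # one predictor per component
    out = []
    for comp, dc in blocks:
        out.append((comp, dc - pred[comp]))  # the difference is what gets encoded
        pred[comp] = dc                      # remember the actual DC, not the difference
    return out

diffs = dc_differences([("Y", 25), ("Cb", -4), ("Cr", 7),
                        ("Y", 40), ("Cb", -4), ("Cr", 9)])
print(diffs)
```

With these inputs the Y differences come out as 25 and then 15, matching the arithmetic in the paragraph above.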
I followed the example here: WIKI. My code gets the exact values all the way down to RLE, which makes me think my Huffman encoding might have the bug. I made a 16x16 image that basically repeats the image on Wikipedia in a 2x2 tile (this also makes the image not grayscale, since I force Cb and Cr to have the same values as Y; I know the image should have a funky tint because of this, no worries). I end up getting a semi-believable value for the top right block, then the rest turn into garbage. This led me to believe it's my file organization or Huffman encoding that is going wrong. To do a quick check (this is from the Wikipedia example):
FORMAT: (RUNLENGTH, SIZE)(VALUE)
(0, 2)(-3);
(1, 2)(-3);
(0, 1)(-2);
(0, 2)(-6);
(0, 1)(2);
(0, 1)(-4);
(0, 1)(1);
(0, 2)(-3);
(0, 1)(1);
(0, 1)(1);
(0, 2)(5);
(0, 1)(1);
(0, 1)(2);
(0, 1)(-1);
(0, 1)(1);
(0, 1)(-1);
(0, 1)(2);
(5, 1)(-1);
(0, 1)(-1);
(0, 0);
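As a cross-check of the list above, here is a minimal sketch that rebuilds the (RUNLENGTH, SIZE)(VALUE) pairs from the AC coefficients in zigzag order. The coefficient sequence is reconstructed from the runs and values in the list itself, and the size is computed as the bit length of the absolute value, which is my reading of the spec rather than something confirmed by the code in question. The function name `rle_ac` is hypothetical.

```python
def rle_ac(ac):
    """ac: the 63 AC coefficients in zigzag order -> [((run, size), value), ...]."""
    # Index of the last nonzero coefficient; everything after it becomes EOB.
    last = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    out, run = [], 0
    for v in ac[:last + 1]:
        if v == 0:
            run += 1
            if run == 16:                    # 16 zeros in a row: emit ZRL (15, 0)
                out.append(((15, 0), None))
                run = 0
            continue
        out.append(((run, abs(v).bit_length()), v))
        run = 0
    if last < len(ac) - 1:                   # trailing zeros: emit EOB (0, 0)
        out.append(((0, 0), None))
    return out

values = [-3, 0, -3, -2, -6, 2, -4, 1, -3, 1, 1, 5, 1, 2, -1, 1, -1, 2,
          0, 0, 0, 0, 0, -1, -1]
ac = values + [0] * (63 - len(values))
pairs = rle_ac(ac)
for pair in pairs:
    print(pair)
```

Note that under this size rule some entries come out differently from the list above: for example, -2 and 2 get size 2, while -6, -4 and 5 get size 3. That may be worth checking, since the size selects the Huffman code.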
The standard Huffman AC-Y table in the spec (TABLE-PAGE154) says 0/2 is code 01. We know that -3 is 01 in two's complement, so we append 0101 to the stream and move to the next entry. 1/2 is 11011 from the table, and -3 is still 01, so we append 1101101 to the stream and keep going, all the way to the end, where we see (0, 0), which is just 1010. Then we rinse and repeat for the 2 other components, and then for the rest of the 8x8 pixel blocks in the image, yes? The DC value was -26, which is 00110 (size 5) in two's complement without the MSB / sign. Size 5 for DC-Y codes to 110 according to the Huffman table in the spec (page 153). This means the bit stream should start:
110_00110_01_01_11011_01_...
Obviously the _ are just for readability, I don't add those to the actual file.
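For comparison with the hand-built bit string, here is a minimal sketch that assembles the start of the stream from the Huffman codes quoted above (DC-Y size 5 is 110, AC-Y 0/2 is 01, 1/2 is 11011, EOB is 1010). Only those table entries are included. The amplitude bits follow the rule as I read it in ITU-T T.81, where a negative value is written as the low bits of `value - 1`; that is an assumption relative to the two's complement reading in the question. The names `amplitude_bits` and `encode_start` are hypothetical.

```python
# Only the table entries quoted in the question; a full encoder needs the whole tables.
DC_Y = {5: "110"}
AC_Y = {(0, 2): "01", (1, 2): "11011", (0, 0): "1010"}

def amplitude_bits(v):
    """Extra bits that follow the Huffman code for a nonzero value."""
    if v == 0:
        return ""
    size = abs(v).bit_length()
    if v < 0:
        v += (1 << size) - 1      # negatives: low `size` bits of (v - 1)
    return format(v, f"0{size}b")

def encode_start(dc_diff, ac_pairs):
    """Join the codes with '_' for readability only; the real stream has no separators."""
    parts = [DC_Y[abs(dc_diff).bit_length()], amplitude_bits(dc_diff)]
    for (run, size), value in ac_pairs:
        parts.append(AC_Y[(run, size)])
        if size:                  # EOB has size 0 and carries no amplitude bits
            parts.append(amplitude_bits(value))
    return "_".join(parts)

start = encode_start(-26, [((0, 2), -3), ((1, 2), -3)])
print(start)
```

If the amplitude rule assumed here is correct, the stream would begin 110_00101_01_00_11011_00 rather than 110_00110_01_01_11011_01, so the difference lies entirely in the bits for the negative values.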
I've been working on this for days, any help is much appreciated!!