Convert 16bit Grayscale PNG to HEVC/x265


I want to convert a 12-bit image signal to HEVC for efficient compression. Because I need to be able to reconstruct the original 12-bit signal, the compression must be losslessly reversible. At the moment I have the data as 16-bit PNG files.

My first try was using ffmpeg:

ffmpeg -y -framerate 1 -i input.png -c:v libx265 -x265-params "lossless=1" output.mp4

Unfortunately the output is not reversible. When extracting the image from the mp4, the pixel values are slightly off.

ffmpeg -i output.mp4 -vframes 1 reconstructed.png

The following answer suggests converting the input to YUV444 first to avoid unexpected behavior by ffmpeg: Lossless x264 compression

So far I have failed to convert my 16-bit file to YUV, encode it with x265, and get a correct reconstruction when decoding.

Is there a straightforward way to convert 16-bit images to HEVC?


There are 2 best solutions below

3
On BEST ANSWER

I found a solution with minor rounding errors:

Encoding:

  • Based on the following post: How to render png's as h.265 12 bit video?
    You can use the following codec parameters: -x265-params lossless=1 -pix_fmt yuv444p12le for lossless 12 bpc encoding.

  • By trial and error, I found that the 12-bit data must sit in the upper 12 bits of each 16-bit element. You need to scale the input pixels up by 16 to place the data in the upper bits.
    (Scaling by 16 is equivalent to left-shifting the uint16 elements by 4.)
    To scale the pixels up, you can use the colorlevels video filter:
    -vf colorlevels=rimax=0.0625:gimax=0.0625:bimax=0.0625

The following command encodes a single frame:

 ffmpeg -i input.png -vf colorlevels=rimax=0.0625:gimax=0.0625:bimax=0.0625 -c:v libx265 -x265-params lossless=1 -pix_fmt yuv444p12le output.mkv

Decoding:

  • For decoding, you need to divide the pixels by 16 to place the data back in the lower 12 bits.
    (Dividing by 16 is equivalent to right-shifting the uint16 elements by 4.)
    I couldn't find a solution using colorlevels, so I used the curves filter:
    -vf "curves=r='0/0 1.0/0.0625':g='0/0 1.0/0.0625':b='0/0 1.0/0.0625'"
  • The suitable pixel format for a 16-bit PNG is rgb48be.

The following command decodes a single frame (and divides by 16):

ffmpeg -i output.mkv -vf "curves=r='0/0 1.0/0.0625':g='0/0 1.0/0.0625':b='0/0 1.0/0.0625'" -pix_fmt rgb48be reconstructed.png
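The two scaling steps can be sanity-checked outside FFmpeg. The sketch below is only a model: it assumes colorlevels and curves behave as ideal linear maps on normalized 16-bit samples, and the helper names scale_up and scale_down are made up for illustration. Under that assumption, both filters reduce to a shift by 4 bits for every 12-bit value.

```python
RIMAX = 0.0625   # 1/16, the value passed to colorlevels and curves above
FULL16 = 65535   # largest 16-bit sample value


def scale_up(v):
    """Idealized colorlevels=rimax=0.0625: stretch [0, RIMAX] to [0, 1]."""
    x = v / FULL16                      # normalize to [0, 1]
    return round(min(x / RIMAX, 1.0) * FULL16)


def scale_down(v):
    """Idealized curves '0/0 1.0/0.0625': a straight line of slope 1/16."""
    x = v / FULL16
    return round(x * RIMAX * FULL16)


if __name__ == "__main__":
    # Every 12-bit value should land in the upper 12 bits and come back.
    up_ok = all(scale_up(v) == v << 4 for v in range(4096))
    down_ok = all(scale_down(v << 4) == v for v in range(4096))
    print("scale up matches << 4:", up_ok)
    print("scale down matches >> 4:", down_ok)
```

If the real filters deviate from this ideal (for example through interpolation or intermediate precision), that would be one more possible source of small errors.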

Differences:
The maximum absolute difference between input.png and reconstructed.png is 4 levels.
The reason for the difference is probably rounding errors caused by converting RGB to YUV and back.
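To illustrate why an RGB to YUV round trip can be lossy at a fixed bit depth, here is a rough sketch. It assumes BT.601 full-range coefficients and plain rounding, which is not necessarily what FFmpeg's converter does internally (it may use different coefficients or limited range), so its numbers are not expected to match the 4 levels reported above. The function names are hypothetical.

```python
MID = 2048  # chroma offset for 12-bit samples


def rgb_to_ycbcr(r, g, b):
    """BT.601 full-range forward transform, rounded to integers (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (b - y) / 1.772 + MID
    cr = (r - y) / 1.402 + MID
    return round(y), round(cb), round(cr)


def ycbcr_to_rgb(y, cb, cr):
    """Matching inverse transform, rounded to integers."""
    r = y + 1.402 * (cr - MID)
    b = y + 1.772 * (cb - MID)
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return round(r), round(g), round(b)


def count_round_trip_errors(base=2000, side=32):
    """Round-trip a small cube of 12-bit RGB values and count mismatches."""
    mismatches = 0
    max_err = 0
    for r in range(base, base + side):
        for g in range(base, base + side):
            for b in range(base, base + side):
                back = ycbcr_to_rgb(*rgb_to_ycbcr(r, g, b))
                err = max(abs(r - back[0]), abs(g - back[1]), abs(b - back[2]))
                if err:
                    mismatches += 1
                    max_err = max(max_err, err)
    return mismatches, max_err


if __name__ == "__main__":
    mismatches, max_err = count_round_trip_errors()
    print("pixels that do not round-trip:", mismatches)
    print("largest per-channel error:", max_err)
```

The transform maps the RGB cube onto a much smaller volume of integer YCbCr codes, so several RGB triples share one code and cannot all be recovered.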


I used the following MATLAB code for testing:

I = imread('peppers.png');

% Build 10 PNG images (used as input).
for i = 1:10
    J = insertText(I, [size(I,2)/2-18, size(I,1)/2-36], num2str(i), 'FontSize', 72);
    J = imnoise(im2double(J), 'gaussian', 0, 0.01); % Add some noise
    J = uint16(round(J*4095)); % Convert to 12 bits range (range [0, 4095])
    imwrite(J, sprintf('input%02d.png', i), 'fmt', 'png', 'BitDepth', 16, 'Mode', 'lossless'); % Write to PNG file
end

% Encode video file using x265 codec, and 12 bits YUV444 format.
[status, cmdout] = system('ffmpeg -y -i input%02d.png -vf colorlevels=rimax=0.0625:gimax=0.0625:bimax=0.0625 -c:v libx265 -x265-params lossless=1 -pix_fmt yuv444p12le output.mkv');
if (status ~= 0), disp(cmdout);end

% Decode output.mkv into 10 PNG image files
[status, cmdout] = system('ffmpeg -y -i output.mkv -vf "curves=r=''0/0 1.0/0.0625'':g=''0/0 1.0/0.0625'':b=''0/0 1.0/0.0625''" -pix_fmt rgb48be reconstructed%02d.png');
if (status ~= 0), disp(cmdout);end

% Compare input and output:
for i = 1:10
    I = imread(sprintf('input%02d.png', i));
    J = imread(sprintf('reconstructed%02d.png', i));
    max_abs_diff = max(max(max(imabsdiff(I, J))));
    disp(['max_abs_diff = ', num2str(max_abs_diff)]);
end

Update:

Working with Grayscale format:
When working with Grayscale, you don't need to convert the pixel format to YUV.
Converting from Grayscale to YUV444 multiplies the size of the input data by 3, so it's better to avoid the conversion.

The following command encodes a single Grayscale frame:

 ffmpeg -i input.png -vf "curves=all='0/0 0.0625/1.0'" -c:v libx265 -x265-params lossless=1 -pix_fmt gray12le -bsf:v hevc_metadata=video_full_range_flag=1 output.mkv

The following command decodes a single Grayscale frame (and divides by 16):

ffmpeg -i output.mkv -vf "curves=all='0/0 1.0/0.0625'" -pix_fmt gray16be reconstructed.png

The maximum absolute difference is 2.


Note about using -bsf:v hevc_metadata=video_full_range_flag=1:

In H.265, the default range of the Y channel is "limited range".
For 8 bits, limited range is [16, 235].
For 12 bits, limited range is [256, 3760].
When using "full range" ([0, 255] for 8 bits or [0, 4095] for 12 bits), you need to signal it in the stream's metadata.
With FFmpeg, the way to set this metadata is a bitstream filter.
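The range figures can be checked with a few lines of arithmetic. The sketch below also shows what would happen if full-range 12-bit data were squeezed into limited range somewhere along the way: fewer codes are available than input values, so some values collapse. The linear mapping and the function names are assumptions for illustration, not a description of FFmpeg's actual conversion.

```python
def limited_bounds(bits):
    """Limited-range Y bounds: the 8-bit values 16 and 235 shifted up."""
    shift = bits - 8
    return 16 << shift, 235 << shift


def full_to_limited(v, bits=12):
    """Map a full-range value linearly into limited range, then round."""
    lo, hi = limited_bounds(bits)
    peak = (1 << bits) - 1
    return round(lo + v * (hi - lo) / peak)


def limited_to_full(y, bits=12):
    """Inverse linear mapping, rounded back to an integer."""
    lo, hi = limited_bounds(bits)
    peak = (1 << bits) - 1
    return round((y - lo) * peak / (hi - lo))


if __name__ == "__main__":
    print("12-bit limited range:", limited_bounds(12))
    codes = {full_to_limited(v) for v in range(4096)}
    lost = sum(1 for v in range(4096)
               if limited_to_full(full_to_limited(v)) != v)
    print("distinct limited-range codes:", len(codes))
    print("full-range values that do not survive:", lost)
```

Since limited range has only 3760 - 256 + 1 = 3505 codes for 4096 input values, a lossless round trip is only possible when the data stays full range end to end.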

0
On

I was trying to achieve the same thing for 10-bit grayscale data.

Thanks to Paul B Mahol on the ffmpeg-user mailing list, I was able to eliminate the remaining rounding errors by using temporary rawvideo files and tricking the rawvideo demuxer into interpreting the files with the bit depth I wanted.

I assume the same solution applies for 12 bit data and could be extended to RGB data. The ffmpeg command lines can be found in my related (almost duplicate) question: https://stackoverflow.com/a/69874453/17261462
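As a rough idea of what such a temporary raw file could contain, here is a small sketch that packs 12-bit samples into little-endian 16-bit words. This assumes the layout that pixel formats such as gray12le use (sample value in the low bits of each word). The helper names are hypothetical, and the actual ffmpeg command lines are the ones in the linked answer, not reproduced here.

```python
import struct


def pack_gray_le(samples):
    """Pack integer samples as unsigned 16-bit little-endian words."""
    return struct.pack("<%dH" % len(samples), *samples)


def unpack_gray_le(raw):
    """Read the samples back from raw little-endian 16-bit words."""
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


if __name__ == "__main__":
    row = [0, 1, 2048, 4095]           # a tiny row of 12-bit samples
    raw = pack_gray_le(row)
    print(raw.hex())
    print(unpack_gray_le(raw) == row)  # no scaling step, so nothing to round
```

Because the samples are written as plain integers, no scaling filter is involved on the way in, which is the point of the raw-file approach.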