FFmpeg can't recognize 3 channels of 32 bits each


I am writing the linearized depth buffer of a game to OpenEXR using FFmpeg. Unfortunately, FFmpeg does not fully adhere to the OpenEXR file specification (for example, it does not allow an unsigned integer type for a single channel), so I am writing one float channel to OpenEXR, which ends up in the green channel, with this command: -f rawvideo -pix_fmt grayf32be -s %WIDTH%x%HEIGHT% -r %FPS% -i - -vf %DEFVF% -preset ultrafast -tune zerolatency -qp 6 -compression zip1 -pix_fmt gbrpf32le %NAME%_depth_%d.exr.

The float range is from 0F to 1F and it is linear. I can confirm that the calculation and linearization are correct by testing a 16-bit integer (per pixel component) PNG in the Blender compositor. The 16-bit integer data is written like this: short s = (short) (linearizeDepth(depth) * (Math.pow(2,16) - 1)), whereas for float the linearized value is written directly to OpenEXR without multiplying it by anything.
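The 16-bit path described above can be sketched as follows. This is a NumPy illustration, not the question's Java code: `quantize16` is a hypothetical helper, and truncation is assumed because a Java cast from floating point to `short` drops the fraction and keeps only the low 16 bits. Under that assumption the signed `short` carries the same bit pattern as the unsigned 16-bit code.

```python
import numpy as np

def quantize16(depth):
    """Scale linear depth in [0, 1] to 16-bit codes, truncating like a Java cast."""
    scaled = np.asarray(depth, dtype=np.float64) * (2 ** 16 - 1)
    return scaled.astype(np.uint16)  # float -> integer conversion drops the fraction

depth = np.array([0.0, 0.5, 1.0])
codes = quantize16(depth)             # unsigned 16-bit codes, as a PNG stores them
as_java_short = codes.view(np.int16)  # the same bits read as a signed Java short
restored = codes / 65535.0            # normalized values a viewer would reconstruct
```

The quantization error stays below one 16-bit level, so the PNG and the unscaled float value should describe the same linear ramp if nothing else modifies the data.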

However, when viewing the OpenEXR file, it doesn't have the same "gradient" as the 16-bit PNG. Viewed side by side, the values near 0 appear to be non-linear, and they are not as dark as they are in the 16-bit PNG (and yes, I set the image node to linear). Comparing it with 3D tracking data from the game, I can't reproduce the depth and can't mask things using the depth buffer, whereas with the PNG I can.

How is it possible for a linear float range to turn out so different to a linear integer range in an image?

UPDATE:

I now write 3 channels to FFmpeg with this code:

float f2 = this.linearizeDepth(depth);

buffer.putFloat(f2);
buffer.putFloat(0);
buffer.putFloat(0);

The byte buffer has the size width * height * 3 * 4, i.e. 3 channels of 4 bytes each. The command is now -f rawvideo -pix_fmt gbrpf32be -s %WIDTH%x%HEIGHT% -r %FPS% -i - -vf %DEFVF% -preset ultrafast -tune zerolatency -qp 6 -compression zip1 -pix_fmt gbrpf32le %NAME%_depth_%d.exr, which should mean that the input (byte buffer) is expected to contain 32-bit floats in 3 channels. This is how it turns out

FFmpeg is somehow splitting up channels or whatever... could be a bug, could be my fault?
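One possible reading of the "split channels" symptom, offered as a guess rather than a confirmed diagnosis: gbrpf32be is a planar format, so a raw frame is expected as a complete G plane, then a complete B plane, then a complete R plane, while three putFloat calls per pixel produce an interleaved layout. The sketch below uses NumPy instead of the question's Java; `pack_interleaved` and `pack_planar_gbr` are hypothetical helper names.

```python
import numpy as np

def pack_interleaved(depth):
    """Per-pixel layout (value, 0, 0), as three putFloat calls per pixel would produce."""
    pixels = np.zeros(depth.shape + (3,), dtype=">f4")  # big-endian float32
    pixels[..., 0] = depth
    return pixels.tobytes()

def pack_planar_gbr(depth):
    """Planar layout: the full G plane, then the full B plane, then the full R plane."""
    g = depth.astype(">f4")
    empty = np.zeros_like(g)
    return g.tobytes() + empty.tobytes() + empty.tobytes()

depth = np.array([[0.1, 0.2], [0.3, 0.4]], dtype=np.float32)
interleaved = pack_interleaved(depth)
planar = pack_planar_gbr(depth)

# Decode both buffers the way a planar reader would: three consecutive planes.
planes_from_interleaved = np.frombuffer(interleaved, dtype=">f4").reshape(3, 2, 2)
planes_from_planar = np.frombuffer(planar, dtype=">f4").reshape(3, 2, 2)
```

Both buffers have the same size, which is why the frame is accepted either way, but only the planar one puts the depth values in the first plane. In the interleaved buffer the depth samples are spread across all three planes with zeros in between.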

Best answer

The issue is the color conversion from grayf32be to gbrpf32le.

Assuming the source pixel range is [0, 1], we may add a format conversion filter, -vf format=rgb48le, before converting the pixel format to gbrpf32le.

It also looks like FFmpeg ignores the range arguments; the fix is adding a scale filter: scale=in_range=full:out_range=full.

Updated command:

ffmpeg -y -f rawvideo -pix_fmt grayf32be -src_range 1 -s 192x108 -i in.raw -vf "scale=in_range=full:out_range=full,format=rgb48le" -vcodec exr -compression zip1 -pix_fmt gbrpf32le -dst_range 1 out.exr
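To illustrate why an unwanted range conversion matters for depth data, here is a minimal sketch of the usual full-to-limited ("TV range") luma mapping, where 8-bit codes 16 to 235 stand for black to white. Whether FFmpeg applied exactly this mapping in the original command is an assumption; `full_to_limited` is a hypothetical helper, not an FFmpeg function.

```python
def full_to_limited(value, bits=8):
    """Map a normalized full-range value in [0, 1] to a normalized limited-range code."""
    scale = 2 ** (bits - 8)
    low, high = 16 * scale, 235 * scale  # limited-range black and white levels
    return (low + value * (high - low)) / (2 ** bits - 1)

black = full_to_limited(0.0)  # 16/255, noticeably above zero
white = full_to_limited(1.0)  # 235/255, noticeably below one
```

A mapping like this lifts values near 0, which would match the description of dark areas being "not as dark as they should be". Forcing in_range=full:out_range=full asks the scale filter to leave the range untouched.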

Reproducible example:

  • Create a 16-bit TIFF image (used as reference):

     ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1:duration=1 -pix_fmt gray16le in.tif
    
  • Convert the TIFF to float (big endian):

     ffmpeg -y -src_range 1 -i in.tif -pix_fmt grayf32be -dst_range 1 -f rawvideo in.raw
    
  • Convert from raw to OpenEXR format:

     ffmpeg -y -f rawvideo -pix_fmt grayf32be -src_range 1 -s 192x108 -i in.raw -vf "scale=in_range=full:out_range=full,format=rgb48le" -vcodec exr -compression zip1 -pix_fmt gbrpf32le -dst_range 1 out.exr
    

Python code for comparing the differences:

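import cv2
import numpy as np

# Note: some OpenCV builds only read EXR when OPENCV_IO_ENABLE_OPENEXR=1 is set in the environment.
img1 = cv2.imread('in.tif', cv2.IMREAD_UNCHANGED)  # 16-bit reference image
img2 = cv2.imread('out.exr', cv2.IMREAD_UNCHANGED)  # float image, channels in BGR order

green_ch = img2[:, :, 1]  # Green channel

# Scale the float data back to 16-bit levels before comparing
max_abs_diff = np.max(np.abs(green_ch*65535 - img1.astype(float)))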

The maximum difference is 3 (out of 65535 levels).
We may have to play a bit with the filter arguments...


Since there seem to be issues with FFmpeg's color conversion and range conversion, there is a chance that you are not going to get the desired results until the issues are fixed.


Update:

Looks like it's working when the pixel format of the input is gbrpf32be (three color channels planar format).

Testing:

  • Create a 16-bit TIFF image (used as reference):

     ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1:duration=1 -pix_fmt gray16le in.tif
    
  • Convert the TIFF to float (big endian):

     ffmpeg -y -src_range 1 -i in.tif -pix_fmt grayf32be -dst_range 1 -f rawvideo in.raw
    
  • Duplicate the "Grayscale plane" three times for getting 3 identical color planes (using "concat protocol" for avoiding any color conversion issues):

     ffmpeg -y -f rawvideo -pix_fmt grayf32be -s 192x108 -i "concat:in.raw|in.raw|in.raw" -f rawvideo in3.raw
    
  • Convert from 3 color channels raw to OpenEXR format:

     ffmpeg -y -f rawvideo -pix_fmt gbrpf32be -s 192x108 -i in3.raw -vcodec exr -compression zip1 -pix_fmt gbrpf32le out.exr
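For a single frame, the duplication step in the list above amounts to writing the same grayscale plane three times in a row, which a planar reader then sees as identical G, B and R planes. A minimal NumPy sketch, assuming one frame of big-endian float32 data (`triple_plane` is a hypothetical helper):

```python
import numpy as np

def triple_plane(gray_frame):
    """Repeat one raw grayscale plane three times to form a planar 3-channel frame."""
    return gray_frame * 3  # bytes repetition: plane + plane + plane

plane = np.array([[0.0, 0.5], [1.0, 0.25]], dtype=">f4")  # tiny 2x2 stand-in for in.raw
frame = triple_plane(plane.tobytes())

# Read the result back as three consecutive planes, as a planar format lays them out.
g, b, r = np.frombuffer(frame, dtype=">f4").reshape(3, 2, 2)
```

Because no pixel value is touched, this route sidesteps any color or range conversion, which is the point of using the concat protocol in the commands above.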
    

Python code for comparing the differences (compare 3 color channels):

import cv2
import numpy as np

img1 = cv2.imread('in.tif', cv2.IMREAD_UNCHANGED)  # 16-bit reference image
img2 = cv2.imread('out.exr', cv2.IMREAD_UNCHANGED)  # float image, channels in BGR order

blue_ch = img2[:, :, 0]  # Blue channel
green_ch = img2[:, :, 1]  # Green channel
red_ch = img2[:, :, 2]  # Red channel

# Scale each float channel back to 16-bit levels before comparing
max_red_abs_diff = np.max(np.abs(red_ch*65535 - img1.astype(float)))
max_green_abs_diff = np.max(np.abs(green_ch*65535 - img1.astype(float)))
max_blue_abs_diff = np.max(np.abs(blue_ch*65535 - img1.astype(float)))

The maximum difference is 0.001953125 (negligible).
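The reported residual equals 2^-9, which is the spacing between adjacent float32 values in the range [16384, 32768). That is consistent with plain float32 rounding rather than a conversion error, although this interpretation is not confirmed by the measurements above. A small sketch that round-trips every 16-bit level through float32 (all variable names are illustrative):

```python
import numpy as np

codes = np.arange(65536, dtype=np.uint16)       # every 16-bit level
as_float = (codes / 65535).astype(np.float32)   # normalized float32, as stored in the EXR
restored = as_float * np.float32(65535)         # scaled back, still float32

# Largest round-trip error, measured in 16-bit levels
max_err = float(np.max(np.abs(restored - codes.astype(np.float64))))

# Distance between neighbouring float32 values around 20000
spacing = float(np.spacing(np.float32(20000.0)))
```

The round-trip error stays far below one 16-bit level, so differences of this size can be treated as numerical noise.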