FFMPEG changes pixel values when reading and saving png without modification


This is a toy problem that came out of my trying to track down a bug in a video pipeline I'm working on. The idea is to take a frame from a YUV420 video, modify it as an RGB24 image, and reinsert it. To do this I convert YUV420 -> YUV444 -> RGB -> YUV444 -> YUV420. Doing this without any modification should give back the same frame; however, I noticed slight color shifts.

I tried to isolate the problem using a toy 3x3 RGB32 PNG image. The function read_and_save_image reads the image and then saves it as a new file, returning the pixel array it read. I run this function three times in succession, using the output of the previous run as the input of the next. This demonstrates a perplexing fact: passing an image through the function once produces an image with different pixel values, but passing it through a second time changes nothing further. Perhaps more confusing, the pixel arrays returned by the function are identical on every pass.

TL;DR: How can I load the toy image below with ffmpeg and save it as a new file such that the pixel values of the new and original files are identical?

Here is the original image followed by the results of one and two passes through the function. Note that the pixel values displayed when reading these images with Preview have changed ever so slightly. This becomes noticeable within a video.

Test image (very small) -> 3x3 test image file <-

Here are the pixel values read (note that after being loaded and saved there is a change):

original test image

test image after one pass

test image after two passes

Edit: here is an RGB24 frame extracted from a video I am using to test my pipeline. I had the same issue with pixel values changing after loading and saving with ffmpeg.

frame from video I was testing pipeline on

Here is a screenshot showing how the image is noticeably darker after ffmpeg. Same pixels on the top right corner of the image.

zoomed in top right corner

Here is the code of the toy problem:

import os
import ffmpeg
import numpy as np


def read_and_save_image(in_file, out_file, width, height, pix_fmt='rgb32'):
    input_data, _ = (
        ffmpeg
        .input(in_file)
        .output('pipe:', format='rawvideo', pix_fmt=pix_fmt)
        .run(capture_stdout=True)
    )
  
    frame = np.frombuffer(input_data, np.uint8)
    print(in_file,'\n', frame.reshape((height,width,-1)))
    
    save_data = (
        ffmpeg
            .input('pipe:', format='rawvideo', pix_fmt=pix_fmt, s='{}x{}'.format(width, height))
            .output(out_file, pix_fmt=pix_fmt)
            .overwrite_output()
            .run_async(pipe_stdin=True)
    )
    
    

    save_data.stdin.write(frame.tobytes())
    save_data.stdin.close()
    #save_data.wait()

    return frame

try:
    test_img = "test_image.png"
    test_img_1 = "test_image_1.png"
    test_img_2 = "test_image_2.png"
    test_img_3 = "test_image_3.png"

    width, height, pix_fmt = 3,3,'rgb32'
    #width, height, pix_fmt = video_stream['width'], video_stream['height'],  'rgb24'
    test_img_pxls = read_and_save_image(test_img,test_img_1, width, height, pix_fmt)
    test_img_1_pxls = read_and_save_image(test_img_1,test_img_2, width, height, pix_fmt)
    test_img_2_pxls = read_and_save_image(test_img_2,test_img_3, width, height, pix_fmt)

    print(np.array_equiv(test_img_pxls, test_img_1_pxls))
    print(np.array_equiv(test_img_1_pxls, test_img_2_pxls))

except ffmpeg.Error as e:
    print('stdout:', e.stdout.decode('utf8'))
    print('stderr:', e.stderr.decode('utf8'))
    raise e


!mediainfo --Output=JSON --Full $test_img
!mediainfo --Output=JSON --Full $test_img_1
!mediainfo --Output=JSON --Full $test_img_2

Here is the console output of the program that shows that the pixel arrays read by ffmpeg are the same despite the images being different.

test_image.png 
 [[[253 218 249 255]
  [252 213 248 255]
  [251 200 244 255]]

 [[253 227 250 255]
  [249 209 236 255]
  [243 169 206 255]]

 [[253 235 251 255]
  [245 195 211 255]
  [226 103 125 255]]]
test_image_1.png 
 [[[253 218 249 255]
  [252 213 248 255]
  [251 200 244 255]]

 [[253 227 250 255]
  [249 209 236 255]
  [243 169 206 255]]

 [[253 235 251 255]
  [245 195 211 255]
  [226 103 125 255]]]
test_image_2.png 
 [[[253 218 249 255]
  [252 213 248 255]
  [251 200 244 255]]

 [[253 227 250 255]
  [249 209 236 255]
  [243 169 206 255]]

 [[253 235 251 255]
  [245 195 211 255]
  [226 103 125 255]]]
True
True
{
"media": {
"@ref": "test_image.png",
"track": [
{
"@type": "General",
"ImageCount": "1",
"FileExtension": "png",
"Format": "PNG",
"FileSize": "4105",
"StreamSize": "0",
"File_Modified_Date": "UTC 2023-01-19 13:49:00",
"File_Modified_Date_Local": "2023-01-19 13:49:00"
},
{
"@type": "Image",
"Format": "PNG",
"Format_Compression": "LZ77",
"Width": "3",
"Height": "3",
"BitDepth": "32",
"Compression_Mode": "Lossless",
"StreamSize": "4105"
}
]
}
}

{
"media": {
"@ref": "test_image_1.png",
"track": [
{
"@type": "General",
"ImageCount": "1",
"FileExtension": "png",
"Format": "PNG",
"FileSize": "128",
"StreamSize": "0",
"File_Modified_Date": "UTC 2023-01-24 15:31:58",
"File_Modified_Date_Local": "2023-01-24 15:31:58"
},
{
"@type": "Image",
"Format": "PNG",
"Format_Compression": "LZ77",
"Width": "3",
"Height": "3",
"BitDepth": "32",
"Compression_Mode": "Lossless",
"StreamSize": "128"
}
]
}
}

{
"media": {
"@ref": "test_image_2.png",
"track": [
{
"@type": "General",
"ImageCount": "1",
"FileExtension": "png",
"Format": "PNG",
"FileSize": "128",
"StreamSize": "0",
"File_Modified_Date": "UTC 2023-01-24 15:31:59",
"File_Modified_Date_Local": "2023-01-24 15:31:59"
},
{
"@type": "Image",
"Format": "PNG",
"Format_Compression": "LZ77",
"Width": "3",
"Height": "3",
"BitDepth": "32",
"Compression_Mode": "Lossless",
"StreamSize": "128"
}
]
}
}


FFmpeg prints the warning "Incompatible pixel format 'bgra' for codec 'png', auto-selecting format 'rgba'".
It means that even though we ask FFmpeg to save the PNG image in the rgb32 pixel format, FFmpeg ignores the request and saves it as rgba.


It's not well documented, but the rgb32 byte order is b,g,r,a,b,g,r,a... on little-endian machines, which is why the warning mentions 'bgra' (here a = 255).
The PNG encoder uses rgba order: r,g,b,a,r,g,b,a...

Even when the PNG pixel format is set to rgb32, the data is still stored in the rgba pixel format (r color channel first).
When the PNG is then read back in file order (for example with PIL), the red and blue channels appear swapped relative to the rgb32 buffer.
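To make the two byte orders concrete, here is a minimal sketch in plain Python. The helper name swap_red_blue is made up for this illustration, and the sample bytes are the bottom-right pixel from the question's console output, assuming a little-endian machine where rgb32 is laid out as b,g,r,a.

```python
def swap_red_blue(buf: bytes) -> bytes:
    """Swap bytes 0 and 2 of every 4-byte pixel (b,g,r,a <-> r,g,b,a)."""
    if len(buf) % 4:
        raise ValueError("expected 4 bytes per pixel")
    out = bytearray(buf)
    out[0::4] = buf[2::4]  # first byte of each pixel takes the third byte
    out[2::4] = buf[0::4]  # third byte of each pixel takes the first byte
    return bytes(out)


# Bottom-right pixel of the test image as FFmpeg emits it for pix_fmt='rgb32'
# (assumed little-endian layout: b, g, r, a).
rgb32_bytes = bytes([226, 103, 125, 255])
rgba_bytes = swap_red_blue(rgb32_bytes)
print(list(rgba_bytes))  # [125, 103, 226, 255]
```

With a NumPy array of shape (height, width, 4), the same reordering can be written as frame[..., [2, 1, 0, 3]].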


Instead of using rgb32, we may use bgr32 pixel format.
The data order of bgr32 is the same as rgba.

I recommend using the rgba or rgb24 pixel formats, because they are native to the PNG image format.
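As a rough sketch of that recommendation, the round trip can be done with rgba on both sides. This version calls the ffmpeg command line through subprocess rather than ffmpeg-python, and the helper names (decode_args, encode_args, round_trip) are made up for the example. It assumes ffmpeg is on the PATH.

```python
import subprocess


def decode_args(in_file, pix_fmt="rgba"):
    """ffmpeg arguments that decode an image to raw pixels on stdout."""
    return ["ffmpeg", "-v", "error", "-i", in_file,
            "-f", "rawvideo", "-pix_fmt", pix_fmt, "pipe:"]


def encode_args(out_file, width, height, pix_fmt="rgba"):
    """ffmpeg arguments that encode raw pixels from stdin into out_file."""
    return ["ffmpeg", "-v", "error", "-y",
            "-f", "rawvideo", "-pixel_format", pix_fmt,
            "-s", f"{width}x{height}", "-i", "pipe:",
            "-pix_fmt", pix_fmt, out_file]


def round_trip(in_file, out_file, width, height, pix_fmt="rgba"):
    """Decode in_file to raw pixels, re-encode them to out_file, return the bytes."""
    raw = subprocess.run(decode_args(in_file, pix_fmt),
                         check=True, capture_output=True).stdout
    # subprocess.run waits for ffmpeg to exit, so out_file is complete on return.
    subprocess.run(encode_args(out_file, width, height, pix_fmt),
                   input=raw, check=True)
    return raw
```

Calling round_trip("test_image.png", "test_image_1.png", 3, 3) would then return the 36 raw bytes in r,g,b,a order, which is also the order stored in the PNG.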


For testing I used the following code sample, which compares various combinations of rgba, bgr32 and rgb32:

import os
import ffmpeg
import numpy as np
from PIL import Image


def read_and_save_image(in_file, out_file, width, height, pix_fmt='rgb32'):
    input_data, _ = (
        ffmpeg
        .input(in_file)
        .output('pipe:', format='rawvideo', pix_fmt=pix_fmt)
        .run(capture_stdout=True)
    )
  
    frame = np.frombuffer(input_data, np.uint8)
    print(in_file,'\n', frame.reshape((height,width,-1)))
    
    save_data = (
        ffmpeg
            .input('pipe:', format='rawvideo', pix_fmt=pix_fmt, s='{}x{}'.format(width, height))
            .output(out_file, pix_fmt=pix_fmt)
            .overwrite_output()
            .run_async(pipe_stdin=True)
    )
    
    save_data.stdin.write(frame.tobytes())
    save_data.stdin.close()
    save_data.wait()  # Wait for FFmpeg to finish writing out_file before it is read back.

    return frame.reshape((height, width, -1))



img = np.array([[[253, 218, 249, 255],
  [252, 213, 248, 255],
  [251, 200, 244, 255]],
 [[253, 227, 250, 255],
  [249, 209, 236, 255],
  [243, 169, 206, 255]],
 [[253, 235, 251, 255],
  [245, 195, 211, 255],
  [226, 103, 125, 255]]], np.uint8)

test_img = "test_image.png"
test_img_1 = "test_image_1.png"

# Save img to binary file (file size is 3*3*4 = 36 bytes).
with open("test_image.bin", 'wb') as f:
    img.tofile(f)

#ffmpeg.input('test_image.bin', format='rawvideo', pix_fmt='rgba', s='3x3').output(test_img, pix_fmt='rgba').overwrite_output().run()
ffmpeg.input('test_image.bin', format='rawvideo', pix_fmt='rgba', s='3x3').output("rgba.bin", pix_fmt='rgba', format='rawvideo').overwrite_output().run()  # rgba
ffmpeg.input('test_image.bin', format='rawvideo', pix_fmt='rgba', s='3x3').output("rgb32.bin", pix_fmt='rgb32', format='rawvideo').overwrite_output().run() # rgb32
ffmpeg.input('test_image.bin', format='rawvideo', pix_fmt='rgba', s='3x3').output("bgr32.bin", pix_fmt='bgr32', format='rawvideo').overwrite_output().run() # bgr32
ffmpeg.input('test_image.bin', format='rawvideo', pix_fmt='rgba', s='3x3').output(test_img, pix_fmt='rgba').overwrite_output().run() # PNG in rgba pixel format

png_img = np.array(Image.open(test_img)) # Read the input PNG image (use PIL for keeping the order r,g,b,a,r,g,b,a...)

rgba = np.fromfile("rgba.bin", dtype=np.uint8).reshape(3, 3, 4)
rgb32 = np.fromfile("rgb32.bin", dtype=np.uint8).reshape(3, 3, 4)
bgr32 = np.fromfile("bgr32.bin", dtype=np.uint8).reshape(3, 3, 4)

print(np.array_equiv(img, png_img))  # True
print(np.array_equiv(img, bgr32))  # True
print(np.array_equiv(img, rgb32))  # False

width, height, pix_fmt = 3, 3, 'bgr32' #'rgb32'
test_img_pxls = read_and_save_image(test_img, test_img_1, width, height, pix_fmt)

print(np.array_equiv(img, test_img_pxls))  # True
#print(np.array_equiv(test_img_pxls, test_img_1_pxls))
#print(np.array_equiv(test_img_1_pxls, test_img_2_pxls))

png_img1 = np.array(Image.open(test_img_1)) # Read test_img_1 PNG image (use PIL for keeping the order r,g,b,a,r,g,b,a...)
print(np.array_equiv(img, png_img1))  # True

ffmpeg.input('test_image.bin', format='rawvideo', pix_fmt='rgb32', s='3x3').output("test_image_rgb32.png", pix_fmt='rgb32').overwrite_output().run() # PNG in rgb32 pixel format
png_img_rgb32 = np.array(Image.open("test_image_rgb32.png")) # Read the input PNG image that stored as rgb32
print(np.array_equiv(img, png_img_rgb32))  # False

Update:

Regardless of the rgb32 issue, the differences come from the embedded color profile of the original test image (test_image.png).

The embedded color profile (and other Exif data) also explains why the PNG file is 4105 bytes when the image data is much smaller.

  • When opening the original image in an image viewer that doesn't respect the color profile (like paint.net on Windows), the RGB values of the bottom right pixel are [125, 103, 226].
  • When opening the original image in an image viewer that respects the color profile (like GIMP on Windows), the values are converted to [129, 102, 234].
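One way to check for an embedded profile without an image viewer is to list the PNG chunks. The sketch below uses only the standard library; the function names are made up for the example. It assumes the profile is stored in the standard color-related PNG chunks (iCCP for an ICC profile, plus sRGB, gAMA and cHRM).

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
COLOR_CHUNKS = {"iCCP", "sRGB", "gAMA", "cHRM"}


def list_png_chunks(data: bytes):
    """Return (type, data_length) for every chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = []
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        chunks.append((ctype.decode("ascii"), length))
        pos += 12 + length
    return chunks


def color_chunk_names(data: bytes):
    """Return the names of color-related chunks present in the PNG."""
    return [name for name, _ in list_png_chunks(data) if name in COLOR_CHUNKS]


def color_chunks_in_file(path):
    """Convenience wrapper that reads the PNG from disk."""
    with open(path, "rb") as f:
        return color_chunk_names(f.read())
```

If the explanation above is right, color_chunks_in_file("test_image.png") should report at least one color chunk for the original file. The same call on a file written by FFmpeg should show no iCCP chunk.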

The following FFmpeg command converts test_image.png to raw format:

ffmpeg -i test_image.png -pix_fmt rgba -f rawvideo test_image.bin

The FFmpeg conversion ignores the color profile (it does not apply the color conversion that GIMP applies) and also strips the profile from the output.

Note:
The above command is equivalent to the following Python code:
input_data, _ = ffmpeg.input(in_file).output('pipe:', format='rawvideo', pix_fmt=pix_fmt).run(capture_stdout=True)


The following command converts test_image.bin back to PNG:

ffmpeg -f rawvideo -s 3x3 -pixel_format rgba -i test_image.bin test_image_converted.png

The new PNG image has no embedded color profile.

That explains why the pixel values stop changing after the first iteration (test_image_1.png = test_image_2.png = test_image_3.png).


When viewing the pixel values of test_image_converted.png on Windows, the values match those of the NumPy array in Python.
(An image with no color profile is assumed to be sRGB, and the nominal Windows profile is sRGB.)

When viewing the pixel values on a Mac, the values may still look different (depending on the image viewer). On a Mac the default color profile may not be sRGB, and the image viewer may still apply a color profile conversion.