This is a toy problem that came out of my trying to identify a bug in a video pipeline I'm working on. The idea is that I want to take a frame from a YUV420 video, modify it as an RGB24 image, and reinsert it. To do this I convert YUV420 -> YUV444 -> RGB -> YUV444 -> YUV420. Doing this without any modification should result in the same frame; however, I noticed slight color transformations.
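As an aside, even the RGB <-> YUV444 leg of that chain is not exactly lossless in 8 bits, because each direction rounds to integers. Below is a toy numpy sketch of that effect (not the actual pipeline code; it assumes full-range BT.601 coefficients, which may differ from what FFmpeg uses internally):

```python
import numpy as np

# Toy illustration: 8-bit RGB -> YUV444 -> RGB round trip with
# full-range BT.601 coefficients (an assumption, not necessarily
# the matrix the real pipeline uses).
def rgb_to_yuv(rgb):
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y) + 128.0
    cr = 0.713 * (r - y) + 128.0
    # Rounding to uint8 here is where precision is lost.
    return np.rint(np.stack([y, cb, cr], axis=-1)).clip(0, 255).astype(np.uint8)

def yuv_to_rgb(yuv):
    y = yuv[..., 0].astype(np.float64)
    cb = yuv[..., 1].astype(np.float64) - 128.0
    cr = yuv[..., 2].astype(np.float64) - 128.0
    r = y + 1.403 * cr
    g = y - 0.344 * cb - 0.714 * cr
    b = y + 1.773 * cb
    return np.rint(np.stack([r, g, b], axis=-1)).clip(0, 255).astype(np.uint8)

rgb = np.array([[[1, 0, 0]]], dtype=np.uint8)
back = yuv_to_rgb(rgb_to_yuv(rgb))
print(rgb.ravel(), '->', back.ravel())  # [1 0 0] -> [0 0 0]
```

So small per-pixel drift can come from the conversions themselves, independent of the file-format issue investigated below.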
I tried to isolate the problem using a toy 3x3 RGB32 PNG image. The function read_and_save_image reads the image and then saves it as a new file, returning the read pixel array. I run this function three times in succession, using the output of each run as the input of the next, to demonstrate a perplexing fact: while passing an image through the function once changes the resulting image's pixel values, passing it through a second time changes nothing. Perhaps more confusing, the pixel arrays returned by the function are all identical.
tldr; How can I load and save the toy image below using ffmpeg as a new file such that the pixel values of the new and original files are identical?
Here is the original image, followed by the result of one and two passes through the function. Note that the pixel values displayed when reading these images with Preview have changed ever so slightly. This becomes noticeable within a video.
Here are the pixel values read (note that after being loaded and saved there is a change):
Edit: here is an RGB24 frame extracted from a video I am using to test my pipeline. I had the same issue with pixel values changing after loading and saving with ffmpeg.
frame from video I was testing pipeline on
Here is a screenshot showing that the image is noticeably darker after passing through ffmpeg; both crops show the same pixels in the top-right corner of the image.
Here is the code of the toy problem:
import os
import ffmpeg
import numpy as np


def read_and_save_image(in_file, out_file, width, height, pix_fmt='rgb32'):
    # Decode the input image to raw pixels.
    input_data, _ = (
        ffmpeg
        .input(in_file)
        .output('pipe:', format='rawvideo', pix_fmt=pix_fmt)
        .run(capture_stdout=True)
    )
    frame = np.frombuffer(input_data, np.uint8)
    print(in_file, '\n', frame.reshape((height, width, -1)))

    # Re-encode the raw pixels as a new image file.
    save_data = (
        ffmpeg
        .input('pipe:', format='rawvideo', pix_fmt=pix_fmt, s='{}x{}'.format(width, height))
        .output(out_file, pix_fmt=pix_fmt)
        .overwrite_output()
        .run_async(pipe_stdin=True)
    )
    save_data.stdin.write(frame.tobytes())
    save_data.stdin.close()
    save_data.wait()  # wait for the encoder to finish before the file is read back
    return frame


try:
    test_img = "test_image.png"
    test_img_1 = "test_image_1.png"
    test_img_2 = "test_image_2.png"
    test_img_3 = "test_image_3.png"
    width, height, pix_fmt = 3, 3, 'rgb32'
    # width, height, pix_fmt = video_stream['width'], video_stream['height'], 'rgb24'
    test_img_pxls = read_and_save_image(test_img, test_img_1, width, height, pix_fmt)
    test_img_1_pxls = read_and_save_image(test_img_1, test_img_2, width, height, pix_fmt)
    test_img_2_pxls = read_and_save_image(test_img_2, test_img_3, width, height, pix_fmt)
    print(np.array_equiv(test_img_pxls, test_img_1_pxls))
    print(np.array_equiv(test_img_1_pxls, test_img_2_pxls))
except ffmpeg.Error as e:
    print('stdout:', e.stdout.decode('utf8'))
    print('stderr:', e.stderr.decode('utf8'))
    raise

!mediainfo --Output=JSON --Full $test_img
!mediainfo --Output=JSON --Full $test_img_1
!mediainfo --Output=JSON --Full $test_img_2
Here is the console output of the program, showing that the pixel arrays read by ffmpeg are the same even though the image files differ.
test_image.png
[[[253 218 249 255]
[252 213 248 255]
[251 200 244 255]]
[[253 227 250 255]
[249 209 236 255]
[243 169 206 255]]
[[253 235 251 255]
[245 195 211 255]
[226 103 125 255]]]
test_image_1.png
[[[253 218 249 255]
[252 213 248 255]
[251 200 244 255]]
[[253 227 250 255]
[249 209 236 255]
[243 169 206 255]]
[[253 235 251 255]
[245 195 211 255]
[226 103 125 255]]]
test_image_2.png
[[[253 218 249 255]
[252 213 248 255]
[251 200 244 255]]
[[253 227 250 255]
[249 209 236 255]
[243 169 206 255]]
[[253 235 251 255]
[245 195 211 255]
[226 103 125 255]]]
True
True
{
"media": {
"@ref": "test_image.png",
"track": [
{
"@type": "General",
"ImageCount": "1",
"FileExtension": "png",
"Format": "PNG",
"FileSize": "4105",
"StreamSize": "0",
"File_Modified_Date": "UTC 2023-01-19 13:49:00",
"File_Modified_Date_Local": "2023-01-19 13:49:00"
},
{
"@type": "Image",
"Format": "PNG",
"Format_Compression": "LZ77",
"Width": "3",
"Height": "3",
"BitDepth": "32",
"Compression_Mode": "Lossless",
"StreamSize": "4105"
}
]
}
}
{
"media": {
"@ref": "test_image_1.png",
"track": [
{
"@type": "General",
"ImageCount": "1",
"FileExtension": "png",
"Format": "PNG",
"FileSize": "128",
"StreamSize": "0",
"File_Modified_Date": "UTC 2023-01-24 15:31:58",
"File_Modified_Date_Local": "2023-01-24 15:31:58"
},
{
"@type": "Image",
"Format": "PNG",
"Format_Compression": "LZ77",
"Width": "3",
"Height": "3",
"BitDepth": "32",
"Compression_Mode": "Lossless",
"StreamSize": "128"
}
]
}
}
{
"media": {
"@ref": "test_image_2.png",
"track": [
{
"@type": "General",
"ImageCount": "1",
"FileExtension": "png",
"Format": "PNG",
"FileSize": "128",
"StreamSize": "0",
"File_Modified_Date": "UTC 2023-01-24 15:31:59",
"File_Modified_Date_Local": "2023-01-24 15:31:59"
},
{
"@type": "Image",
"Format": "PNG",
"Format_Compression": "LZ77",
"Width": "3",
"Height": "3",
"BitDepth": "32",
"Compression_Mode": "Lossless",
"StreamSize": "128"
}
]
}
}
FFmpeg prints a warning message:

"Incompatible pixel format 'bgra' for codec 'png', auto-selecting format 'rgba'"

It means that even though we ask FFmpeg to save the PNG image in the rgb32 pixel format, FFmpeg ignores the request and saves it as rgba.

It's not well documented, but the rgb32 data order is b, g, r, a, b, g, r, a ... (when a = 255). The PNG pixel order is rgba: r, g, b, a, r, g, b, a ...

Even when setting the PNG pixel format to rgb32, the data is still stored in the rgba pixel format (r color channel first). When reading the PNG image back as rgb32, the red and the blue channels are therefore swapped.
Instead of using rgb32, we may use the bgr32 pixel format; the data order of bgr32 is the same as rgba.

I recommend using the rgba or rgb24 pixel formats, because they are native to the PNG image format.

For testing I used a code sample that compares all combinations of rgba, bgr32 and rgb32.
Update:
Regardless of the rgb32 issue, the differences are related to the embedded color profile of the original test image (test_image.png).

The embedded color profile (and other Exif data) also explains why the PNG file size is 4105 bytes when the image data is much smaller.

Take the bottom-right pixel as an example: without color-profile conversion its RGB value is [125, 103, 226]; after the embedded profile is applied (as viewers such as GIMP or Preview do) it is displayed as [129, 102, 234].
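For scale, a quick back-of-the-envelope check of the pixel payload versus the file sizes reported by mediainfo:

```python
# Raw pixel payload of the 3x3 RGBA test image.
width, height, bytes_per_pixel = 3, 3, 4
raw_bytes = width * height * bytes_per_pixel
print(raw_bytes)  # 36

# The stripped PNG (128 bytes) is mostly container overhead
# (signature plus IHDR/IDAT/IEND chunks), while the original
# 4105-byte file carries ~4 KB of color-profile and metadata chunks.
```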
The following FFmpeg command converts test_image.png to raw format:

ffmpeg -i test_image.png -pix_fmt rgba -f rawvideo test_image.bin
FFmpeg's conversion ignores the color profile (it does not apply the color conversion the way GIMP does), and it also removes the color profile from the output.
Note:
The above command is equivalent to the following Python code:
input_data, _ = ffmpeg.input(in_file).output('pipe:', format='rawvideo', pix_fmt=pix_fmt).run(capture_stdout=True)
The following command converts test_image.bin back to PNG:

ffmpeg -f rawvideo -s 3x3 -pixel_format rgba -i test_image.bin test_image_converted.png
The new PNG image is without embedded color profile.
That explains why, after the first iteration, the pixel values stop changing (test_image_1.png = test_image_2.png = test_image_3.png).
When viewing the pixel values of test_image_converted.png on Windows, the values are the same as those of the NumPy array in Python. (An image with no color profile is assumed to comply with the sRGB profile, and the Windows nominal profile is sRGB.)

When viewing the pixel values on a Mac, the values may still look different (depending on the image viewer). On a Mac the default color profile may not be sRGB, and the image viewer software may still apply a color-profile conversion.