I'm trying to stream a video from webcam to the local computer. The stream has resolution of 3840x2160 and 30fps. Computer I'm using is Mac Pro. However when I run it with next command:
ffmpeg -f avfoundation -framerate 30 -video_size 3840x2160 -pix_fmt nv12 -probesize "50M" -i "0" -pix_fmt nv12 -preset ultrafast -vcodec libx264 -tune zerolatency -f mpegts udp://192.168.1.5:5100/mystream
it has a latency of 3-4 seconds. This problem is not present in Chromium, when using MediaStream API stream is displayed in realtime.
I believe that's because Chromium has "dmb1" four char code supported:
+ (media::VideoPixelFormat)FourCCToChromiumPixelFormat:(FourCharCode)code {
switch (code) {
case kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange:
return media::PIXEL_FORMAT_NV12; // Mac fourcc: "420v".
case kCVPixelFormatType_422YpCbCr8:
return media::PIXEL_FORMAT_UYVY; // Mac fourcc: "2vuy".
case kCMPixelFormat_422YpCbCr8_yuvs:
return media::PIXEL_FORMAT_YUY2;
case kCMVideoCodecType_JPEG_OpenDML:
return media::PIXEL_FORMAT_MJPEG; // Mac fourcc: "dmb1".
default:
return media::PIXEL_FORMAT_UNKNOWN;
}
}
To set pixel format Chromium is using next piece of code:
NSDictionary* videoSettingsDictionary = @{
(id)kCVPixelBufferWidthKey : @(width),
(id)kCVPixelBufferHeightKey : @(height),
(id)kCVPixelBufferPixelFormatTypeKey : @(best_fourcc),
AVVideoScalingModeKey : AVVideoScalingModeResizeAspectFill
};
[_captureVideoDataOutput setVideoSettings:videoSettingsDictionary];
I tried doing the same thing in ffmpeg by changing avfoundation.m file. First I added new pixel format AV_PIX_FMT_MJPEG:
static const struct AVFPixelFormatSpec avf_pixel_formats[] = {
{ AV_PIX_FMT_MONOBLACK, kCVPixelFormatType_1Monochrome },
{ AV_PIX_FMT_RGB555BE, kCVPixelFormatType_16BE555 },
{ AV_PIX_FMT_RGB555LE, kCVPixelFormatType_16LE555 },
{ AV_PIX_FMT_RGB565BE, kCVPixelFormatType_16BE565 },
{ AV_PIX_FMT_RGB565LE, kCVPixelFormatType_16LE565 },
{ AV_PIX_FMT_RGB24, kCVPixelFormatType_24RGB },
{ AV_PIX_FMT_BGR24, kCVPixelFormatType_24BGR },
{ AV_PIX_FMT_0RGB, kCVPixelFormatType_32ARGB },
{ AV_PIX_FMT_BGR0, kCVPixelFormatType_32BGRA },
{ AV_PIX_FMT_0BGR, kCVPixelFormatType_32ABGR },
{ AV_PIX_FMT_RGB0, kCVPixelFormatType_32RGBA },
{ AV_PIX_FMT_BGR48BE, kCVPixelFormatType_48RGB },
{ AV_PIX_FMT_UYVY422, kCVPixelFormatType_422YpCbCr8 },
{ AV_PIX_FMT_YUVA444P, kCVPixelFormatType_4444YpCbCrA8R },
{ AV_PIX_FMT_YUVA444P16LE, kCVPixelFormatType_4444AYpCbCr16 },
{ AV_PIX_FMT_YUV444P, kCVPixelFormatType_444YpCbCr8 },
{ AV_PIX_FMT_YUV422P16, kCVPixelFormatType_422YpCbCr16 },
{ AV_PIX_FMT_YUV422P10, kCVPixelFormatType_422YpCbCr10 },
{ AV_PIX_FMT_YUV444P10, kCVPixelFormatType_444YpCbCr10 },
{ AV_PIX_FMT_YUV420P, kCVPixelFormatType_420YpCbCr8Planar },
{ AV_PIX_FMT_NV12, kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange },
{ AV_PIX_FMT_YUYV422, kCVPixelFormatType_422YpCbCr8_yuvs },
{ AV_PIX_FMT_MJPEG, kCMVideoCodecType_JPEG_OpenDML }, //dmb1
#if !TARGET_OS_IPHONE && __MAC_OS_X_VERSION_MIN_REQUIRED >= 1080
{ AV_PIX_FMT_GRAY8, kCVPixelFormatType_OneComponent8 },
#endif
{ AV_PIX_FMT_NONE, 0 }
};
After that I tried to hardcode it:
pxl_fmt_spec = avf_pixel_formats[22];
ctx->pixel_format = pxl_fmt_spec.ff_id;
pixel_format = [NSNumber numberWithUnsignedInt:pxl_fmt_spec.avf_id];
capture_dict = [NSDictionary dictionaryWithObject:pixel_format
forKey:(id)kCVPixelBufferPixelFormatTypeKey];
[ctx->video_output setVideoSettings:capture_dict];
Code compiles and builds successfully, but when I run it with above command, without -pix_fmt specified, program enters infinite loop in get_video_config function:
while (ctx->frames_captured < 1) {
CFRunLoopRunInMode(kCFRunLoopDefaultMode, 0.1, YES);
}
It looks obvious that ffmpeg is not able to load first frame. My camera is more than capable of supporting this pixel and stream formats. I proved it with this piece of code which comes after ffmpeg selects which format to use for specified width, height and fps:
FourCharCode fcc = CMFormatDescriptionGetMediaSubType([selected_format formatDescription]);
char fcc_string[5] = { 0, 0, 0, 0, '\0'};
fcc_string[0] = (char) (fcc >> 24);
fcc_string[1] = (char) (fcc >> 16);
fcc_string[2] = (char) (fcc >> 8);
fcc_string[3] = (char) fcc;
av_log(s, AV_LOG_ERROR, "Selected format: %s\n", fcc_string);
Above code prints "Selected format: dmb1".
Can someone tell me why ffmpeg can't load first frame and how to add new pixel format in this library?
Also, any suggestion on how to resolve input latency of 3 seconds in some other way is more than welcome.
EDIT:
If you try setting any other pixel format in Chromium other then MJPEG there is a latency of 2 seconds. When I say "setting" I mean changing Chromium source code and recompiling it. I am pretty sure that the problem is in pixel format, because camera is sending dmb1 and ffmpeg doesn't know about that format.
Also latency is only present on MacOS.