Python Handling H264 Frames for Live Stream from Eufy Server


I am currently using the Eufy Security WebSocket Server, a wrapper around the eufy-security-client library that exposes it over a WebSocket interface. I have developed a Python version of the client, in which I attempt to display the live stream using the device.start_livestream command, as outlined here.

In short, the web server continuously returns buffers of video frames in H264 format, which I read in my Python script and attempt to render in a GUI. However, a significant number of frames are missed. This is potentially due to inter-frame compression, where consecutive frames are compressed relative to each other.

The solution probably lies in implementing buffering or packet reassembly logic to ensure that complete frames are processed. Despite many attempts and exploring various approaches, I couldn't figure it out.
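For illustration, the reassembly idea could be sketched as a small stateful splitter (a hypothetical helper, not part of my code) that accumulates raw Annex-B bytes and only releases a NAL unit once the next start code proves it is complete:

```python
class AnnexBSplitter:
    """Accumulates raw H.264 Annex-B bytes and yields complete NAL units.

    Hypothetical helper, shown only to illustrate the reassembly idea:
    bytes after the last start code are kept until the next chunk arrives.
    Only the 4-byte start code 00 00 00 01 is handled here; real streams
    may also use the 3-byte form 00 00 01.
    """

    START_CODE = b"\x00\x00\x00\x01"

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk):
        """Append a chunk and return every NAL unit completed so far."""
        self._buf.extend(chunk)
        # Locate every start code currently in the buffer.
        positions = []
        i = self._buf.find(self.START_CODE)
        while i != -1:
            positions.append(i)
            i = self._buf.find(self.START_CODE, i + 1)
        # A unit is complete once the *next* start code has arrived.
        units = [bytes(self._buf[a:b]) for a, b in zip(positions, positions[1:])]
        if positions:
            del self._buf[: positions[-1]]  # keep the (possibly partial) tail
        return units
```

Each complete unit could then be handed to the decoder instead of whatever slice of bytes happened to arrive in one WebSocket message.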

Here's my full Python code:

import websocket
import json
import av
import cv2

buffer = bytearray()

def is_h264_complete(buffer):
    # Convert the buffer (a list of ints from the JSON payload) to bytes
    buffer_bytes = bytes(buffer)

    # Find every 4-byte Annex-B start code in the buffer
    # (3-byte start codes, 00 00 01, are not detected here)
    start_code = bytes([0, 0, 0, 1])
    positions = [i for i in range(len(buffer_bytes)) if buffer_bytes.startswith(start_code, i)]

    # Check for the presence of SPS (type 7) and PPS (type 8) NAL units,
    # guarding against a start code at the very end of the buffer
    has_sps = any(i + 4 < len(buffer_bytes) and buffer_bytes[i + 4] & 0x1F == 7 for i in positions)
    has_pps = any(i + 4 < len(buffer_bytes) and buffer_bytes[i + 4] & 0x1F == 8 for i in positions)

    return has_sps and has_pps

def on_message(ws, message):
    data = json.loads(message)
    message_type = data["type"]
    if message_type == "event" and data["event"]["event"] == "livestream video data":
        image_buffer = data["event"]["buffer"]["data"]
        if not is_h264_complete(image_buffer):
            print(f"Error! incomplete h264: {len(image_buffer)}")
            return
        
        buffer_bytes = bytes(image_buffer)
        packet = av.Packet(buffer_bytes)
        codec = av.CodecContext.create('h264', 'r')
        frames = codec.decode(packet)

        # Display the image
        for frame in frames:
            image = frame.to_ndarray(format='bgr24')
            # Put the length of the buffer on the image
            cv2.putText(image, f"Buffer Length: {len(image_buffer)}", (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
            cv2.imshow('Image', image)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break

def on_error(ws, error):
    print(f"Error: {error}")

def on_close(ws, close_status_code=None, close_msg=None):
    print("Connection closed")

def on_open(ws):
    print("Connection opened")
    # Send a message to the server
    ws.send(json.dumps({"messageId" : "start_listening", "command": "start_listening"}))  # replace with your command and parameters
    ws.send(json.dumps({"command": "set_api_schema", "schemaVersion" : 20}))
    
    ws.send(json.dumps({"messageId" : "start_livestream", "command": "device.start_livestream", "serialNumber": "T8410P4223334EBE"}))  # replace with your command and parameters

if __name__ == "__main__":
    websocket.enableTrace(False)
    ws = websocket.WebSocketApp("ws://localhost:3000",  # replace with your server URI
                                on_message=on_message,
                                on_error=on_error,
                                on_close=on_close)
    ws.on_open = on_open
    ws.run_forever()

1 Answer

Answer from Christoph:

Your is_h264_complete method looks wrong in this context to me; it appears to check for sync points in a stream, e.g. in order to initialize a decoder.

Currently you only decode these sync points (I-frames) and no P or B frames. You can remove that function. Also, create the codec context outside of the on_message callback, e.g. in the connect callback, because I assume you receive multiple H264 payloads in your message callback. Right now you create a new context, i.e. a new decoder, on every message, which is very wasteful.

Create the codec context once, outside the callback or on the first sync point, and on each new message feed the data to the decoder via av.Packet and call decode. FFmpeg is smart enough to parse the bytestream.

On disconnect you can close and free the codec context.
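A minimal sketch of that restructuring (class and method names are hypothetical): the decoder is created once per connection rather than per message, and a stub codec can be injected so the wiring is testable without PyAV or a camera.

```python
class PersistentH264Decoder:
    """Creates the H.264 decoder once and reuses it for every payload,
    as suggested above; FFmpeg's own parser handles packet boundaries."""

    def __init__(self, codec=None, make_packet=None):
        if codec is None:
            import av  # real decoder path; runs once, e.g. in on_open
            codec = av.CodecContext.create("h264", "r")
            make_packet = av.Packet
        self.codec = codec
        self.make_packet = make_packet or bytes

    def feed(self, payload):
        """Called from on_message with the raw buffer; returns any
        frames the decoder produced for this chunk (possibly none)."""
        return self.codec.decode(self.make_packet(bytes(payload)))
```

In the question's code, `on_open` would create one `PersistentH264Decoder` and `on_message` would call `decoder.feed(image_buffer)` and display whatever frames come back, instead of building a fresh `av.CodecContext` each time; on disconnect the object can simply be dropped.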