C++ - Stuck with YoloV4, ONNX and TensorRT


I'm doing some detection using YoloV4/C++/OpenCV and it's running pretty well. However, to reduce inference time I'm trying to move everything to NVIDIA TensorRT, and I'm feeling lost there.

I converted the .weights file to ONNX using the TensorRT tools, then converted the ONNX model to a TensorRT engine like this:

void ONNXConvert()
{
    MyLogger logger;
    nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(logger);
    nvinfer1::INetworkDefinition* network = builder->createNetworkV2(1U << static_cast<int>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));

    // Load ONNX model
    const auto parser = nvonnxparser::createParser(*network, logger);

    // Parse the ONNX model
    // Some code here...

    std::ifstream onnxFile(onnxModelFile, std::ios::binary);
    if (!onnxFile)
    {
        std::cerr << "Error opening ONNX model file. " << onnxModelFile << std::endl;
        return;
    }
    onnxFile.seekg(0, onnxFile.end);
    const size_t modelSize = onnxFile.tellg();
    onnxFile.seekg(0, onnxFile.beg);

    // Allocate buffer to hold the ONNX model
    std::vector<char> onnxModelBuffer(modelSize);
    onnxFile.read(onnxModelBuffer.data(), modelSize);

    if (!parser->parse(onnxModelBuffer.data(), modelSize))
    {
        std::cerr << "Error parsing ONNX model." << std::endl;
        return;
    }

    // Create a builder configuration
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();

    // Set configuration options as needed
    config->setMemoryPoolLimit(nvinfer1::MemoryPoolType::kWORKSPACE, 1 << 30);

    nvinfer1::IHostMemory* serializedEngine = builder->buildSerializedNetwork(*network, *config);
    if (!serializedEngine)
    {
        std::cerr << "Error building the serialized engine." << std::endl;
        return;
    }
    std::cout << "Number of layers in the network: " << network->getNbLayers() << std::endl;

    std::ofstream outFile("yolov4.engine", std::ios::binary);
    outFile.write(reinterpret_cast<const char*>(serializedEngine->data()), serializedEngine->size());
    outFile.close();

    serializedEngine->destroy();
    config->destroy();
    parser->destroy();
    network->destroy();
    builder->destroy();
}
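(For what it's worth, if I read the TensorRT docs correctly, the `trtexec` tool shipped with TensorRT should be able to do the same ONNX-to-engine conversion; file names here are placeholders for my actual files:)

```shell
# Hypothetical trtexec equivalent of the conversion code above
trtexec --onnx=yolov4.onnx --saveEngine=yolov4.engine
```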

With this done, I can load the generated engine and run inference; everything seems to go well until I try to parse the detection results.

I want the class probabilities and the bounding box coordinates, but all I get is inconsistent values.

From my YoloV4 config, I know I have:

  • 20 classes
  • Input width = 608
  • Input height = 608
  • Channels = 3
  • 9 anchors with dimensions { 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401 }

After the inference, I have 2 output buffers:

  • a 1x22743x1x4 buffer, where I guess I will find the bounding box coordinates
  • a 1x22743x20 buffer, where I guess I will find the class probabilities

And this is where I'm getting lost. Why are there 22743 detections? How is this number calculated? How must I parse the detections to correctly compute the coordinates and class probabilities?

I naively tried to parse the outputs directly, like this:

for (int d = 0; d < 22743; d++)
{
    // Find the most probable class for detection d
    float maxProb = -1000.0f;
    int classId = -1;
    for (int c = 0; c < 20; c++)
    {
        if (classes[d * 20 + c] > maxProb)
        {
            maxProb = classes[d * 20 + c];
            classId = c;
        }
    }

    // Read the raw box coordinates for detection d
    if (maxProb > CONFIDENCE_THRESHOLD)
    {
        float boxX = boxes[d * 4];
        float boxY = boxes[d * 4 + 1];
        float boxW = boxes[d * 4 + 2];
        float boxH = boxes[d * 4 + 3];
    }
}

But all I get is tiny probabilities (like < 1E-05), and tiny, sometimes negative, box coordinates.

I understand I'm supposed to use what I know about the anchors but I'm really not sure how.

Could someone give me a hand with this? Any help will really be appreciated.
