I'm currently using DETR for object detection, and I want to convert the model as follows: PyTorch -> ONNX -> TensorRT. I have the conversion code working and have tested the model, and it achieves the same accuracy in all three formats. The problem is that the model is in FP32, and when I convert it to FP16 I lose a lot of accuracy. My idea is to convert some layers to FP16 and leave the rest in FP32 to preserve as much accuracy as possible.

My question is: how do I set specific layers of the TensorRT model to FP16? I couldn't find any documentation on this. Any and all help is appreciated.
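A possible approach, sketched with the TensorRT Python API (assuming TensorRT 8.x): enable the FP16 builder flag globally, then pin the accuracy-sensitive layers back to FP32 via `layer.precision`, and set `OBEY_PRECISION_CONSTRAINTS` so the builder honors those per-layer settings. The ONNX path and the layer-name patterns below are placeholders, not taken from the question — adjust them to your DETR export (for transformer models, normalization and softmax layers are common candidates to keep in FP32).

```python
def keep_fp32(layer_name, fp32_patterns):
    """Name-based heuristic: return True if this layer should stay in FP32."""
    return any(p in layer_name for p in fp32_patterns)

def build_mixed_precision_engine(onnx_path, fp32_patterns):
    # Import inside the function so the helper above stays usable without TensorRT.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)  # allow FP16 everywhere by default
    # Without this flag the builder may ignore per-layer precision hints.
    config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

    for i in range(network.num_layers):
        layer = network.get_layer(i)
        if keep_fp32(layer.name, fp32_patterns):
            layer.precision = trt.float32          # compute in FP32
            for j in range(layer.num_outputs):
                layer.set_output_type(j, trt.float32)  # keep outputs in FP32

    return builder.build_serialized_network(network, config)

# Hypothetical usage — "model.onnx" and the patterns are examples only:
# engine = build_mixed_precision_engine("model.onnx", ["norm", "Softmax"])
```

Note that pinning layers to FP32 forces extra reformat (cast) operations at the FP16/FP32 boundaries, so it's worth bisecting to find the smallest set of layers that recovers accuracy.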
Infer using mixed precision in TensorRT
145 views · Asked by Faisal Hejary
There are 0 answers below.