
Topic: Error ONNX model, TensorRT

[W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.

[W] Could not read timing cache from: C:\Users\......\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.4.onnx.min64x64_opt2560x1440_max2560x1440_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1070_a8b3b7a9.engine.cache.


[TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-pro … l#env-vars

Is something wrong with NVIDIA TensorRT? What should I do?
I have read the CUDA documentation on module loading:

CUDA_MODULE_LOADING

DEFAULT, LAZY, EAGER

Specifies the module loading mode for the application. When set to EAGER, all kernels from a cubin, fatbin or a PTX file are fully loaded upon corresponding cuModuleLoad* API call. This is the same behavior as in all preceding CUDA releases. When set to LAZY, loading of a specific kernel is delayed to the point a CUfunc handle is extracted with cuModuleGetFunction API call. This mode allows for lowering initial module loading latency and decreasing initial module-related device memory consumption, at the cost of higher latency of cuModuleGetFunction API call. Default behavior is EAGER. Default behavior may change in future CUDA releases.

But I don't understand it!

Re: Error ONNX model, TensorRT

The 1st and 3rd lines are not errors.
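
The 1st line only means the ONNX parser found INT64 weights in the model and casts them down to INT32, which is common for models exported from PyTorch. If you're curious which tensors trigger it, you can inspect the model with the onnx Python package. A minimal sketch (assuming the onnx package is installed; the path is just an example based on the log above):

```python
import onnx
from onnx import TensorProto

# Example path only -- point this at the actual model file.
model = onnx.load(r"C:\Program Files (x86)\SVP 4\rife\models\rife\rife_v4.4.onnx")

# Initializers are the weights stored in the model; any with INT64
# data type are what TensorRT casts to INT32, hence the warning.
int64_weights = [init.name
                 for init in model.graph.initializer
                 if init.data_type == TensorProto.INT64]
print(int64_weights)
```

The cast is harmless as long as the values fit into INT32, which they normally do for these models.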

Re: Error ONNX model, TensorRT

There are no errors here; just wait for the process to finish.
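
Likewise, the lazy-loading line is purely informational. If you want to enable it, CUDA_MODULE_LOADING=LAZY has to be in the environment before CUDA initializes. A minimal Python sketch, assuming you control the process that loads the TensorRT engine (with SVP you can instead set it as a regular Windows user/system environment variable, which processes inherit):

```python
import os

# CUDA reads CUDA_MODULE_LOADING once, at first initialization,
# so set it before touching any CUDA-using library.
os.environ["CUDA_MODULE_LOADING"] = "LAZY"

import tensorrt as trt  # imported only after the variable is set

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
# ...deserialize the engine and run inference as usual...
```

Either way this only affects device memory usage and startup latency, not the output.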