❓ Question
I'm trying to save a serialized TensorRT-optimized model built with torch_tensorrt in one environment and then load it in another environment (different GPUs: one has a Quadro M1000M, the other a Tesla P100).
In neither environment do I have full sudo access (e.g. I can't change the NVIDIA driver), but I am able to install different CUDA toolkits locally and to install pip wheels.
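For concreteness, this is roughly the workflow I'm attempting; the model, input shape, and file name below are just placeholders:

```python
import torch
import torch_tensorrt

# Stand-in for my real model; the problem below reproduces even with a single conv layer.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3)).eval().cuda()

# On env #1 (Tesla P100): compile with Torch-TensorRT and serialize.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],  # placeholder shape
    enabled_precisions={torch.float},
)
torch.jit.save(trt_model, "model_trt.ts")

# On env #2 (Quadro M1000M): load the serialized module and run it.
trt_model = torch.jit.load("model_trt.ts")
out = trt_model(torch.randn(1, 3, 224, 224).cuda())
```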
What you have already tried
I have tried (ones marked with @ are ones I can't change):
env #1 =
@1. Tesla P100
@2. NVIDIA driver 460
3. CUDA 11.3 (checked via torch.version.cuda); nvidia-smi reports 11.2; many CUDA toolkit versions from 10.2 to 11.4 are installed
4. cuDNN 8.2.1.32
5. TensorRT 8.2.1.8
6. Torch-TensorRT 1.0.0
7. PyTorch 1.10.1+cu113 (conda installed)
env #2 =
@1. Quadro M1000M
@2. NVIDIA driver 455
3. CUDA 11.3 (checked via torch.version.cuda; I believe this works via backwards compatibility mode, though technically 11.3 requires a 460+ NVIDIA driver according to the compatibility table); nvidia-smi reports 11.1; only 10.2 is available aside from the 11.3 I installed
4. cuDNN 8.2.1.32
5. TensorRT 8.2.1.8
6. Torch-TensorRT 1.0.0
7. PyTorch 1.10.1+cu113 (pip installed)
So as you can see the only difference is really the GPU and the NVIDIA driver (455 vs 460).
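For completeness, this is roughly how I checked the version numbers listed above on both machines:

```python
import torch
import tensorrt
import torch_tensorrt

print(torch.__version__)               # 1.10.1+cu113
print(torch.version.cuda)              # 11.3
print(torch.backends.cudnn.version())  # 8201, i.e. cuDNN 8.2.1
print(tensorrt.__version__)            # 8.2.1.8
print(torch_tensorrt.__version__)      # 1.0.0
```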
Is this supposed to work?
On env #1, I can compile any model with torch_tensorrt.
On env #2, I run into issues if I try to compile any slightly complex model (e.g. resnet34); it fails with the following (a minimal repro sketch follows the log):
WARNING: [Torch-TensorRT] - Dilation not used in Max pooling converter
WARNING: [Torch-TensorRT TorchScript Conversion Context] - TensorRT was linked against cuBLAS/cuBLAS LT 11.6.3 but loaded cuBLAS/cuBLAS LT 11.5.1
ERROR: [Torch-TensorRT TorchScript Conversion Context] - 1: [wrapper.cpp::plainGemm::197] Error Code 1: Cublas (CUBLAS_STATUS_NOT_SUPPORTED)
ERROR: [Torch-TensorRT TorchScript Conversion Context] - 2: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
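For reference, the failing compile on env #2 is essentially just this (the input shape is arbitrary):

```python
import torch
import torch_tensorrt
import torchvision.models as models

model = models.resnet34(pretrained=True).eval().cuda()
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.float},
)
# On env #2 this fails during the engine build with the cuBLAS errors shown above.
```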
If I try to "torch.jit.load" any model made in env #1 (even the simplest ones like a model with 1 conv2d layer) on env #2, I get the following error msg:
~/.local/lib/python3.6/site-packages/torch/jit/_serialization.py in load(f, map_location, _extra_files)
159 cu = torch._C.CompilationUnit()
160 if isinstance(f, str) or isinstance(f, pathlib.Path):
--> 161 cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
162 else:
163 cpp_module = torch._C.import_ir_module_from_buffer(
RuntimeError: [Error thrown at core/runtime/TRTEngine.cpp:44] Expected most_compatible_device to be true but got false
No compatible device was found for instantiating TensorRT engine
Environment
Explained above
In general, TensorRT engines are not portable across GPU architectures. If you need to support both Maxwell and Pascal, you need to build a separate engine for each. Even between different variants of the same architecture this is not really recommended, as you could see degraded performance. The best practice is to compile your model on the exact hardware you are planning to deploy on.
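If it helps, a quick way to confirm the architecture difference is to compare compute capabilities on the two machines; the comments show the values I'd expect for these particular cards:

```python
import torch

print(torch.cuda.get_device_name(0))        # "Tesla P100-..." on env #1, "Quadro M1000M" on env #2
print(torch.cuda.get_device_capability(0))  # (6, 0) = Pascal on env #1, (5, 0) = Maxwell on env #2
```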
Thanks for the reply.
Currently I have a model that, without Torch-TensorRT optimization, consumes about 8GB of GPU memory; after TensorRT compilation it goes down to about 700MB (on the Tesla P100, so I guess this number would be different if it were compiled for Maxwell?).
So originally my thought process was to compile the TensorRT model using the bigger GPU (i.e. the 16GB Tesla P100), load the serialized model on the Quadro M1000M (2GB), and run it there. But now I know that idea doesn't work.
I suppose there is no way to compile the 8GB model on the Quadro M1000M? Or is there some way TensorRT can handle this?
> so I guess this number would be different if it were compiled for Maxwell?

Yes, probably.
What is your device memory consumption when you export the model to TorchScript and then load it from disk in a new process? 8GB -> ~1GB is quite a large drop. Also, are you running with torch.no_grad() and eval()? These might lower device memory consumption as well.
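Something like this sketch would give the number I mean; the file name and input shape are placeholders:

```python
import torch

# Plain TorchScript export (no TensorRT), loaded in a fresh process.
model = torch.jit.load("model_ts.ts").eval().cuda()

torch.cuda.reset_peak_memory_stats()
with torch.no_grad():
    out = model(torch.randn(1, 3, 224, 224).cuda())
print(torch.cuda.max_memory_allocated() / 2**20, "MiB")
```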