❓ Question
I'm trying to save a serialized TensorRT-optimized model built with torch_tensorrt in one environment and then load it in another environment (different GPUs: one has a Quadro M1000M, the other a Tesla P100).
In neither environment do I have full sudo access (e.g. I can't change the NVIDIA driver), but I am able to install different CUDA toolkits locally and to install pip wheels.
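For concreteness, this is roughly the workflow I'm attempting; the model, input shape, and file name below are just placeholders:

```python
import torch
import torch_tensorrt

# Stand-in for my real model; the problem below reproduces even with a single conv layer.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3)).eval().cuda()

# On env #1 (Tesla P100): compile with Torch-TensorRT and serialize.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],  # placeholder shape
    enabled_precisions={torch.float},
)
torch.jit.save(trt_model, "model_trt.ts")

# On env #2 (Quadro M1000M): load the serialized module and run it.
trt_model = torch.jit.load("model_trt.ts")
out = trt_model(torch.randn(1, 3, 224, 224).cuda())
```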
What you have already tried
I have tried (ones marked with @ are ones I can't change):
env #1 =
@1. Tesla P100
@2. NVIDIA driver 460
3. CUDA 11.3 (checked via torch.version.cuda); nvidia-smi reports 11.2; many CUDA toolkit versions from 10.2 to 11.4 are installed
4. cuDNN 8.2.1.32
5. TensorRT 8.2.1.8
6. Torch-TensorRT 1.0.0
7. PyTorch 1.10.1+cu113 (conda installed)
env #2 =
@1. Quadro M1000M
@2. NVIDIA driver 455
3. CUDA 11.3 (checked via torch.version.cuda; I believe this works via backwards compatibility mode, though technically 11.3 requires a 460+ NVIDIA driver according to the compatibility table); nvidia-smi reports 11.1; only 10.2 is available aside from the 11.3 I installed
4. cuDNN 8.2.1.32
5. TensorRT 8.2.1.8
6. Torch-TensorRT 1.0.0
7. PyTorch 1.10.1+cu113 (pip installed)
So as you can see the only difference is really the GPU and the NVIDIA driver (455 vs 460).
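For completeness, this is roughly how I checked the version numbers listed above on both machines:

```python
import torch
import tensorrt
import torch_tensorrt

print(torch.__version__)               # 1.10.1+cu113
print(torch.version.cuda)              # 11.3
print(torch.backends.cudnn.version())  # 8201, i.e. cuDNN 8.2.1
print(tensorrt.__version__)            # 8.2.1.8
print(torch_tensorrt.__version__)      # 1.0.0
```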
Is this supposed to work?
On env #1, I can compile any model with torch_tensorrt.
On env #2, I run into issues if I try to compile any slightly complex model (e.g. resnet34); it fails with the following (a minimal repro sketch follows the log):
WARNING: [Torch-TensorRT] - Dilation not used in Max pooling converter
WARNING: [Torch-TensorRT TorchScript Conversion Context] - TensorRT was linked against cuBLAS/cuBLAS LT 11.6.3 but loaded cuBLAS/cuBLAS LT 11.5.1
ERROR: [Torch-TensorRT TorchScript Conversion Context] - 1: [wrapper.cpp::plainGemm::197] Error Code 1: Cublas (CUBLAS_STATUS_NOT_SUPPORTED)
ERROR: [Torch-TensorRT TorchScript Conversion Context] - 2: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
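For reference, the failing compile on env #2 is essentially just this (the input shape is arbitrary):

```python
import torch
import torch_tensorrt
import torchvision.models as models

model = models.resnet34(pretrained=True).eval().cuda()
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.float},
)
# On env #2 this fails during the engine build with the cuBLAS errors shown above.
```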
If I try to "torch.jit.load" any model made in env #1 (even the simplest ones like a model with 1 conv2d layer) on env #2, I get the following error msg:
~/.local/lib/python3.6/site-packages/torch/jit/_serialization.py in load(f, map_location, _extra_files)
159 cu = torch._C.CompilationUnit()
160 if isinstance(f, str) or isinstance(f, pathlib.Path):
--> 161 cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
162 else:
163 cpp_module = torch._C.import_ir_module_from_buffer(
RuntimeError: [Error thrown at core/runtime/TRTEngine.cpp:44] Expected most_compatible_device to be true but got false
No compatible device was found for instantiating TensorRT engine
Environment
Explained above
In general, TensorRT engines are not portable across GPU architectures. If you need to support both Maxwell and Pascal, you need to build a separate engine for each. Even between different variants of the same architecture this is not really recommended, as you could see degraded performance. The best practice is to compile your model on the exact hardware you are planning to deploy on.
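If it helps, a quick way to confirm the architecture difference is to compare compute capabilities on the two machines; the comments show the values I'd expect for these particular cards:

```python
import torch

print(torch.cuda.get_device_name(0))        # "Tesla P100-..." on env #1, "Quadro M1000M" on env #2
print(torch.cuda.get_device_capability(0))  # (6, 0) = Pascal on env #1, (5, 0) = Maxwell on env #2
```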
Thanks for the reply.
Currently I have a model that, without Torch-TensorRT optimization, consumes about 8GB of GPU memory; after TensorRT compilation it goes down to about 700MB (on the Tesla P100, so I guess this number would be different if it were compiled for Maxwell?).
So originally my thought process was to compile the TensorRT model using the bigger GPU (i.e. the 16GB Tesla P100), load the serialized model on the Quadro M1000M (2GB), and run it there. But now I know that idea doesn't work.
I suppose there is no way to compile the 8GB model on the Quadro M1000M? Or is there some way TensorRT can handle this?
> so I guess this number would be different if it were compiled for Maxwell?

Yes, probably.
What is your device memory consumption when you export the model to TorchScript and then load it from disk in a new process? 8GB -> ~1GB is quite a large drop. Also, are you running with torch.no_grad() and eval()? These might lower device memory consumption as well.
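Something like this sketch would give the number I mean; the file name and input shape are placeholders:

```python
import torch

# Plain TorchScript export (no TensorRT), loaded in a fresh process.
model = torch.jit.load("model_ts.ts").eval().cuda()

torch.cuda.reset_peak_memory_stats()
with torch.no_grad():
    out = model(torch.randn(1, 3, 224, 224).cuda())
print(torch.cuda.max_memory_allocated() / 2**20, "MiB")
```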