Description
Hi
Thank you for the great work you're doing on TensorRT-LLM and the Triton backend. I have some questions about matching versions between the tensorrt-llm Python package, the backend, and the NGC images.
It's unclear to me which image should be used. The README and the Dockerfile reference different values (23.10, 24.01, or 24.02), but the latest Triton image on NGC is now 24.03. Can you clarify which image tag is correct?
Second, I want to build the tensorrt-llm Python package at the same version that was used for the backend in the NGC image. Where can I find which git commit hash of this repo corresponds to the tensorrt-llm backend shipped in the latest NGC Triton image?
The reason I want to use the same versions is to avoid incompatibility issues between ops, TensorRT, etc.
As a request, would you consider providing the built tensorrt_llm python package as part of the Triton TensorRT-LLM image on NGC? This would simplify things greatly, as users could then use the same image to build engines and to deploy them, ensuring compatibility.
Thanks!