Torch 1.6.0 update #166

narendasan · 2020-08-06T00:05:31Z

Description

Updates the compiler for PyTorch 1.6.0. Breaking change: drops support for Python 3.5. Known issue: because of a PyTorch bug, some int[] values cannot be parsed by the IR parsing tools. The issue has been raised with the PyTorch team, and the shuffle test case fails because of it. Also resolves segfaults during cuDNN cleanup.

Fixes #1

Type of change

Bug fix (non-breaking change which fixes an issue)
Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation and have regenerated the documentation (make html in docsrc)
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes

xsacha · 2020-08-06T13:43:59Z

~~Just tried to build it (Windows) and got:~~

execution.lo.lib(TRTEngine.obj) : error LNK2019: unresolved external symbol createInferRuntime_INTERNAL referenced in function "public: __cdecl trtorch::core::execution::TRTEngine::TRTEngine(class std::basic_string<char,struct std::char_traits,class std::allocator >,class std::basic_string<char,struct std::char_traits,class std::allocator >)" (??0TRTEngine@execution@core@trtorch@@qeaa@V?$basic_string@DU?$char_traits@D@std@@v?$allocator@D@2@@std@@0@Z)
conversionctx.lib(ConversionCtx.obj) : error LNK2019: unresolved external symbol createInferBuilder_INTERNAL referenced in function "public: __cdecl trtorch::core::conversion::ConversionCtx::ConversionCtx(struct trtorch::core::conversion::BuilderSettings)" (??0ConversionCtx@conversion@core@trtorch@@qeaa@UBuilderSettings@123@@z)
~~bazel-out\x64_windows-opt\bin\cpp\api\lib\libtrtorch.so : fatal error LNK1120: 3 unresolved externals~~

Edit: Sorry, unrelated to this PR.
It was this commit: 858d8c3#diff-e0ac18efc84fa06bf6e9b694d57f68adL75

Here's this PR built for Windows:
trtorch-PR.zip
trtorch-PR-debug.zip

Unfortunately, the same bug that occurred before this PR still occurs.
It happens while calling torch::jit::parseSchema on:
trt::execute_engine(Tensor[] inputs, __torch__.torch.classes.tensorrt.Engine engine) -> Tensor[]

Last 3 lines in debug console are:
DEBUG: [TRTorch - Debug Build] - Registering evaluator for prim::unchecked_cast
DEBUG: [TRTorch - Debug Build] - Registering evaluator for prim::Uninitialized
DEBUG: [TRTorch - Debug Build] - Registering evaluator for prim::RaiseException

narendasan · 2020-08-06T20:22:23Z

We can add the change back, but we removed it because it broke Linux builds. Maybe we can just have a default condition with an empty list.

xsacha · 2020-08-06T23:46:25Z

Oh yeah, I've added it back to compile it, but the other issue below remains. It's the issue in #153, so it's probably Windows-only.

Unknown custom class type tensorrt.Engine. Please ensure it is registered.:
trt::execute_engine(Tensor[] inputs, __torch__.torch.classes.tensorrt.Engine engine) -> Tensor[]
                                                                  ~~~~~~ <--- HERE

If I ignore this exception, it seems to correctly compile the graph but then when executing it, I get:

ERROR: [TRTorch Conversion Context] - %input.61 : Tensor = aten::prelu(%input.59, %self.input_layer.2.weight) # /home/sacha/.local/lib/python3.8/site-packages/torch/nn/functional.py:1263:0: slope tensor must be unidirectional broadcastable to input tensor
DEBUG: [TRTorch - Debug Build] - momentum disregarded
DEBUG: [TRTorch - Debug Build] - training disregarded
DEBUG: [TRTorch - Debug Build] - cudnn disregarded
DEBUG: [TRTorch - Debug Build] - Input shape is less than 4D got: [], inserting shuffle layer to reshape to 4D tensor shape: [1, 1, 1, 1]
DEBUG: [TRTorch - Debug Build] - Weights: [64]
    Number of input maps: 64
    Number of output maps: 64
    Element shape: [1]
DEBUG: [TRTorch - Debug Build] - Weights: [64]
    Number of input maps: 64
    Number of output maps: 64
    Element shape: [1]
ERROR: [TRTorch Conversion Context] - %input.61 : Tensor = aten::prelu(%input.59, %self.input_layer.2.weight) # /home/sacha/.local/lib/python3.8/site-packages/torch/nn/functional.py:1263:0: slope tensor must be unidirectional broadcastable to input tensor
ERROR: [TRTorch Conversion Context] - Parameter check failed at: Network.cpp::nvinfer1::Network::addScaleNd::737, condition: nbSpatialDims == 2 || nbSpatialDims == 3

That appears to be an aten::prelu issue, so I tried a model that doesn't use prelu, and it got much further.
It seemed to run the model and then ended with this error:

0 INTERNAL ASSERT FAILED at "..\\..\\torch\\csrc\\jit\\ir\\alias_analysis.cpp":465, please report a bug to PyTorch. We don't have an op for trt::execute_engine but it isn't a special case.  Argument types: Tensor[], __torch__.torch.classes.tensorrt.Engine,
Exception raised from analyzeImpl at ..\..\torch\csrc\jit\ir\alias_analysis.cpp:465 (most recent call first):

narendasan · 2020-08-07T18:38:50Z

Hmm, seems like that error is actually caused by some issue earlier in the compilation process. I see

DEBUG: [TRTorch - Debug Build] - Input shape is less than 4D got: [], inserting shuffle layer to reshape to 4D tensor shape: [1, 1, 1, 1]

That is odd for the input to prelu, and it would explain why you cannot broadcast here.
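
To illustrate the point, here is a minimal sketch of a unidirectional broadcast check. It assumes the usual rule that the slope may be expanded to match the input while the input itself is never expanded, with shapes right-aligned. The helper name `unidirectional_broadcastable` is hypothetical and is not part of TRTorch or TensorRT, whose actual check may be stricter (for example, requiring equal rank).

```python
def unidirectional_broadcastable(slope_shape, input_shape):
    """Return True if slope_shape can be expanded to input_shape
    without input_shape itself having to change."""
    if len(slope_shape) > len(input_shape):
        return False
    # Right-align the shapes by padding the slope with leading 1s.
    padding = [1] * (len(input_shape) - len(slope_shape))
    padded = padding + list(slope_shape)
    # Every slope dimension must be 1 or match the input dimension.
    return all(s == 1 or s == d for s, d in zip(padded, input_shape))


# Shapes from the log: the input was reshaped to [1, 1, 1, 1]
# and the prelu weights hold 64 elements.
print(unidirectional_broadcastable([64], [1, 1, 1, 1]))

# Hypothetical well-formed case: a 64-channel NCHW input.
print(unidirectional_broadcastable([1, 64, 1, 1], [1, 64, 8, 8]))
```

Under these assumptions the first call reports that the shapes are incompatible, since a 64-element slope would require the size-1 input to grow, and the second call reports that they are compatible.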

xsacha · 2020-08-10T23:30:44Z

The issues I'm facing do not seem to be related to this PR in any case.

narendasan and others added 7 commits July 24, 2020 11:47

refactor(//core/lowering): Update lowering for PyTorch 1.6.0

378690b

Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

fix(//tests): Add stride to complete tensors

af5d28e

Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

Merge branch 'master' of https://github.com/NVIDIA/TRTorch into pytor…

02299b7

…ch_20.06_container

fix(//tests/core/converters/activations): Complete tensors in prelu test

0e90f78

Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

fix(//core/conversion/evaluator): Custom to IValue that handles int[]

68c934a

Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

refactor: Move graph parameters to use IValues instead of just Tensors

771b615

Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

chore!: Update dependencies to PyTorch 1.6.0

8eda27d

BREAKING CHANGE: Support for Python 3.5 is being dropped with this update Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

narendasan added this to the v0.1.0 milestone Aug 6, 2020

feat(//third_party/tensorrt): Add back TensorRT static lib in a cross platform friendly way

d3c2e7e

Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

narendasan and others added 4 commits August 24, 2020 09:20

Merge branch 'master' of https://github.com/NVIDIA/TRTorch into torch…

cc32c22

…_1.6.0_update

feat(//docker): Adding CUDA11 based container for Ampere support

970d775

Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

fix(//docker): Workaround only shared libraries being available in PyTorch container

50c7eda

Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

chore!: Bumping version numbers to 0.1.0

b84c90b

BREAKING CHANGE: Version is being bumped to version 0.1.0a0 to target PyTorch 1.6.0 Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

github-actions bot added the component: api [C++] Issues re: C++ API label Aug 25, 2020

refactor(//docker): Move sources in docker container around

253f55a

Signed-off-by: Naren Dasan <[email protected]> Signed-off-by: Naren Dasan <[email protected]>

narendasan merged commit 809f9b3 into master Aug 26, 2020

narendasan deleted the torch_1.6.0_update branch August 26, 2020 01:07
