Skip to content

[wasm-exceptions] Probable memory corruption when doing try...catch...rethrow #22861

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ravisumit33 opened this issue Nov 6, 2024 · 8 comments
Assignees

Comments

@ravisumit33
Copy link
Contributor

We were using emsdk 3.1.35 when we encountered a weird memory access out of bound error on latest Safari (Safari 18). The same error wasn't reproducible on chrome. Initially we thought there could be an error in latest Safari. The code in our codebase at which we were getting the error was like below:

#include <iostream>
#include <stdexcept>

int main() {
  try {
    throw std::runtime_error("Error inside main");
  } catch (const std::runtime_error &e) {
    std::cout << "Caught exception: " << e.what() << " and rethrowing"
              << std::endl;
    throw;
  }
  return 0;
}

We tried building above sample code (with emsdk 3.1.35) with address sanitizer using command:

em++ -fwasm-exceptions  -fsanitize=address  a.cpp -o a.html

With address sanitizer builld, this code started giving below error on all browsers (including Chrome):

a.js:10 Uncaught RuntimeError: unreachable
    at a.wasm:0x227e
    at a.wasm:0x61bf
    at a.js:612:20
    at callMain (a.js:4785:13)
    at doRun (a.js:4815:21)
    at a.js:4824:4

We definitely knew there was something wrong in the 3.1.35 emsdk version. We upgraded the emsdk version to 3.1.56 and this error with the address sanitizer in the sample code was gone. Upon bisecting we could find that there was some LLVM update that fixed it.

But when we used the same emscripten version with our actual codebase we were still getting the same error on Safari 18. As of now, we also tried with emscripten tot but the error in our codebase persists. We tried building our codebase with address sanitizer so that it may get reproduced on other browsers as well but no luck. Don't know if something has changed in the address sanitizer in newer emsdk versions which has caused it to not give the same error or any actual fix has been done.

Additionally, if we build our codebase with js exceptions (-sDISABLE_EXCEPTION_CATCHING=0) we do not get any memory access out of bound (memory corruption) issue. This shows that, there is definitely something wrong with wasm-exceptions implementation and above described try...catch...rethrow flow.

Its unfortunate that we do not have a sample code that is reproducing the issue on latest emscripten. But we have some information that can help investigation get started.

@aheejin
Copy link
Member

aheejin commented Nov 6, 2024

We definitely knew there was something wrong in the 3.1.35 emsdk version. We upgraded the emsdk version to 3.1.56 and this error with the address sanitizer in the sample code was gone. Upon bisecting we could find that there was some LLVM update that fixed it.

Can you point to what that commit was?

But when we used the same emscripten version with our actual codebase we were still getting the same error on Safari 18. As of now, we also tried with emscripten tot but the error in our codebase persists. We tried building our codebase with address sanitizer so that it may get reproduced on other browsers as well but no luck.

So does this error with tot only happen with the address sanitizer or even without? What's the error message like?

Anyway some kind of reproducer would be really helpful. If it's hard to provide source code, can you provide binary that errors out when run, with how to run the binary?

@ravisumit33
Copy link
Contributor Author

Can you point to what that commit was?

8d482fb98ab75d01a21de8137407d74adb10c8d6 is the commit in the emscripten-releases repo

So does this error with tot only happen with the address sanitizer or even without? What's the error message like?

On tot error happen on both cases (without and with address sanitizer) but only on Safari 18. Earlier on 3.1.35, the scenario was like below:

  • With address sanitizer, it used to happen on all browsers (Safari, Chrome, etc.).
  • Without address sanitizer, it used to happen only on Safari 18 (not even on Safari 17).

I assume some memory corruption/undefined behaviour is going inside.

Anyway some kind of reproducer would be really helpful. If it's hard to provide source code, can you provide binary that errors out when run, with how to run the binary?

Unfortunately, I also can't provide the binary and the steps to use it, due to company policy. I had created the earlier reproducer but somehow it is not replicating the issue on tot.

@kleisauke
Copy link
Collaborator

Sounds like a similar issue as llvm/llvm-project#114600, which was fixed via commit 49fc7ccaa7cda3352bd7e2937d9f7f50e5417cc6 in the emscripten-releases repo.

@ravisumit33
Copy link
Contributor Author

I tried installing above commit using emsdk install 49fc7ccaa7cda3352bd7e2937d9f7f50e5417cc6 just to check if it fixes the issue in my codebase. But, unfortunately, issue is still there. I can see the top function on the error callstack as _Unwind_CallPersonality, if that can help.

@kleisauke
Copy link
Collaborator

Thanks for checking.

Can you point to what that commit was?

8d482fb98ab75d01a21de8137407d74adb10c8d6 is the commit in the emscripten-releases repo

For reference, here's the associated revision changeset for this roll:
llvm/llvm-project@e0d4906...1cf428a

The only WebAssembly-related change within this set is llvm/llvm-project@dbca8aa.

@ravisumit33
Copy link
Contributor Author

@aheejin @sbc100 This is totally breaking our app on Safari 18. Once this error (memory access out of bound) comes, other calls are also giving some weird memory corruption errors. So, nothing works after that moment. As mentioned by @kleisauke , llvm/llvm-project#114600 looks very similar and recent. Can their be a possibility that still there exist some cases to be fixed in the try...catch...rethrow workflow?

@aheejin
Copy link
Member

aheejin commented Nov 11, 2024

Thanks for checking.

Can you point to what that commit was?

8d482fb98ab75d01a21de8137407d74adb10c8d6 is the commit in the emscripten-releases repo

For reference, here's the associated revision changeset for this roll: llvm/[email protected]

The only WebAssembly-related change within this set is llvm/llvm-project@dbca8aa.

This removes nondeterminism and is not about memory corruption.

@aheejin
Copy link
Member

aheejin commented Nov 11, 2024

@aheejin @sbc100 This is totally breaking our app on Safari 18. Once this error (memory access out of bound) comes, other calls are also giving some weird memory corruption errors. So, nothing works after that moment. As mentioned by @kleisauke , llvm/llvm-project#114600 looks very similar and recent. Can their be a possibility that still there exist some cases to be fixed in the try...catch...rethrow workflow?

Sorry but given that we don't have a reproducer I don't have any guess on what this could be. If your company policy prohibits sharing the binary in the open forum, can you possibly share it privately? Github provides a way to collaborate on a private repo, or you can share it in a Google drive.

And what makes you think this is related to rethrow?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants