Add warnings about UTF-16 vs UTF-8 strings #1416
Force-pushed from 588a54c to c8fdfac.
(I also still think `as_string` should be changed to `into`)
Force-pushed from c8fdfac to d212fac.
Other than a few nits, this is looking really good! I like how this turned out.
Force-pushed from d212fac to 2452d25.
👍 Thanks for the thorough review!
Just a couple more nits, and then this is ready to merge!
This commit aims to address rustwasm#1348 via a number of strategies:

* Documentation is updated to warn about UTF-16 vs UTF-8 problems between JS and Rust. Notably documenting that `as_string` and handling of arguments is lossy when there are lone surrogates.
* A `JsString::is_valid_utf16` method was added to test whether `as_string` is lossless or not.

The intention is that most default behavior of `wasm-bindgen` will remain, but where necessary bindings will use `JsString` instead of `str`/`String` and will manually check for `is_valid_utf16` as necessary. It's also hypothesized that this is relatively rare and not too performance critical, so an optimized intrinsic for `is_valid_utf16` is not yet provided.

Closes rustwasm#1348
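For readers unfamiliar with lone surrogates, here is a minimal standard-library sketch of the problem the commit message describes. It does not use `wasm-bindgen` and is not the PR's implementation: the free functions `is_valid_utf16` and `to_string_lossy` are hypothetical stand-ins that operate on a plain `&[u16]` slice of UTF-16 code units, which is an assumption made only so the example runs natively.

```rust
use std::char::decode_utf16;

/// Hypothetical helper: returns true when the UTF-16 code units contain
/// no lone (unpaired) surrogates, i.e. converting them to a Rust `String`
/// would lose nothing.
fn is_valid_utf16(units: &[u16]) -> bool {
    decode_utf16(units.iter().copied()).all(|r| r.is_ok())
}

/// Hypothetical helper: lossy conversion. The standard library replaces
/// every lone surrogate with U+FFFD (the replacement character).
fn to_string_lossy(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}

fn main() {
    // A well-formed string, including a surrogate pair (U+1F600).
    let ok: Vec<u16> = "hi \u{1F600}".encode_utf16().collect();
    // 'a', a lone high surrogate, 'b': legal in a JS string, not in UTF-8.
    let lone: [u16; 3] = [0x0061, 0xD800, 0x0062];

    for units in [&ok[..], &lone[..]] {
        if is_valid_utf16(units) {
            println!("lossless: {}", to_string_lossy(units));
        } else {
            println!("lossy:    {:?}", to_string_lossy(units));
        }
    }
}
```

The check-then-convert order mirrors the intent stated above: code that cannot tolerate data loss tests validity first and only then converts, while code that does not care can convert directly.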
Force-pushed from 2452d25 to 44738e0.
Thanks @Pauan!