add hex methods #20

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 27, 2023
22 changes: 19 additions & 3 deletions playground/index-raw.html
@@ -48,8 +48,8 @@
<h1>Proposed Support for Base64 in JavaScript</h1>

<h2 id="introduction">Introduction</h2>
<p>This page documents an early-stage proposal for native base64 support in JavaScript, and includes a <strong>non-production</strong> polyfill you can experiment with in the browser's console.</p>
<p>The proposal would provide methods for encoding and decoding Uint8Arrays as base64 strings.</p>
<p>This page documents an early-stage proposal for native base64 and hex encoding and decoding for binary data in JavaScript, and includes a <strong>non-production, slightly inaccurate</strong> polyfill you can experiment with in the browser's console. Some details of the polyfill, particularly around coercion and order of observable effects, are not identical to the proposed spec text.</p>
<p>The proposal would provide methods for encoding and decoding Uint8Arrays as base64 and hex strings.</p>
<p>Feedback on <a href="https://github.com/tc39/proposal-arraybuffer-base64">the proposal's repository</a> is appreciated.</p>

<h2 id="API">API</h2>
@@ -68,10 +68,25 @@ <h3>Basic usage</h3>
// Uint8Array([72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100])
</code></pre>

<p>Encoding a Uint8Array to a hex string:</p>
<pre class="language-js"><code class="language-js">
let arr = new Uint8Array([72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100]);
console.log(arr.toHex());
// '48656c6c6f20576f726c64'
</code></pre>

<p>Decoding a hex string to a Uint8Array:</p>
<pre class="language-js"><code class="language-js">
let string = '48656c6c6f20576f726c64';
console.log(Uint8Array.fromHex(string));
// Uint8Array([72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100])
</code></pre>

<h3>Options</h3>
<p>Both methods take an optional options bag which allows specifying the alphabet as either "base64" (the default) or "base64url" (<a href="https://datatracker.ietf.org/doc/html/rfc4648#section-5">the URL-safe variant</a>).</p>
<p>The base64 methods take an optional options bag which allows specifying the alphabet as either "base64" (the default) or "base64url" (<a href="https://datatracker.ietf.org/doc/html/rfc4648#section-5">the URL-safe variant</a>).</p>
<p>In the future this may allow specifying arbitrary alphabets.</p>
<p>In later versions of this proposal the options bag may also allow additional options, such as specifying whether to generate / enforce padding characters and how to handle whitespace.</p>
<p>The hex methods do not have any options.</p>

<pre class="language-js"><code class="language-js">
let array = new Uint8Array([251, 255, 191]);
@@ -84,6 +99,7 @@ <h3>Options</h3>
<h3>Streaming</h3>
<p>Two additional methods, <code>toPartialBase64</code> and <code>fromPartialBase64</code>, allow encoding and decoding chunks of base64. This requires managing state, which is handled by returning a <code>{ result, extra }</code> pair. The options bag for these methods takes two additional arguments, one which specifies whether more data is expected and one which specifies any extra values returned by a previous call.</p>
<p>These methods are intended for lower-level use and are less convenient to use.</p>
<p>Streaming versions of the hex APIs are not included since they are straightforward to do manually.</p>

<p>Streaming an ArrayBuffer into chunks of base64 strings:</p>
<pre class="language-js"><code class="language-js">
25 changes: 25 additions & 0 deletions playground/polyfill-core.mjs
@@ -174,3 +174,28 @@ export function base64ToUint8Array(str, alphabetIdentifier = 'base64', more = fa
};
}

export function uint8ArrayToHex(arr) {
  checkUint8Array(arr);
  let out = '';
  for (let i = 0; i < arr.length; ++i) {
    // Each byte becomes exactly two lowercase hex digits.
    out += arr[i].toString(16).padStart(2, '0');
  }
  return out;
}

export function hexToUint8Array(str) {
  if (typeof str !== 'string') {
    throw new TypeError('expected str to be a string');
  }
  if (str.length % 2 !== 0) {
    throw new SyntaxError('str should be an even number of characters');
  }
  // Only hex digits are allowed; parseInt alone would not reject other letters.
  if (/[^0-9a-fA-F]/.test(str)) {
    throw new SyntaxError('str should only contain hex characters');
  }
  let out = new Uint8Array(str.length / 2);
  for (let i = 0; i < out.length; ++i) {
    out[i] = parseInt(str.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}
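
The character check in `hexToUint8Array` has to be limited to hex digits because `parseInt` is lenient about what it accepts. The following is a minimal sketch of that behaviour, not part of the PR; `isHexString` is a hypothetical helper introduced only for illustration.

```javascript
// parseInt stops at the first character that is not a digit in the given radix
// instead of throwing, so it cannot serve as validation on its own.
const partiallyValid = parseInt('1g', 16); // parses the leading '1' only
const fullyInvalid = parseInt('zz', 16);   // no valid digits at all: NaN

// Storing NaN in a Uint8Array silently coerces it to 0, so an unchecked
// 'zz' pair would decode to a zero byte rather than raising an error.
const coerced = new Uint8Array([fullyInvalid])[0];

// Hypothetical helper mirroring the intended validation: a string, an even
// number of characters, and nothing outside 0-9, a-f, A-F.
function isHexString(str) {
  return typeof str === 'string'
    && str.length % 2 === 0
    && !/[^0-9a-fA-F]/.test(str);
}

console.log(partiallyValid, fullyInvalid, coerced);
console.log(isHexString('48656c6c6f'), isHexString('zz'));
```

A character class of `a-zA-Z` would let strings such as `'zz'` through to the `parseInt` loop, which is why the class should stop at `f` and `F`.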
14 changes: 13 additions & 1 deletion playground/polyfill-install.mjs
@@ -1,4 +1,4 @@
import { checkUint8Array, uint8ArrayToBase64, base64ToUint8Array } from './polyfill-core.mjs';
import { checkUint8Array, uint8ArrayToBase64, base64ToUint8Array, uint8ArrayToHex, hexToUint8Array } from './polyfill-core.mjs';

Uint8Array.prototype.toBase64 = function (opts) {
checkUint8Array(this);
@@ -39,3 +39,15 @@ Uint8Array.fromPartialBase64 = function (string, opts) {
}
return base64ToUint8Array(string, alphabet, more, extra);
};

Uint8Array.prototype.toHex = function () {
  checkUint8Array(this);
  return uint8ArrayToHex(this);
};

Uint8Array.fromHex = function (string) {
  if (typeof string !== 'string') {
    throw new TypeError('expected argument to be a string');
  }
  return hexToUint8Array(string);
};
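
The playground text in this PR says streaming versions of the hex APIs are omitted because they are straightforward to do manually. One way that could look is sketched below, under stated assumptions: `bytesToHex` and `createHexDecoder` are hypothetical names, not part of the proposal or the polyfill, and the sketch avoids relying on `toHex` / `fromHex` being installed.

```javascript
// Encoding needs no carried state: every byte maps to exactly two hex digits,
// so each chunk of bytes can be encoded independently and the strings joined.
function bytesToHex(bytes) {
  let out = '';
  for (const byte of bytes) {
    out += byte.toString(16).padStart(2, '0');
  }
  return out;
}

// Decoding needs at most one character of state: a dangling hex digit left
// over when a chunk boundary falls in the middle of a byte.
function createHexDecoder() {
  let leftover = '';
  return {
    push(chunk) {
      if (typeof chunk !== 'string') {
        throw new TypeError('expected chunk to be a string');
      }
      if (/[^0-9a-fA-F]/.test(chunk)) {
        throw new SyntaxError('chunk should only contain hex characters');
      }
      const input = leftover + chunk;
      const usable = input.length - (input.length % 2);
      leftover = input.slice(usable);
      const out = new Uint8Array(usable / 2);
      for (let i = 0; i < out.length; ++i) {
        out[i] = parseInt(input.slice(i * 2, i * 2 + 2), 16);
      }
      return out;
    },
    finish() {
      if (leftover !== '') {
        throw new SyntaxError('input ended with a dangling hex digit');
      }
    },
  };
}

// 'Hello' in hex is 48656c6c6f; split it at an odd offset on purpose.
const decoder = createHexDecoder();
const first = decoder.push('48656');  // the trailing '6' is carried over
const second = decoder.push('c6c6f'); // completes '6c', then '6c', '6f'
decoder.finish();
console.log(bytesToHex(first) + bytesToHex(second));
```

Carrying a single leftover character is the whole of the state management here, in contrast to the `{ result, extra }` pair the partial base64 methods return.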
55 changes: 45 additions & 10 deletions spec.html
@@ -34,7 +34,7 @@ <h1>Uint8Array.prototype.toBase64 ( [ _options_ ] )</h1>
</emu-alg>
</emu-clause>

<emu-clause id="sec-uint8array.prototype.topartiabase64">
<emu-clause id="sec-uint8array.prototype.topartialbase64">
<h1>Uint8Array.prototype.toPartialBase64 ( [ _options_ ] )</h1>
<emu-alg>
1. Let _O_ be the *this* value.
@@ -72,10 +72,24 @@ <h1>Uint8Array.prototype.toPartialBase64 ( [ _options_ ] )</h1>
</emu-alg>
</emu-clause>

<emu-clause id="sec-uint8array.prototype.tohex">
<h1>Uint8Array.prototype.toHex ( )</h1>
<emu-alg>
1. Let _O_ be the *this* value.
1. Let _toEncode_ be ? GetUint8ArrayBytes(_O_).
1. Let _out_ be the empty String.
1. For each byte _byte_ of _toEncode_, do
  1. Let _str_ be Number::toString(_byte_, 16).
  1. If the length of _str_ is 1, set _str_ to the string-concatenation of *"0"* and _str_.
  1. Set _out_ to the string-concatenation of _out_ and _str_.
1. Return _out_.
</emu-alg>
</emu-clause>

<emu-clause id="sec-uint8array.frombase64">
<h1>Uint8Array.fromBase64 ( _string_ [ , _options_ ] )</h1>
<emu-alg>
1. Set _string_ to ? GetStringForBase64(_string_).
1. Set _string_ to ? GetStringForBinaryEncoding(_string_).
1. Set _options_ to ? GetOptionsObject(_options_).
1. Let _alphabet_ be ? Get(_options_, *"alphabet"*).
1. If _alphabet_ is *undefined*, set _alphabet_ to *"base64"*.
@@ -93,7 +107,7 @@ <h1>Uint8Array.fromBase64 ( _string_ [ , _options_ ] )</h1>
1. If _characters_ cannot result from applying the base64url encoding specified in section 5 of RFC 4648 to some sequence of bytes, throw a *SyntaxError* exception.
1. Let _bytes_ be the unique sequence of bytes such that applying the base64url encoding specified in section 5 of RFC 4648 to that sequence would produce _characters_.
1. Let _resultLength_ be the number of bytes in _bytes_.
1. Let _result_ be ? AllocateTypedArray(*"Uint8Array*", %Uint8Array%, %Uint8Array.prototype%, _resultLength_).
1. Let _result_ be ? AllocateTypedArray(*"Uint8Array"*, %Uint8Array%, %Uint8Array.prototype%, _resultLength_).
1. Set the value at each index of _result_.[[ViewedArrayBuffer]].[[ArrayBufferData]] to the value at the corresponding index of _bytes_.
1. Return _result_.
</emu-alg>
@@ -102,7 +116,7 @@ <h1>Uint8Array.fromBase64 ( _string_ [ , _options_ ] )</h1>
<emu-clause id="sec-uint8array.frompartialbase64">
<h1>Uint8Array.fromPartialBase64 ( _string_ [ , _options_ ] )</h1>
<emu-alg>
1. Set _string_ to ? GetStringForBase64(_string_).
1. Set _string_ to ? GetStringForBinaryEncoding(_string_).
1. Set _options_ to ? GetOptionsObject(_options_).
1. Let _alphabet_ be ? Get(_options_, *"alphabet"*).
1. If _alphabet_ is *undefined*, set _alphabet_ to *"base64"*.
@@ -111,7 +125,7 @@ <h1>Uint8Array.fromPartialBase64 ( _string_ [ , _options_ ] )</h1>
1. Let _more_ be ToBoolean(? Get(_options_, *"more"*)).
1. Let _extra_ be ? Get(_options_, *"extra"*).
1. If _extra_ is neither *undefined* nor *null*, then
1. Let _extraString_ be ? GetStringForBase64(_extra_).
1. Let _extraString_ be ? GetStringForBinaryEncoding(_extra_).
1. Set _string_ to the list-concatenation of _extraString_ and _string_.
1. If _more_ is *true*, then
1. TODO: think about handling of padding on _string_ / _extra_ in this case. This currently assumes no padding on either.
@@ -134,7 +148,7 @@ <h1>Uint8Array.fromPartialBase64 ( _string_ [ , _options_ ] )</h1>
1. If _characters_ cannot result from applying the base64url encoding specified in section 5 of RFC 4648 to some sequence of bytes, throw a *SyntaxError* exception.
1. Let _bytes_ be the unique sequence of bytes such that applying the base64url encoding specified in section 5 of RFC 4648 to that sequence would produce _characters_.
1. Let _resultLength_ be the number of bytes in _bytes_.
1. Let _result_ be ? AllocateTypedArray(*"Uint8Array*", %Uint8Array%, %Uint8Array.prototype%, _resultLength_).
1. Let _result_ be ? AllocateTypedArray(*"Uint8Array"*, %Uint8Array%, %Uint8Array.prototype%, _resultLength_).
1. Set the value at each index of _result_.[[ViewedArrayBuffer]].[[ArrayBufferData]] to the value at the corresponding index of _bytes_.
1. Let _obj_ be OrdinaryObjectCreate(%Object.prototype%).
1. Perform ! CreateDataPropertyOrThrow(_obj_, *"result"*, _result_).
@@ -143,6 +157,27 @@ <h1>Uint8Array.fromPartialBase64 ( _string_ [ , _options_ ] )</h1>
</emu-alg>
</emu-clause>

<emu-clause id="sec-uint8array.fromhex">
<h1>Uint8Array.fromHex ( _string_ )</h1>
<emu-alg>
1. Set _string_ to ? GetStringForBinaryEncoding(_string_).
1. TODO: consider stripping whitespace here.
1. Let _stringLen_ be the length of _string_.
1. If _stringLen_ modulo 2 is not 0, throw a *SyntaxError* exception.
1. If _string_ contains any code units which are not in *"0123456789abcdefABCDEF"*, throw a *SyntaxError* exception.
1. Let _resultLength_ be _stringLen_ / 2.
1. Let _result_ be ? AllocateTypedArray(*"Uint8Array"*, %Uint8Array%, %Uint8Array.prototype%, _resultLength_).
1. Let _index_ be 0.
1. Repeat, while _index_ &lt; _resultLength_,
  1. Let _stringIndex_ be _index_ * 2.
  1. Let _hexits_ be the substring of _string_ from _stringIndex_ to _stringIndex_ + 2.
  1. Let _byte_ be the integer value represented by _hexits_ in base-16 notation, using the letters A-F and a-f for digits with values 10 through 15.
  1. Perform ! IntegerIndexedElementSet(_result_, 𝔽(_index_), 𝔽(_byte_)).
  1. Set _index_ to _index_ + 1.
1. Return _result_.
</emu-alg>
</emu-clause>

<emu-clause id="sec-getuint8arraybytes" type="abstract operation">
<h1>
GetUint8ArrayBytes (
@@ -162,18 +197,18 @@ <h1>
</emu-alg>
</emu-clause>

<emu-clause id="sec-getstringforbase64" type="abstract operation">
<emu-clause id="sec-getstringforbinaryencoding" type="abstract operation">
<h1>
GetStringForBase64 (
GetStringForBinaryEncoding (
_arg_: an ECMAScript language value,
): either a normal completion containing a String or a throw completion
</h1>
<dl class="header"></dl>
<emu-alg>
1. If _arg_ is an Object, return ? ToString(_arg_).
1. NOTE: Because `[` is not a valid base64 character, the Strings returned by %Object.prototype.toString% will produce a SyntaxError below. Implementations are encouraged to provide an informative error message in that situation.
1. NOTE: Because `[` is not a valid base64 or hex character, the Strings returned by %Object.prototype.toString% will produce a SyntaxError during decoding. Implementations are encouraged to provide an informative error message in that situation.
1. Else if _arg_ is not a String, throw a TypeError exception.
1. NOTE: The above step is included to prevent errors such as accidentally passing `null` and receiving a Uint8Array containing the bytes « 0x9e, 0xe9, 0x65 ».
1. NOTE: The above step is included to prevent errors such as accidentally passing `null` to `fromBase64` and receiving a Uint8Array containing the bytes « 0x9e, 0xe9, 0x65 ».
1. Return _arg_.
</emu-alg>
</emu-clause>
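
The two NOTEs in GetStringForBinaryEncoding can be checked by hand. The sketch below is illustrative only: `decodeBase64Quad` is a hypothetical helper that decodes a single unpadded four-character group in the standard base64 alphabet, which is enough to show what a naive `ToString` coercion of `null` would decode to.

```javascript
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: decode exactly four base64 characters into three bytes.
function decodeBase64Quad(quad) {
  if (quad.length !== 4) {
    throw new SyntaxError('expected exactly four characters');
  }
  let bits = 0;
  for (const ch of quad) {
    const value = ALPHABET.indexOf(ch);
    if (value === -1) {
      throw new SyntaxError('not a base64 character: ' + ch);
    }
    bits = (bits << 6) | value; // accumulate 6 bits per character, 24 in total
  }
  return Uint8Array.of((bits >> 16) & 0xff, (bits >> 8) & 0xff, bits & 0xff);
}

// Coercing null to a string yields 'null', which happens to be valid base64.
const coercedNull = String(null);
const nullBytes = decodeBase64Quad(coercedNull);
console.log(Array.from(nullBytes, (b) => '0x' + b.toString(16)));

// A plain object stringifies to '[object Object]'; its first character is
// outside the base64 alphabet (and is not a hex digit either).
const coercedObject = String({});
console.log(ALPHABET.includes(coercedObject[0]));
```

Rejecting non-String primitives with a TypeError avoids the first case entirely, while the second case already fails on the character check.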