Commit 0d5360f

Auto merge of #67143 - DarkKirb:master, r=<try>
Optimize is_ascii_digit() and is_ascii_hexdigit()

The current implementation of these two functions does redundant checks that are currently not optimized away by LLVM. The is_digit() method on char only matches ASCII digits. It does not do the extra check of whether the character is in ASCII range.

This change optimizes the code in both debug and release mode on various major architectures. It removes all conditional branches on x86 and all conditional branches and returns on ARM.

Pseudocode version of LLVM's optimized output:

old version:

```rust
if c >= 128 { return false; }
c -= 48;
if c > 9 { return false; }
true
```

new version:

```rust
c -= 48;
if c < 10 { return true; }
false
```

The is_ascii_hexdigit() change similarly shortens the emitted code in release mode. The optimized version uses bit manipulation to remove one comparison and conditional branch.

It would be possible to add a similar change to the is_digit() code, but that has not been done yet.

Godbolt comparison between the two implementations: https://godbolt.org/z/gDwfSF
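The two pseudocode versions in the commit message can be compared directly. The following is a minimal sketch, not code from the commit: `old_check`, `new_check`, and `agree_on_all_code_points` are hypothetical helper names, and the pseudocode's `c -= 48` is assumed to be a wrapping (unsigned) subtraction, so values below 48 wrap around to large numbers and fail the range test.

```rust
/// Hypothetical helper mirroring the "old version" pseudocode:
/// an explicit ASCII range check followed by the digit range check.
fn old_check(c: u32) -> bool {
    if c >= 128 {
        return false;
    }
    // Assumption: the subtraction wraps instead of panicking on underflow.
    let d = c.wrapping_sub(48);
    if d > 9 {
        return false;
    }
    true
}

/// Hypothetical helper mirroring the "new version" pseudocode:
/// a single wrapping subtraction and one unsigned comparison.
fn new_check(c: u32) -> bool {
    c.wrapping_sub(48) < 10
}

/// Compares both helpers for every value up to the largest
/// Unicode code point (0x10FFFF).
fn agree_on_all_code_points() -> bool {
    (0..=0x10FFFFu32).all(|c| old_check(c) == new_check(c))
}
```

Under that assumption the ASCII check is redundant: any value of 128 or above is still at least 80 after subtracting 48, so it can never pass the `< 10` comparison.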
2 parents e862c01 + 34f89fd commit 0d5360f

1 file changed: +9 -2 lines changed

src/libcore/char/methods.rs

Lines changed: 9 additions & 2 deletions
@@ -1210,7 +1210,7 @@ impl char {
     #[stable(feature = "ascii_ctype_on_intrinsics", since = "1.24.0")]
     #[inline]
     pub fn is_ascii_digit(&self) -> bool {
-        self.is_ascii() && (*self as u8).is_ascii_digit()
+        self.is_digit(10)
     }
 
     /// Checks if the value is an ASCII hexadecimal digit:
@@ -1245,7 +1245,14 @@ impl char {
     #[stable(feature = "ascii_ctype_on_intrinsics", since = "1.24.0")]
     #[inline]
     pub fn is_ascii_hexdigit(&self) -> bool {
-        self.is_ascii() && (*self as u8).is_ascii_hexdigit()
+        if !self.is_ascii() {
+            return false;
+        }
+        if self.is_digit(10) {
+            return true;
+        }
+        let code = (*self as u8) & !0x20; // 0x20 is the case bit
+        code >= b'A' && code <= b'F'
     }
 
     /// Checks if the value is an ASCII punctuation character:
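
The case-bit step in the new is_ascii_hexdigit() body can be checked in isolation. The sketch below is illustrative only: `hexdigit_case_bit` and `matches_std_for_all_chars` are hypothetical free functions that follow the same steps as the added lines, compared against the standard library's `char::is_ascii_hexdigit`.

```rust
/// Hypothetical free function following the steps of the new method body.
fn hexdigit_case_bit(c: char) -> bool {
    if !c.is_ascii() {
        return false;
    }
    if c.is_digit(10) {
        return true;
    }
    // Clearing bit 0x20 maps 'a'..='f' (0x61..=0x66) onto 'A'..='F'
    // (0x41..=0x46), so one range test covers both cases.
    let code = (c as u8) & !0x20;
    (b'A'..=b'F').contains(&code)
}

/// Compares the sketch with the standard library for every valid `char`.
fn matches_std_for_all_chars() -> bool {
    (0..=0x10FFFFu32)
        .filter_map(std::char::from_u32) // skips the surrogate range
        .all(|c| hexdigit_case_bit(c) == c.is_ascii_hexdigit())
}
```

Clearing the case bit is safe here because the ASCII check runs first: the cast to `u8` cannot truncate a larger code point into the 'A'..='F' range.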
