Commit 85ac385

compile: add fast path for c_char
This fixes a pretty bad performance bug in the NFA compiler. In particular, c_char was implemented by delegating to c_class, which is correct but rather costly to do for every single character in a regex. As a result, far more work than necessary went through the class compilation infrastructure, including the suffix caching. We fix this by special-casing c_char.

This speeds up regex compilation in #657 by around 30%.

Fixes #657
1 parent 3ff6ae1 commit 85ac385

1 file changed, +14 -1 lines changed

src/compile.rs

Lines changed: 14 additions & 1 deletion
@@ -391,7 +391,20 @@ impl Compiler {
     }
 
     fn c_char(&mut self, c: char) -> Result {
-        self.c_class(&[hir::ClassUnicodeRange::new(c, c)])
+        if self.compiled.uses_bytes() {
+            if c.is_ascii() {
+                let b = c as u8;
+                let hole =
+                    self.push_hole(InstHole::Bytes { start: b, end: b });
+                self.byte_classes.set_range(b, b);
+                Ok(Patch { hole, entry: self.insts.len() - 1 })
+            } else {
+                self.c_class(&[hir::ClassUnicodeRange::new(c, c)])
+            }
+        } else {
+            let hole = self.push_hole(InstHole::Char { c: c });
+            Ok(Patch { hole, entry: self.insts.len() - 1 })
+        }
     }
 
     fn c_class(&mut self, ranges: &[hir::ClassUnicodeRange]) -> Result {
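
The three-way dispatch in the new c_char can be illustrated in isolation. The following is only a minimal sketch, not the crate's actual code: the `Emitted` enum and the `compile_char` function are hypothetical stand-ins for the real `InstHole`/`Patch` machinery, and the `uses_bytes` flag is passed in as a plain parameter. It shows that ASCII characters in byte-based programs and all characters in char-based programs are emitted directly. Only non-ASCII characters in byte-based programs, which encode to more than one UTF-8 byte, still fall back to the general class path.

```rust
/// Hypothetical, simplified stand-in for what the compiler emits.
/// None of these names come from the regex crate itself.
#[derive(Debug, PartialEq)]
enum Emitted {
    /// Fast path for byte-based programs: a single-byte range.
    Bytes { start: u8, end: u8 },
    /// Fast path for char-based programs: a single scalar value.
    Char { c: char },
    /// Slow path: hand a one-element range to the class compiler.
    ViaClass { start: char, end: char },
}

/// Mirrors the branching in the patched c_char (assumed simplification:
/// no instruction list, byte-class bookkeeping, or error handling).
fn compile_char(uses_bytes: bool, c: char) -> Emitted {
    if uses_bytes {
        if c.is_ascii() {
            // An ASCII char is exactly one byte, so it can be emitted directly.
            let b = c as u8;
            Emitted::Bytes { start: b, end: b }
        } else {
            // Multi-byte UTF-8 sequences still need the general machinery.
            Emitted::ViaClass { start: c, end: c }
        }
    } else {
        // Char-based programs match whole scalar values, so no class is needed.
        Emitted::Char { c }
    }
}
```

Under these assumptions, `compile_char(true, 'a')` takes the byte fast path, `compile_char(true, 'é')` takes the class fallback, and `compile_char(false, 'é')` takes the char fast path.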
