subscriber: use state machine to parse `EnvFilter` directives #3243

djc · 2025-03-25T20:54:26Z

Intended to fix #3174. Have not done any benchmarks yet -- suggestions on how best to approach that are welcome.

This is a little bit more code, but IMO maybe slightly more readable?

It's correct enough that it passes the tests, there might be edge cases that aren't covered but those should be easy to solve.

tracing-subscriber/src/filter/env/directive.rs

dpc · 2025-03-26T00:57:25Z

> hyperfine --warmup 30 -N ./target/release/dotr-before ./target/release/dotr-after
Benchmark 1: ./target/release/dotr-before
  Time (mean ± σ):       1.2 ms ±   0.1 ms    [User: 0.4 ms, System: 0.7 ms]
  Range (min … max):     1.1 ms …   1.9 ms    2044 runs

Benchmark 2: ./target/release/dotr-after
  Time (mean ± σ):     613.8 µs ±  79.6 µs    [User: 289.1 µs, System: 254.0 µs]
  Range (min … max):   511.5 µs … 979.8 µs    5264 runs

Summary
  ./target/release/dotr-after ran
    2.03 ± 0.34 times faster than ./target/release/dotr-before

Added true just to have a baseline:

Benchmark 1: ./target/release/dotr-before
  Time (mean ± σ):       1.2 ms ±   0.1 ms    [User: 0.4 ms, System: 0.7 ms]
  Range (min … max):     1.1 ms …   1.8 ms    2535 runs

Benchmark 2: ./target/release/dotr-after
  Time (mean ± σ):     611.7 µs ±  80.3 µs    [User: 295.6 µs, System: 244.1 µs]
  Range (min … max):   518.8 µs … 1663.5 µs    3825 runs

Benchmark 3: true
  Time (mean ± σ):     595.8 µs ±  74.1 µs    [User: 311.4 µs, System: 215.4 µs]
  Range (min … max):   512.3 µs … 1035.5 µs    5362 runs

This app sets RUST_LOG internally, so setting it from the outside doesn't make a difference.

AFAICT you fixed it and it's nearly free now.

klensy · 2025-03-27T12:07:20Z

After this, regex can be removed from Cargo.toml (for tracing-subscriber), only left in dev-deps?

djc · 2025-03-27T12:21:16Z

After this, regex can be removed from Cargo.toml (for tracing-subscriber), only left in dev-deps?

Nice catch, yes! Added that into the first commit.

tracing-subscriber/src/filter/env/directive.rs

hds

Could you please rebase this against master. We merge everything in there first and then David or I will handle backporting to v0.1.x.

Before making this change, I think we need more tests to ensure that the behavior isn't changing. Especially because there are things I'm seeing in the code that are different to what the docs say.

I'm happy to have a look at writing some of those tests, but it may take me a little bit to get to it.

tracing-subscriber/src/filter/env/directive.rs

With a view to replacing the env filter parsing in #3243, this change adds some additional tests to improve our confidence in not breaking existing behavior. Tests for empty and invalid directives is added, as well as tests for directives overriding less specific directives. One of the latter tests (`more_specific_dynamic_filter_less_verbose`) currently fails, which is a known issue reported in #1388. The test is in place with `#[should_panic]` and can be reverted to a normal test when that behavior is fixed. The documentation on `parse`, `parse_lossy`, `from_env_lossy` and `try_from_env` has also been made more explicit as to the result of an empty directive (the default directive is used). This was already documented on `with_default_directive`.

tracing-subscriber/src/filter/env/directive.rs

djc · 2025-04-30T16:27:24Z

tracing-subscriber/src/filter/env/directive.rs

+        for (i, c) in from.trim().char_indices() {
+            state = match (state, c) {
+                (Start, '[') => Span { span_start: i + 1 },
+                (Start, c) if !c.is_alphanumeric() => return Err(ParseError::new()),


I added this arm to pass the new test cases from #3262. This seems like a reasonable way to weed out invalid stuff while allowing non-ASCII target names?

I think we need to match the previous regex here, and is_alphanumeric is missing a couple of characters.

In the current implementation, a target accepts [\w:-] so as well as : and -, we need everything specified by \w, which the regex docs specify as: word character (\p{Alphabetic} + \p{M} + \d + \p{Pc} + \p{Join_Control}).

Is there some way we can get those character classes in Rust?

Otherwise, the real thing that made that test invalid is that it started with a comma. I'm happy to modify the test if we want to accept starting with a comma (and ignore empty directives). I think it would be acceptable to be more permissive about what characters we accept for the target (or a span name for example), but we should not be more restrictive.

I added a !['-', ':', '_'] clause, which covers the : and - that were there explicitly plus _ as the character from \p{Pc} that seems like something we want to support. Of course \p{Alphabetic} and \d are covered via is_alphanumeric(). I don't think there's a straightforward way to support check for the other properties, but it also seems pretty unlikely these will cause problems.

djc · 2025-04-30T16:28:22Z

Rebased on master
Added back comment
Added additional arm to pass new test cases from subscriber: increase EnvFilter test coverage #3262

hds · 2025-05-02T15:23:32Z

tracing-subscriber/src/filter/env/directive.rs

+        for (i, c) in from.trim().char_indices() {
+            state = match (state, c) {
+                (Start, '[') => Span { span_start: i + 1 },
+                (Start, c) if !c.is_alphanumeric() => return Err(ParseError::new()),


I think we need to match the previous regex here, and is_alphanumeric is missing a couple of characters.

In the current implementation, a target accepts [\w:-] so as well as : and -, we need everything specified by \w, which the regex docs specify as: word character (\p{Alphabetic} + \p{M} + \d + \p{Pc} + \p{Join_Control}).

Is there some way we can get those character classes in Rust?

Otherwise, the real thing that made that test invalid is that it started with a comma. I'm happy to modify the test if we want to accept starting with a comma (and ignore empty directives). I think it would be acceptable to be more permissive about what characters we accept for the target (or a span name for example), but we should not be more restrictive.

hds

Thank you for all your work on this implementation!

djc · 2025-05-08T10:10:42Z

Thank you for all your work on this implementation!

Thank you for the careful reviews!

There is a report in #3174 that even in release mode, building the regex used to parse `EnvFilter` directives can take a relatively large amount of time (600us). This change replaces the `regex` based parsing of the directives with a state machine implementation that is faster and also easier to reason about. Fixes: #3174

djc requested review from hawkw and a team as code owners March 25, 2025 20:54

djc mentioned this pull request Mar 25, 2025

tracing-subscriber's Directive parsing adds 600us to startup time #3174

Closed

dpc reviewed Mar 26, 2025

View reviewed changes

tracing-subscriber/src/filter/env/directive.rs Outdated Show resolved Hide resolved

djc force-pushed the parse-directive branch from 12445d7 to b298124 Compare March 27, 2025 12:21

joshka reviewed Mar 29, 2025

View reviewed changes

tracing-subscriber/src/filter/env/directive.rs Outdated Show resolved Hide resolved

djc force-pushed the parse-directive branch 2 times, most recently from d533cdf to a4484c3 Compare March 29, 2025 10:38

hds requested changes Apr 11, 2025

View reviewed changes

tracing-subscriber/src/filter/env/directive.rs Show resolved Hide resolved

tracing-subscriber/src/filter/env/directive.rs Show resolved Hide resolved

hds mentioned this pull request Apr 30, 2025

subscriber: increase EnvFilter test coverage #3262

Merged

djc force-pushed the parse-directive branch from a4484c3 to bab5257 Compare April 30, 2025 16:07

djc requested a review from yaahc as a code owner April 30, 2025 16:07

djc changed the base branch from v0.1.x to master April 30, 2025 16:07

djc requested a review from davidbarsky as a code owner April 30, 2025 16:07

djc force-pushed the parse-directive branch from bab5257 to 2007997 Compare April 30, 2025 16:21

djc commented Apr 30, 2025

View reviewed changes

tracing-subscriber/src/filter/env/directive.rs Outdated Show resolved Hide resolved

djc force-pushed the parse-directive branch from 2007997 to 06e5695 Compare April 30, 2025 16:26

djc commented Apr 30, 2025

View reviewed changes

djc requested a review from hds April 30, 2025 16:28

hds requested changes May 2, 2025

View reviewed changes

Use state machine to parse directives

d2d99b6

djc force-pushed the parse-directive branch from 06e5695 to d2d99b6 Compare May 5, 2025 13:16

hds approved these changes May 8, 2025

View reviewed changes

hds changed the title ~~Use state machine to parse directives~~ subscriber: use state machine to parse EnvFilter directives May 8, 2025

hds merged commit 28561e0 into master May 8, 2025
57 checks passed

hds deleted the parse-directive branch May 8, 2025 09:51

hds mentioned this pull request May 21, 2025

Backport everything up to May 2025 #3283

Draft

21 tasks

Kobzol mentioned this pull request May 22, 2025

improve rustc wrapper startup time? rust-lang/rustup#2626

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

subscriber: use state machine to parse `EnvFilter` directives #3243

subscriber: use state machine to parse `EnvFilter` directives #3243

Uh oh!

djc commented Mar 25, 2025

Uh oh!

Uh oh!

dpc commented Mar 26, 2025 •

edited

Loading

Uh oh!

klensy commented Mar 27, 2025 •

edited

Loading

Uh oh!

djc commented Mar 27, 2025

Uh oh!

Uh oh!

hds left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

djc Apr 30, 2025

Uh oh!

hds May 2, 2025

Uh oh!

djc May 5, 2025

Uh oh!

djc commented Apr 30, 2025

Uh oh!

hds May 2, 2025

Uh oh!

hds left a comment

Uh oh!

Uh oh!

djc commented May 8, 2025

Uh oh!

Uh oh!

subscriber: use state machine to parse EnvFilter directives #3243

subscriber: use state machine to parse EnvFilter directives #3243

Uh oh!

Conversation

djc commented Mar 25, 2025

Uh oh!

Uh oh!

dpc commented Mar 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

klensy commented Mar 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

djc commented Mar 27, 2025

Uh oh!

Uh oh!

hds left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

djc Apr 30, 2025

Choose a reason for hiding this comment

Uh oh!

hds May 2, 2025

Choose a reason for hiding this comment

Uh oh!

djc May 5, 2025

Choose a reason for hiding this comment

Uh oh!

djc commented Apr 30, 2025

Uh oh!

hds May 2, 2025

Choose a reason for hiding this comment

Uh oh!

hds left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

djc commented May 8, 2025

Uh oh!

Uh oh!

subscriber: use state machine to parse `EnvFilter` directives #3243

subscriber: use state machine to parse `EnvFilter` directives #3243

dpc commented Mar 26, 2025 •

edited

Loading

klensy commented Mar 27, 2025 •

edited

Loading