Removed upb_Decoder's stateful, bespoke logic for field lookup. #21707

copybara-service · 2025-05-12T01:39:07Z

`upb_Decoder` used to have its own logic for finding a MiniTable field by number. Instead of using the standard `upb_MiniTable_FindFieldByNumber()` function, it used a custom linear search that cached the most recently seen field index. The theory behind this design was that field numbers are usually serialized in increasing order, so by linear searching from the last seen index, we would have a small expected number of elements to inspect before finding the matching one.
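To make the two strategies concrete, here is a minimal sketch, not the actual upb code. `SketchField`, `SketchTable`, `sketch_find_cached()`, and `sketch_find_stateless()` are hypothetical names, and a plain binary search over fields sorted by number stands in for `upb_MiniTable_FindFieldByNumber()`, whose real implementation is not shown in this description.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a MiniTable and its fields (not upb types). */
typedef struct {
  uint32_t number; /* field number as it appears on the wire */
} SketchField;

typedef struct {
  const SketchField* fields; /* assumed sorted by ascending number */
  size_t field_count;
} SketchTable;

/* Stateful lookup: scan linearly starting at the cached index, wrapping
 * around once. On a hit the cache is updated, so the next lookup is cheap
 * when field numbers arrive in increasing order. */
static const SketchField* sketch_find_cached(const SketchTable* t,
                                             uint32_t number,
                                             size_t* last_field_index) {
  size_t n = t->field_count;
  if (n == 0) return NULL;
  size_t start = *last_field_index < n ? *last_field_index : 0;
  for (size_t step = 0; step < n; step++) {
    size_t i = (start + step) % n;
    if (t->fields[i].number == number) {
      *last_field_index = i;
      return &t->fields[i];
    }
  }
  return NULL; /* unknown field; the cache is left unchanged */
}

/* Stateless lookup: binary search, no per-message state to carry. */
static const SketchField* sketch_find_stateless(const SketchTable* t,
                                                uint32_t number) {
  size_t lo = 0, hi = t->field_count;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    uint32_t candidate = t->fields[mid].number;
    if (candidate == number) return &t->fields[mid];
    if (candidate < number) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return NULL;
}
```

In this sketch the cached version costs one comparison per field when the input is in order, but up to `field_count` comparisons when it is not, while the stateless version costs the same regardless of order.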

In practice, it appears that we can use the generic searching logic with no loss in performance -- in fact, performance increases by as much as 6% (see attached benchmark results).

Removing the stateful search has several benefits:

1. This simplifies the decoder since we no longer have to preserve this extra state (`last_field_index`). Note that it was not even possible to put this variable into `upb_Decoder` because it is preserved per parsed message, so we previously needed a *stack* of `last_field_index` values. This CL lets us get rid of that completely.
2. This makes parsing performance less sensitive to whether field numbers are serialized in order.
3. This simplification will help as we refine the dispatch logic that switches between the standard and fasttable decoders.
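The per-message state mentioned in the first benefit can be pictured with a small sketch. This is an illustration under assumptions, not how upb stored the value: `SketchCursorStack`, `SKETCH_MAX_DEPTH`, and the three helper functions are hypothetical. It only shows why a single decoder-wide variable would not do: each nested message needs its own cursor, and the parent's cursor must come back when the submessage ends.

```c
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_MAX_DEPTH = 64 }; /* hypothetical nesting limit */

/* One cached field index per message currently being parsed. */
typedef struct {
  size_t last_field_index[SKETCH_MAX_DEPTH];
  size_t depth; /* number of messages on the stack */
} SketchCursorStack;

/* Entering a (sub)message pushes a fresh cursor starting at index 0.
 * Returns false if the nesting limit would be exceeded. */
static bool sketch_enter_message(SketchCursorStack* s) {
  if (s->depth == SKETCH_MAX_DEPTH) return false;
  s->last_field_index[s->depth++] = 0;
  return true;
}

/* Leaving a message pops its cursor, exposing the parent's cursor again. */
static void sketch_leave_message(SketchCursorStack* s) {
  if (s->depth > 0) s->depth--;
}

/* Cursor of the innermost message, or NULL when nothing is being parsed. */
static size_t* sketch_current_cursor(SketchCursorStack* s) {
  return s->depth > 0 ? &s->last_field_index[s->depth - 1] : NULL;
}
```

With a stateless lookup, none of this bookkeeping is needed.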

PiperOrigin-RevId: 758304268

copybara-service bot force-pushed the test_757333347 branch 2 times, most recently from ac8144f to dcfee8e Compare May 13, 2025 18:00

copybara-service bot force-pushed the test_757333347 branch from dcfee8e to 1c088d8 Compare May 13, 2025 18:33

copybara-service bot merged commit 1c088d8 into main May 13, 2025

copybara-service bot deleted the test_757333347 branch May 13, 2025 18:33

github-pages bot temporarily deployed to github-pages May 13, 2025 18:34 Inactive

protobuf-team-bot temporarily deployed to github-pages May 13, 2025 18:44 — with GitHub Pages Inactive
