Skip to content

Scalable Polonius support on nightly #118

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
10 of 18 tasks
nikomatsakis opened this issue Jul 22, 2024 · 11 comments
Open
10 of 18 tasks

Scalable Polonius support on nightly #118

nikomatsakis opened this issue Jul 22, 2024 · 11 comments
Assignees
Milestone

Comments

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jul 22, 2024

Metadata
Point of contact @lqd
Team(s) types
Goal document 2025h1/Polonius

Summary

Keep working on implementing a native rustc version of the Polonius next generation borrow checking algorithm, that would scale better than the previous datalog implementation, continuing from the 2024h2 goal.

Tasks and status

  • Design review (@nikomatsakis)
  • Factoring out higher-ranked concerns from the main path (@amandasystems)
    • rewrite invalid universe constraints with outlives 'static constraints — PR #123720
    • completely remove placeholders — in progress, PR #130227
  • Replace parts of the borrow checker with location-insensitive Polonius
  • Location-sensitive prototype on nightly (@lqd)
    • create structures for location-dependent outlives constraints
    • build new constraint graph from typeck constraints and liveness constraints
    • update NLLs for required changes to local & region liveness, loan liveness & loan scopes, (possibly unreachable) kills, bidirectional traversal & active loans
    • land on nightly — PRs #134268, #134315, #134378, #134670, #134914, #134920, #134980 and #135290.
  • Debugging / dump tool for analysis of location-sensitive analysis (@lqd), PRs #136031, #136104 and #136278
  • Tests and validation of location-sensitive Polonius (@lqd) — in progress
    • limit regressions about diagnostics when using the new constraints on diagnostics tailored to the old constraints — PR #136299
    • make the full test suite pass — in progress
    • do a crater run for assertions and backwards-compatibility
    • expand test suite with tests about the new capabilities
  • Location-sensitive pass on nightly, tested on CI (@lqd)
  • Standard reviews (types Team): thanks to @jackh726 and @matthewjasper for the many reviews.
@lqd

This comment was marked as resolved.

@rust-lang rust-lang locked and limited conversation to collaborators Jul 25, 2024
@nikomatsakis
Copy link
Contributor Author

This issue is intended for status updates only.

For general questions or comments, please contact the owner(s) directly.

@lqd
Copy link
Member

lqd commented Sep 1, 2024

Status update:

  • we've completed the first task of removing universe errors out of the main path and moving higher-ranked concerns out of the borrowck, in @amandasystems' PR #123720. During SCC construction, we now detect these constraints and rewrite them to regular constraints outliving 'static. We have possible improvements in mind that we can do in the future here, but we've also just started trying to remove placeholders, by replacing them with constraints again.

  • we've also made more progress on the location-sensitive analysis, though not as much as we'd like. By now we have a "localized" constraint graph to implement propagation in a flow-sensitive manner, with constraints from typeck and liveness, and have it hooked it up to the existing borrow checking in non-lexical scope and active loans computation. It's still buggy though: there are a couple hundred of UI test failures at the moment, including important cases that don't work yet. Some of these failures are due to the diagnostics subsystem not yet handling flow-sensitive concerns. The next immediate steps are to analyze and fix the test failures, and to help with this we're also expanding the existing NLL logging, MIR dumps, and visualizations.

  • and we've also started work on the item to use more of the polonius model in the borrow-checker, though we're still in the preparation phase to have an overview of what we want, need, and can change here. Currently, the gated location-insensitive analysis is mainly implemented in the borrowck dataflow analyses, but we want to expand this to replace code and data existing in both models, avoid duplicate work, and simplify the implementation. For example, we can either change components internals to do so, or have them generic enough to work with both models: this applies at least to region liveness, region values, SCC values, etc. We also want to do this unification while taking the location-sensitive requirements into account, as well as constraints that existing downstream code can't be impacted or changed too much for now (e.g. diagnostics).

@lqd
Copy link
Member

lqd commented Oct 14, 2024

Key developments:

  • We were able to fix a bunch of test failures due to some off-by-one errors (in both typeck and livenesss constraints) from the old "mid-points" that don't exist anymore.
  • We're still debugging the large number of test failures, and continued the design work on the unified debugging visualization to help with that. We're currently manually doing the tracing work, and after a few run-throughs within the team, should be able to automate it.
  • We've also focused on some examples that we want to accept but do not yet, in particular all the variations of issue #47680. Some adjustments to the active loans computation may be needed, and/or the locations where effects happen (which also relate to the mid-points item above), which was unexpected. It shouldn't be hard to fix though.
  • Amanda has continued cleaning up placeholders in the work-in-progress PR #130227.
  • There was no particular progress on using more of the polonius model within the borrowck,. We also had a bit less time than usual with me going to RustConf, and Amanda going to EuroRust and doing a presentation on polonius 🎉. We'll add a link to the recording once they're made available.

@lqd
Copy link
Member

lqd commented Nov 15, 2024

Key developments, with good progress:

  • I've completed the loan liveness + active loans changes I mentioned last time, with kills stopping traversal in the new localized constraint graph
  • issues with missing liveness edges when regions flowed into universal regions were fixed
  • issues with liveness optimizations for NLLs, that don't apply anymore in a location-sensitive context, were fixed
  • I'm currently double-checking invariant edges on assignments/calls: they should be bidirectional, but some of the edges allowing backwards propagation are missing, causing some test failures
  • I identified an issue in diagnostics causing 80 or so differences in tests, adding to the diagnostics differences caused by the NLL optimization described above: some errors and handcrafted diagnostics are tailored to the current liveness NLL behavior and structure.
  • a lot of work was done on the NLL+polonius data visualization ("debugger"). There's still quite a lot to do here but it already was crucial in diagnosing and fixing the above issues, even in its early state
  • overall around 70 real test failures are left to analyze, and then I'll look at the diagnostics differences to see what can be done. The ICEs there expecting NLL constraints should be fixable. The others may require more creative solutions, e.g. piggybacking on the existing code.
  • we're also still working out how to best land all this work
  • Amanda has done non-trivial work in Remove placeholders completely rust#130227 which now also handles type tests. The PR should be ready to review, and a crater run is currently in-progress.
  • Amanda's EuroRust talk on polonius from last month is also now available on YouTube.

@lqd
Copy link
Member

lqd commented Dec 31, 2024

Here are the key developments for the last update of the year:

  • Amanda has made even more progress on removing placeholders, in particular on lazy constraints and early error reporting, as well as investigating issues with rewriting type tests. A few tests are still failing, and it seems error reporting and diagnostics will be hard to keep exactly as today. Unfortunately, what was devilishly marketed as hoped to be a straightforward refactor turned out to have a lot of hidden complexity
  • I've continued analyzing the test failures and fixing them. We're now at 6 failures of interest that still need investigation, and the rest are diagnostics changes.
  • I've also hit some kind of weakness in our initial approach: our unified naive reachability conflates the two different transitive closures we used to have. That should have been fine, and the problem didn't exist in our datalog implementation, but with backwards edges we can encounter kills when time-traveling (and they can be impactful). Who knew that time-traveling and spacetime reachability could be mind bending. We haven't fully thought things through just yet, but we can still model reachability as a two-step process anyways. The implementation has a conservative approximation for these cases, but it still was surprising. We'll need a bit more analysis.
  • even with the few test failures, we feel the prototype is working well enough that it's worthwhile to land. So I've spent most of the time cleaning it up and opening PRs: Foundations of location-sensitive polonius rust#134268, A couple of polonius fact generation cleanups rust#134315, An octuple of polonius fact generation cleanups rust#134378, Compute liveness constraints in location-sensitive polonius rust#134670, A couple datalog/borrowck cleanups rust#134914 have landed already. Convert typeck constraints in location-sensitive polonius rust#134920 and Location-sensitive polonius prototype: endgame rust#134980 are currently in review.
  • These last 2 actually complete end-to-end testing, and replace the location-insensitive analysis in-tree. With that, basically all of the progress done on the prototype, that we've been reporting here, should be available soon-ish. There's of course still a lot of work left to do, but it's still good a milestone.

(Happy Holidays and Happy New Year everyone!)

@lqd
Copy link
Member

lqd commented Jan 31, 2025

Key developments from this month:

@nikomatsakis nikomatsakis modified the milestones: 2024h2, 2025h1 Feb 18, 2025
@nikomatsakis
Copy link
Contributor Author

This is a continuing project goal, and the updates below this comment will be for the new period 2025h1

@lqd
Copy link
Member

lqd commented Mar 24, 2025

Here are the key developments for this update.

Amanda has continued on the placeholder removal task. In particular on the remaining issues with rewritten type tests. The in-progress work caused incorrect errors to be emitted under the rewrite scheme, and a new strategy to handle these was discussed. This has been implemented in the PR, and seems to work as hoped. So the PR should now be in a state that is ready for more in-depth review pass, and should hopefully land soon.

Tage has started his master's thesis with a focus on the earliest parts of the borrow checking process, in order to experiment with graded borrow-checking, incrementalism, avoiding work that's not needed for loans that are not invalidated, and so on. A lot of great progress has been made on these parts already, and more are being discussed even in the later areas (live and active loans).

I have focused on taking care of the remaining diagnostics and test failures of the location-sensitive analysis. For diagnostics in particular, the PRs mentioned in the previous updates have landed, and I've fixed a handful of NLL spans, all the remaining differences under the compare-mode, and blessed differences that were improvements. For the test failures, handling liveness differently in traversal fixed most of the remaining failures, while a couple are due to the friction with mid-points avoidance scheme. For these, we have a few different paths forward, but with different trade-offs and we'll be discussing and evaluation these in the very near future. Another two are still left to analyze in-depth to see what's going on.

Our near future focus will be to continue down the path to correctness while also expanding test coverage that feels lacking in certain very niche areas, and that we want to improve. At the same time, we'll also work on a figuring out a better architecture to streamline the entire end-to-end process, to allow early outs, avoid work that is not needed, etc.

@lqd
Copy link
Member

lqd commented May 5, 2025

Here are the key developments for the month of April

  • @amandasystems
  • Tage
    • continued experimenting and making progress with the early phase of the process, and making building constraints, and traversing them per loan, lazy
    • started extracting some of that work for discussion, review, PRs, as well as writing reports for his masters thesis
  • @lqd
    • continued on improving the algorithm. We're now at a point where we have an approximation of the datalog algorithm, which handles our UI tests -- except one where control flow in a loop connects to regions that are live before and after the loop: this causes a false positive that our datalog implementation used to accept (via a more comprehensive but slower approach).
    • we're currently discussing whether we can cut scope here, as this formulation accepts NLL problem case 3. We'll need to evaluate what limits this formulation imposes on expressiveness outside NLL problem case 3 and streaming iterators -- and whether it indeed has an easier path to becoming production ready. We'll also still try to see if it's possible to still improve the algorithm and avoid emitting errors on issue 46589, since we initially hoped to fix this one as well.

@lqd
Copy link
Member

lqd commented May 27, 2025

Here are the key developments for May, though there was a bit less time this month due to the All Hands.

@amandasystems: A few more rounds of reviews were done on rust-lang/rust#140466 (thanks to lcnr!), and most, if not all, of the feedback has been addressed already. Another PR was opened as a successor, containing another big chunk of work from the initial PR #130227: rust-lang/rust#140737.

@tage64: The work discussed in the previous updates has been extracted into a few PRs, mostly to do perf runs to be able to gauge the overhead in the in-progress implementation. First, an alternative implementation to rustc's dense bitset, which is used extensively in dataflow analyses such as the ones in the borrow checker, for example. Then, a prototype of the algorithm discussed in prior updates, trying to make the location-sensitive constraints built lazily, as well as the loans in scope themselves. (And the union of these two in #141583)

@lqd: As discussed in the previous update, I've tried to see if we can limit scope here by evaluating the current algorithm a bit more: the expressiveness it allows, and where it fails. I've also evaluated all the open issues about NLL expressiveness that we hoped to fix, and see the ones we support now or could defer to future improvements. It seems possible. I've also started to have some idea of the work needed to make it more production-ready. That includes the experiments made with Tage above, but also trying to lower the total overhead by finding wins in NLLs, and here I e.g. have some improvements in-flight for the dataflow analysis used in liveness.

All Hands: we discussed with t-types the plan and in-progress PRs about opaque types, how they impact member constraints and in turn the constraint graph and SCCs. Some more work is needed here to ensure member constraints are correctly handled, even though they should only impact the SCCs and not the borrow checking algorithm per se (but there still are possible ambiguity issues if we don't take flow sensitivity into account here).

(Fun and interesting aside: there's an RFC to add a polonius-like lifetime analysis to clang)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants