get rid of some false negatives in rustdoc::broken_intra_doc_links #132748
r? @notriddle rustbot has assigned @notriddle. Use |
df3d0f6 to 8ae6586
This looks like a good idea, but I have found one problem. Consider a sample like this:
//@ check-pass
#![no_std]
#![deny(rustdoc::broken_intra_doc_links)]
// regression test for https://github.com/rust-lang/rust/issues/54191
// false positives
//! This is not an intra-doc link: [`foobar`]
//!
//! [`foobar`]: /index.html
That test case fails, because it thinks `/index.html` is an intra-doc link. You're right that certain kinds of links aren't allowed to be URLs: those are the `Unknown` link types.
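To make "url-like" concrete, here is a minimal sketch of that kind of check. The function name `looks_url_like` and the exact set of allowed punctuation are assumptions made for illustration, not rustdoc's actual implementation; the idea is only that a destination containing characters that cannot appear in a Rust path (such as the `/` in `/index.html`) is probably a URL rather than an item path.

```rust
/// Hypothetical approximation of a "looks like a URL" check.
/// Assumption: a string is url-like when it contains any character that is
/// neither alphanumeric nor in a small set of punctuation used in Rust paths.
fn looks_url_like(path: &str) -> bool {
    // Illustrative set of punctuation allowed in a path-like string.
    const PATH_PUNCTUATION: &str = ":_<>, !*&;";
    path.chars()
        .any(|ch| !(ch.is_alphanumeric() || PATH_PUNCTUATION.contains(ch)))
}
```

Under these assumptions `/index.html` is flagged as url-like because of the `/` and `.`, while `foobar` and `std::vec::Vec` are not.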
welp, turns out the standard library itself contains several of these false negatives!
...or perhaps there's some deeper bug with intra-doc link generation? there seem to be some false positives containing spaces, even though there is a matching reference definition... I guess we can just re-enable unconditionally ignoring links with spaces?
Ok, this breaks a lot of ui tests. This might actually have a pretty big impact, maybe we should do a crater run or something..?
It looks fine to me. Here are the PRs where each of these test cases was added:
The last two are both contrived ICE tests, and all of them are intended to be parsed as intra-doc links. Making it so that they warn seems fine to me. r? @GuillaumeGomez do you also think this is a suitable bug fix?
I think so. I'm not completely satisfied with the current impact though. Intra-doc links should not be triggered on items that contain characters which can't be in an ident or a path, like |
@GuillaumeGomez what about just amending the current warning to account for other common mistakes? honestly makes me wish we had regardless, i think that should be a separate issue, since the lint output isn't the most helpful in general (it just recommends escaping the brackets, and doesn't mention other common issues, such as misspelled link references) |
Do you have an example of before/after of what you have in mind by any chance? |
well, the most obvious one is some sort of "consider adding a reference: and perhaps we should also mention surrounding code by backticks, since code snippets should generally be in |
Sounds good to me. The distance between two items could be nice too (as a follow-up?).
a918960 to 6747e21
I know @notriddle wanted to actually make the lints more generous in some cases, but the way this is currently implemented is that links that look like urls get completely ignored by the entire intra-doc link system, not just ignored by the lint. so, the best I can do without significant overhauls is to revert to the previous buggy behavior if there are no backticks.
this is in an effort to reduce the amount of code churn caused by this lint triggering on text that was never meant to be a link. a more principled heuristic for ignoring lints is not possible without extensive changes, due to the lint emitting code being so far away from the link collecting code, and the fact that only the link collecting code has access to details about how the link appears in the unnormalized markdown.
collapsed links and reference links have a pretty particular syntax, it seems unlikely they would show up on accident.
Some changes occurred in tests/rustdoc-json |
513d9b7 to 32a73bf
…aumeGomez
jsondocck: Require command is at start of line

In one place we use `///@` instead of `//@`. The test-runner allowed it, but it probably shouldn't.

Ran into by @lolbinarycat in rust-lang#132748 (comment):

```
error: unknown disambiguator `?(`
##[error] --> /checkout/tests/rustdoc-json/fns/return_type_alias.rs:3:25
  |
3 | ///@ set foo = "$.index[?(@.name=='Foo')].id"
  |                         ^^
  |
```

Maybe it's also worth erroring on this like we added in rust-lang#137103

r? @GuillaumeGomez
Rollup merge of rust-lang#140076 - aDotInTheVoid:jsondocline, r=GuillaumeGomez
I'm quite happy with the current state of this, and think that the vast majority of new warnings created by it will be genuine mistakes. Just waiting on code review now.
// |-------------------------------------------------------|
// |              | is shortcut link   | not shortcut link |
// |--------------|--------------------|-------------------|
// | has backtick | never ignore       | never ignore      |
// | no backtick  | ignore if url-like | never ignore      |
// |-------------------------------------------------------|
let ignore_urllike =
    can_be_url || (ori_link.kind == LinkType::ShortcutUnknown && !ori_link.link.contains('`'));
if ignore_urllike && should_ignore_link(path_str) {
It looks like the complete truth table is:

| | unknown shortcut link | unknown reference / collapsed link | other link |
|---|---|---|---|
| has backtick | never ignore | never ignore | ignore if url-like |
| no backtick | ignore if url-like | never ignore | ignore if url-like |

Assuming I'm understanding the logic correctly (the difference is to things like `[first](second)`).
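For reference, the complete truth table can be written out as a small standalone function. This is only a sketch: the `LinkKind` enum and the name `urllike_check_applies` are hypothetical stand-ins invented here, and it assumes `can_be_url` is true exactly for the "other link" column, which is what the table implies.

```rust
/// Hypothetical stand-in for the three columns of the truth table.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LinkKind {
    /// `[foo]` with no matching reference definition.
    ShortcutUnknown,
    /// `[text][foo]` or `[foo][]` with no matching reference definition.
    ReferenceOrCollapsedUnknown,
    /// Every other kind of link, e.g. the inline link `[first](second)`.
    Other,
}

/// Returns true when a url-like path is allowed to be ignored
/// ("ignore if url-like"), and false for the "never ignore" cells.
fn urllike_check_applies(kind: LinkKind, link_text: &str) -> bool {
    // Assumption: only the "other link" column can hold a real URL.
    let can_be_url = kind == LinkKind::Other;
    can_be_url || (kind == LinkKind::ShortcutUnknown && !link_text.contains('`'))
}
```

Each cell of the table corresponds to one call, for example `urllike_check_applies(LinkKind::ShortcutUnknown, "foo")` for the "no backtick / unknown shortcut link" cell.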
That is correct. Technically I do explicitly define the scope of the existing table with the line "here's a truth table for how link kinds that cannot be urls are handled:"
the line is currently 66 chars long, i should be able to fit in the other case without upsetting tidy (100 char limit), do you want me to?
rustdoc will not try to do intra-doc linking if the "path" of a link looks too much like a "real url".
however, only inline links (`[text](url)`) can actually contain a url, other types of links (reference links, shortcut links) contain a reference which is later resolved to an actual url.
the "path" in this case cannot be a url, and therefore it should not be skipped due to looking like a url.
fixes #54191
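
As an illustration of the distinction drawn in this description, the doc comment below shows the four link forms side by side. The item name `link_forms` is made up for the example; the link targets are ordinary standard library paths.

```rust
/// Demonstrates the link forms discussed above; only the first one carries
/// its destination inline.
///
/// - Inline link: [the Rust website](https://www.rust-lang.org/)
/// - Reference link: [a growable array][std::vec::Vec]
/// - Collapsed link: [`Vec`][]
/// - Shortcut link: [`Vec`]
///
/// In the last three forms the bracketed label is a reference, not a URL, so
/// rustdoc can try to resolve it as a path when no definition such as
/// `[label]: https://example.com` exists.
pub fn link_forms() {}
```

Because the label in the last three forms can never be a URL on its own, skipping them for "looking like a URL" would hide genuinely broken links.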