Skip to content

WIP: Verify links in docs #17163

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from
Draft

WIP: Verify links in docs #17163

wants to merge 4 commits into from

Conversation

Dedelweiss
Copy link
Contributor

@Dedelweiss Dedelweiss commented Mar 28, 2023

Here is my proposed PR to check that the links in the API comments (@see) are valid and point to valid static documentation.

Validations:

Def ProcessLink:

This function allows to check both that a string is a valid url with the Try, but also that this link leads to an existing static documentation.
When one of these two conditions is not met, a warning is sent when the documentation is generated.

Result:

Screenshot 2023-03-29 at 12 01 09

Disable Validation:

First method

A yaml in a folder, with each file and sub-folder having a warning validation disabled.

  • Take the path where a method is started
  • Verify if a validation.yaml is present in the folder or the previous folder with a line like:
    • noValidation: true
  • if yes, disable the warning.

Second method

A yaml with a folder list, all the files in each folder have warning validation disabled

  • Search for the generalValidation.yaml file in a specific folder (Like ressource)
  • Take the name of each folder in this file
  • During the validation @see, if the name of the folder of the file correspond to one of the name in the file, deactivate the warning

Fixes: #16229

@Dedelweiss
Copy link
Contributor Author

Dedelweiss commented Mar 28, 2023

I also wanted to add a check for external urls with the following method:

  val connection = url.openConnection.asInstanceOf[HttpURLConnection]
  connection.setRequestMethod("GET")
  connection.connect()
  val responseCode = connection.getResponseCode
  connection.disconnect()
if responseCode == 404 then
   report.warn....

This verification is quite important because to check that the url corresponds to a static documentation, I check that it contains /docs/ (which is maybe to be done differently). But this can correspond to any other external documentation link such as https://docs.oracle.com/javase/8/docs/api/java/lang/FunctionalInterface.html and this puts a warning.

But with this method, the time to generate the documentation is much more increased, so I found it not very efficient to add it. If you have other methods I am interested.

@Dedelweiss Dedelweiss changed the title Verify links in docs WIP: Verify links in docs Mar 28, 2023
@Dedelweiss Dedelweiss marked this pull request as draft March 28, 2023 14:26
@Dedelweiss Dedelweiss changed the title WIP: Verify links in docs Fix: Verify links in docs Mar 29, 2023
@Dedelweiss Dedelweiss marked this pull request as ready for review March 29, 2023 09:55
Copy link
Contributor

@Florian3k Florian3k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation currently generates a lot of false-positives (over 2000) most of which look like this:

Link to foreach is not valid
Link to flatMap is not valid
Link to toLeft is not valid
Link to `java.lang.ref.Reference` is not valid
Link to scala.collection.Iterator, method `grouped` is not valid
Link to scala.math.Ordering is not valid
Link to scala.math.Ordering is not valid
Link to scala.collection.Iterator, method `sliding` is not valid

It also generates warnings for valid links, such as:

Link to https://dotty.epfl.ch/docs/reference/changed-features/numeric-literals will return a 404 not found

@Dedelweiss
Copy link
Contributor Author

Dedelweiss commented Apr 26, 2023

Thank you very much for your feedback @Florian3k.
I have a quick question for the reference like @see isFunctionType. Are they supposed to be clickable or is it just a non-clickable reference?

@Florian3k
Copy link
Contributor

No, I don't think they're supposed to be clickable.

@Dedelweiss
Copy link
Contributor Author

No, I don't think they're supposed to be clickable.

Okay, so don't need to test them I think. I can just test for internal and external (When I will find a way) link.

@julienrf
Copy link
Contributor

julienrf commented May 1, 2023

I have a quick question for the reference like @see isFunctionType. Are they supposed to be clickable or is it just a non-clickable reference?

They are not supposed to be clickable. Only links formatted as such [[...]] are expected to be clickable.

I think scaladoc should only check internal links, not all the links. And there should be an option to exclude some paths. See how this is done in Laika, for instance.

@Dedelweiss
Copy link
Contributor Author

Hi, I looked at how Laika did it to disable the warnings and I noticed two different methods:

First method

A yaml in a folder, with each file and sub-folder having a warning validation disabled.

  • Take the path where a method is started
  • Verify if a validation.yaml is present in the folder or the previous folder with a line like:
    • noValidation: true
  • if yes, disable the warning.

Second method

A yaml with a folder list, all the files in each folder have warning validation disabled

  • Search for the generalValidation.yaml file in a specific folder (Like ressource)
  • Take the name of each folder in this file
  • During the validation @see, if the name of the folder of the file correspond to one of the name in the file, deactivate the warning

So I plan to do both methods, I think I can do a lot of it. But I'm stuck for the beginning, I don't know how to get the path of a file containing the @see in order to know in which folder it is located.
If someone has an idea, I am very interested. Thank you very much.

@julienrf
Copy link
Contributor

Hey @Dedelweiss, thank you very much for the investigation. I keep thinking that only links within double brackets should be checked, not the content of the @see directive. What do you think?

@Dedelweiss
Copy link
Contributor Author

Hey @Dedelweiss, thank you very much for the investigation. I keep thinking that only links within double brackets should be checked, not the content of the @see directive. What do you think?

Oh yes, I'm sorry, I did misread your previous comment. Indeed it would be better to target more generally the double brackets links in order to test a maximum of them. I'll correct that. Thanks a lot for the feedback.

@Dedelweiss
Copy link
Contributor Author

Dedelweiss commented May 12, 2023

Edited: Indeed, there is a link I can use but the other PR is about double brackets to the API. And this one is about the links in the comments of the codes. So these two RPs are still different.

Hey @julienrf, looking into it, I noticed that it came back to this PR#17099 I did earlier.
The problem is that this one is blocked because I'm waiting for a response on the related issue.
Thank you very much Julien.

@julienrf
Copy link
Contributor

@Dedelweiss do you mind elaborating on what is blocking you? I guess for now we can already check links to static pages and assets (which use the regular link and image markup: [foo](./foo.md) and ![bar.png](/assets/bar.png)), and links to the API documentation (which use the double bracket syntax [[com.example.Foo]]).

What do you think?

@Dedelweiss
Copy link
Contributor Author

Dedelweiss commented May 15, 2023

@Dedelweiss do you mind elaborating on what is blocking you?

I was more stuck on the other PR/Issue with the outcome and question in it. But you have cleared up my doubts very well, thank you very much @julienrf.

What do you think?

Indeed, I think it's interesting to do it like that, so I'll go with that. Thanks for the feedback.

- Add a check of valid url and internal link
- Add condition flag for the warning
@Dedelweiss Dedelweiss force-pushed the check_404 branch 3 times, most recently from 8dd0f8f to 8ddb272 Compare May 16, 2023 15:25
@Dedelweiss Dedelweiss changed the title Fix: Verify links in docs WIP: Verify links in docs May 17, 2023
@Dedelweiss Dedelweiss marked this pull request as draft May 17, 2023 12:32
@Dedelweiss
Copy link
Contributor Author

Dedelweiss commented May 22, 2023

Hello @julienrf,
I have a quick question, In your examples you put [foo](./foo.md) for a link to a static page from the API documentation. I know this kind of link work from a static page to a static page. However, in my testing, I was unable to link from the API documentation in the same way. I think this feature comes from the driForLink function and it might explain why it doesn't work in the API documentation.
So I don't know if it's just my link that's not good or if it's really that reason. I'm interested if you have an example that works or any information.

- Test only the assets and Api link
@julienrf
Copy link
Contributor

In my opinion, it is very important that “API links” (ie, links to symbols of the API usually written with the wikidoc syntax like [[scala.collection.Seq]]) work within the API documentation, and that “page links” (ie, links to pages of the static website like [foo](./foo.md)) work within the static pages.

Once that is done, it would be really great to support both types of links everywhere (so, including API links in the static pages and pages links in the API docs).

All that could be done in separate PRs (especially if the “nice to have” one requires significant development).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scaladoc: possibility to verify links in docs
3 participants