Coordination for structured error messages #24
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,237 @@ | ||
## Introduction | ||
|
||
This proposal seeks technical coordination from the Haskell Foundation | ||
for improving the interop story between GHC and HLS. Once that is | ||
done well, we might imagine taking our gained knowledge to improve | ||
other interop around error messages. | ||
While much of the work may be doable by volunteers, the HF would play | ||
a role in harnessing and corralling the volunteers, as well as coordinating | ||
common APIs between tools that are both easy to implement and easy to use. | ||
The HF may also be instrumental in managing an error code namespace, shared | ||
among all tooling central to Haskell. | ||
|
||
## Background | ||
|
||
Currently, there is no discipline around error messages. This lack of | ||
structure manifests itself in a number of ways: | ||
|
||
- HLS must parse the error messages that GHC | ||
produces. This is fragile, wasteful, and hard to keep up-to-date. For | ||
example, HLS looks to see if a GHC extension name appears in an error | ||
message, in order to allow the user to automatically enable it via a pragma. | ||
But since `KindSignatures` is a substring of `StandaloneKindSignatures`, any | ||
message mentioning the latter causes HLS to suggest both enabling `KindSignatures` | ||
and `StandaloneKindSignatures` -- even though only `StandaloneKindSignatures` | ||
would actually work. While there is a workaround here, we can see that | ||
better communication between GHC and HLS would avoid this class of problem. | ||
|
||
- Many GHC error messages refer to advanced concepts. This is unavoidable, | ||
as Haskell has advanced features. However, telling a user that their rigid | ||
type variable does not unify with a type because there is a kind mismatch | ||
is utterly bewildering to Haskell learners. Applying structure to error messages | ||
would allow for the creation of an error-message index that could explain | ||
what the messages mean -- and how to fix the errors. | ||
|
||
- Given that tools are increasingly working with one another and invoking | ||
one another, it can be hard to know who exactly is producing an error | ||
message. In one recent example that happened to me, I was trying to get | ||
GHC to work with a GHC plugin, and I got a baffling error. It took me | ||
more than an hour, if I recall, to discover that the problem was with | ||
Haddock (I forget the details) and that I just needed to pass `--disable-documentation`. | ||
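The substring pitfall from the first bullet above can be reproduced in a few lines. This is a minimal sketch, not HLS's actual code: the function `suggestBySubstring`, the two-element `knownExtensions` list, and the wording of `exampleMessage` are all assumptions made for illustration.

```haskell
import Data.List (isInfixOf)

-- Two of the many extension names; enough to reproduce the overlap.
knownExtensions :: [String]
knownExtensions = ["KindSignatures", "StandaloneKindSignatures"]

-- Naive approach (hypothetical): suggest every extension whose name
-- occurs anywhere in the rendered message text.
suggestBySubstring :: String -> [String]
suggestBySubstring msg = filter (`isInfixOf` msg) knownExtensions

-- Illustrative message text; not GHC's exact wording.
exampleMessage :: String
exampleMessage =
  "Illegal standalone kind signature; perhaps you intended to use StandaloneKindSignatures"
```

Because `"KindSignatures"` is a substring of `"StandaloneKindSignatures"`, `suggestBySubstring exampleMessage` returns both names, although only the second would help. Any text-matching fix has to encode knowledge about which names overlap, which is the kind of upkeep structured data would avoid.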
|
||
There is already work in this area, within GHC. For the past few years, GHC | ||
has slowly been converting its error messages to be encoded in data constructors, | ||
not just as (fancy) strings. [This wiki page](https://gitlab.haskell.org/ghc/ghc/-/wikis/Errors-as-(structured)-values) | ||
and [this blog post](https://well-typed.com/blog/2021/08/the-new-ghc-diagnostic-infrastructure/) describe | ||
roughly the state of play. However, this work currently lacks a very important | ||
ingredient: clients. That is, if GHC is exporting new, fancy datatypes encoding | ||
its error messages, are these datatypes of use to HLS? We've reached out | ||
to potential clients for feedback, but the best response we've gotten is something | ||
along the lines of "sure, looks good". That's encouraging, but I would want to have | ||
a little more coordination to make sure that the interface GHC is building is one | ||
that can be easily consumed. The HF could help here by coordinating this | ||
communication between projects. | ||
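To make the question "are these datatypes of use to HLS?" concrete, here is a rough sketch of what a client of structured errors might look like. The types `Diagnostic` and `CodeAction` and their constructors are hypothetical and far simpler than anything in GHC; they only illustrate the shape of the interaction.

```haskell
-- Hypothetical diagnostic type for illustration; GHC's real constructors differ.
data Diagnostic
  = MissingExtension String   -- name of the extension that would fix the error
  | UnusedImport String       -- name of the redundant module
  | Unstructured String       -- fallback: only rendered text is available
  deriving (Eq, Show)

-- Hypothetical editor-side repair.
data CodeAction
  = AddLanguagePragma String
  | DeleteImport String
  deriving (Eq, Show)

-- A client pattern-matches on constructors instead of scanning text.
codeActions :: Diagnostic -> [CodeAction]
codeActions (MissingExtension ext) = [AddLanguagePragma ext]
codeActions (UnusedImport modName) = [DeleteImport modName]
codeActions (Unstructured _)       = []

-- The human-readable text is derived from the structure, not the other way round.
render :: Diagnostic -> String
render (MissingExtension ext) = "This code requires the " ++ ext ++ " extension."
render (UnusedImport modName) = "The import of " ++ modName ++ " is redundant."
render (Unstructured text)    = text
```

In this sketch the wording produced by `render` can change freely without breaking `codeActions`, which is the property a string-parsing client lacks. Whether GHC's actual datatypes are this convenient to consume is exactly what early feedback from clients would establish.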
|
||
Both the website describing error messages and the identification of which | ||
tool generated an error are new in this proposal. | ||
|
||
## Motivation | ||
|
||
- It is better for tools to collaborate by passing structured data than | ||
by sending strings back and forth. Structured error messages will thus | ||
accelerate the development of powerful editor integrations and other | ||
code analysis tools. | ||
|
||
- A website describing error messages would become a standard | ||
reference in the Haskell community and flatten the learning curve for | ||
new Haskellers. | ||
|
||
While the two main goals of this proposal (conversion of all error messages | ||
to use datatypes; assigning error codes / creating a website) could be | ||
considered separately, I think they make sense together in this proposal. | ||
The second goal depends on the first, and it seems likely that many of | ||
the same potential volunteers will be interested in both. That said, it would | ||
be fine for, e.g. the HFTT to accept only one part of this proposal without | ||
the other, or simply not to commit resources until a mid-way review were conducted. | ||
|
||
## Goals | ||
|
||
1. When compiling a program, HLS queries GHC for error messages and receives | ||
structured errors, not strings. HLS can then use the information in these structures | ||
to offer repairs to the user or other options. | ||
|
||
1. All GHC error messages include a code. These codes can be searched for on a website | ||
that explains the error message, with examples of what causes it and how the error | ||
might be fixed. | ||
|
||
1. Stretch goal: Building on the success of the HLS/GHC integration around error | ||
messages, other central tooling adopts a similar approach. This would, for example, | ||
Review comment: The HLS/GHC integration is different in kind to, say, HLS's integration with other tools. Many of the most difficult ones (mostly the build tools) are invoked as binaries, which makes passing structured errors more difficult. Of course, this could be overcome (many tools these days have a |
||
enable the possibility that HLS can report more informative configuration errors | ||
to users, or even to repair some of the problems itself. | ||
|
||
1. Stretch goal: The HF would establish a global namespace for Haskell-tool | ||
error message codes, where each tool is assigned (say) a prefix it should use | ||
for any error codes. The HF would then encourage tools to use these prefixes | ||
in error codes when presenting messages. | ||
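The prefix-based namespace in the last stretch goal could take many forms. The following is one possible sketch, assuming a code is an upper-case tool prefix, a hyphen, and a zero-padded non-negative number. The format, the `ErrorCode` type, and the prefix `GHC` used in the usage note are assumptions for illustration, not a scheme this proposal fixes.

```haskell
import Data.Char (isDigit, isUpper)

-- Hypothetical error code: a tool prefix plus a number within that tool's range.
data ErrorCode = ErrorCode
  { codeTool   :: String
  , codeNumber :: Int     -- assumed non-negative
  } deriving (Eq, Show)

-- Render as PREFIX-NNNNN, padding the number to at least five digits.
renderCode :: ErrorCode -> String
renderCode (ErrorCode tool n) = tool ++ "-" ++ padding ++ digits
  where
    digits  = show n
    padding = replicate (5 - length digits) '0'

-- Parse the same format back; anything else is rejected.
parseCode :: String -> Maybe ErrorCode
parseCode s = case break (== '-') s of
  (tool, '-' : digits)
    | not (null tool), all isUpper tool
    , not (null digits), all isDigit digits
    -> Just (ErrorCode tool (read digits))
  _ -> Nothing
```

With these definitions, `renderCode (ErrorCode "GHC" 42)` yields `"GHC-00042"`, and `parseCode` inverts it. A flat prefix like this matches the proposal's expectation that a recursive hierarchy is likely unnecessary, while still letting a reader tell which tool produced a message.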
|
||
## What the Haskell Foundation Would Do | ||
|
||
This section is meant to be suggestive of the concrete activity that would support | ||
this proposal. It is possible the HFTT or other HF people would have an alternative | ||
approach, which is fine, too. | ||
|
||
1. Devote the time of an HF person, hereafter called the Coordinator, to stay on top of | ||
this project. The Coordinator could be an HF employee, an in-kind donation of labor, | ||
or perhaps a dedicated and trustworthy volunteer. | ||
I think it would be reasonable to timebox this work at 5 hours / week from | ||
the Coordinator. | ||
|
||
1. A key task of the Coordinator is to source volunteers to help with this initiative. | ||
Accordingly, the Coordinator would be responsible for publicity around this plan, as well | ||
as for thinking creatively about ways to attract volunteers. For example, it might be a fun | ||
idea to plan a virtual hackathon with potential volunteers or to reward contributions | ||
with t-shirts. Volunteer management is a primary responsibility of the Coordinator; it is | ||
assumed that the Coordinator is managing volunteers in parallel with all other tasks here. | ||
|
||
1. The Coordinator would start by getting an exact handle on the state of structured | ||
error messages in GHC, by working with current contributors (e.g. Alfredo di Napoli, Sam | ||
Derbyshire, Richard Eisenberg) and looking at the GHC source code. The Coordinator | ||
would then identify an area within GHC that would be an appropriate next step to add | ||
similar structured error messages and source volunteers to contribute to that area. | ||
The Coordinator would help to shepherd any GHC MRs that would arise as part of this work. | ||
|
||
1. In parallel with the previous item, the Coordinator would work with representatives | ||
from the HLS team to figure out how HLS might take advantage of the structured error messages | ||
GHC already has. Even if HLS is not ready to merge yet, the Coordinator and HLS would | ||
Review comment: I suspect that using the new structured errors will be straightforward. What will probably be non-straightforward and unpleasant is managing the compatibility between versions of GHC that don't have structured errors and those that do. Could be a good GSOC project.
Review comment: Yes, that's a good point. But maybe some design work within GHC will avoid this problem? I'm not sure. And we need to make sure that the errors are delivered in the right way. Maybe part of the reason we didn't get much feedback in the past is that it's obviously going to be easy. |
||
work out a way to build a proof-of-concept based on the structured errors GHC already | ||
has. This would validate the current API and increase the confidence in building on it. | ||
|
||
1. Having established that the API is usable, the Coordinator would systematically work | ||
through remaining error messages in GHC, directing volunteers to convert them to the | ||
structured format. | ||
|
||
1. As capacity is available, the Coordinator would also organize (or encourage a volunteer | ||
to organize) a website where error messages could be explained. This might be a wiki, | ||
or a git repository, or something exportable to e.g. readthedocs.io. This might even be | ||
incorporated into the user manual. Figuring out a good | ||
format would be the responsibility of the Coordinator, possibly by contacting stakeholders | ||
with a survey or looking at other language communities. | ||
|
||
1. The Coordinator would devise a scheme for assigning error codes to messages. These might | ||
be terse, inscrutable alphanumeric identifiers, or perhaps they would be human-readable. | ||
The namespace would include the possibility of covering tools beyond just GHC, though | ||
recursive hierarchy seems likely unnecessary. With the help of volunteers, the Coordinator | ||
would add these error codes into the error-message API. | ||
|
||
1. The Coordinator would continue to encourage volunteers to document error messages on | ||
the error-message website, learning from early successes and failures. | ||
|
||
1. If the project is going well and with community support, the Coordinator could look at | ||
extending this idea to other tools. For example, perhaps Cabal or Stack could start to | ||
deliver similar structured error messages -- with buy-in from those maintainers, of course. | ||
|
||
## People | ||
|
||
- **Performers:** The Coordinator, someone who will have time dedicated to this project. This person | ||
would ideally be an HF employee or part of the portfolio of an in-kind donation of labor. | ||
|
||
- **Reviewers:** The GHC team would review changes to GHC, while the HLS team would review changes to HLS. | ||
The GHC and HLS teams would work together, coordinated by the Coordinator, to make an API that is useful | ||
to both. Community volunteers would review the text of the website describing error messages. The Coordinator | ||
would review the uptake of any website by examining analytics. | ||
|
||
- **Stakeholders:** This would affect anyone who uses GHC, as the error codes would appear there. Key stakeholders | ||
include the GHC and HLS maintainers, as well as educators, who would have access to Haskell learners who | ||
would benefit from the results of this work. | ||
|
||
## Resources | ||
|
||
- The Coordinator would need to devote 5 hours / week. | ||
- There would be a set of volunteers who would do much of the labor. If the volunteer pool runs low, the Coordinator | ||
can do some of the technical work, as well. | ||
- The GHC and HLS teams would have to devote some of their time to help support this initiative. | ||
|
||
## Timeline | ||
|
||
The timeline is highly dependent on the availability of volunteers to do the work. It thus seems | ||
more sensible to timebox this effort at 5 hours of Coordination / week than to set a deadline for | ||
completion. It would be sensible to review progress after 3 months to decide whether this project | ||
is producing benefits (or is likely to soon). | ||
|
||
## Lifecycle | ||
|
||
I don't think this really applies here. There would be a warm-up period at the beginning where the goal | ||
is to source volunteers, but afterwards, it's all about keeping people moving forwards. | ||
|
||
## Deliverables | ||
|
||
1. A release of GHC where all of its error messages are structured. | ||
Review comment: This is happening incrementally, right? Which version of GHC do they start to appear in?
Review comment: 9.4 will be the first. |
||
|
||
1. A release of HLS which consumes the structured error messages of GHC. | ||
|
||
1. A website explaining at least 20 different errors produced by GHC. (More is better!) | ||
I think we should set a modest goal of having 100 unique visitors to this website | ||
over the course of a month. | ||
|
||
1. A blog post (ideally written by the Coordinator) describing this process, as a way | ||
of creating publicity for the HF. | ||
|
||
## Outcomes | ||
|
||
- With the structured interface to errors, tools such as HLS will be better equipped to | ||
offer more power to users to manipulate and reason about code. | ||
|
||
- The error-message cataloguing website will help Haskell learners (and, likely, some | ||
old hands) understand error messages better. | ||
|
||
## Risks | ||
|
||
- It is currently unclear who would best serve as the Coordinator, which is why this | ||
proposal leaves this role abstract. Accordingly, a risk is that there is no one suitable. | ||
However, I still believe this proposal is worth considering (and perhaps approving) in this | ||
state: it would then serve as a concrete task the HF could have when an appropriate | ||
Coordinator arises. In the meantime, it could be used as an idea to show potential sponsors | ||
who might want to know what initiatives the HF is considering or to use as part of a motivation | ||
for expanding the HF employment base. | ||
|
||
- Much of the work in this proposal is designed to be done by volunteers, working in parallel. | ||
It is possible we will not find the right volunteers for this work. It is then possible | ||
for the Coordinator to do more work themselves. In any case, trying to source volunteers for | ||
this work could be an important learning experience in the lifetime of the HF, and it would inform | ||
the design of future initiatives. | ||
|
||
- One risk is that the API being built around error messages is not useful to consumers. | ||
This risk is intended to be mitigated by an early consultation with HLS. | ||
|
||
- It is possible that the structured error messages will provide no opportunity for | ||
improvement over the status quo. This is a risk the HFTT should consider. It might also | ||
be worthwhile to reach out to HLS now to see what they think. | ||
Review comment: Oh no, it will be so much better than the awful error message parsing that's going on now. It's buggy, painful, unnecessarily complicated code.
Review comment: I agree. But having not heard voices from HLS say this loudly, I didn't want to be over-presumptuous. |
||
|
||
- It is possible that no one will find their way to the error-index website, or that | ||
Review comment: Another reason why the User's Guide is a good location. If we don't think users find it easy to find the User's Guide, then fixing that is great anyway! |
||
the format chosen for the site will not resonate with users. The Coordinator would ideally | ||
reach out to users to understand their needs better as the website is being designed | ||
in order to mitigate this risk. | ||
|
||
- It is possible that the extra structure will provide an obstacle to evolution within | ||
GHC and slow development down there. I do not think this is likely, but it is conceivable. |
Review comment:
GHC has a nice website of documentation for users: the User's Guide. It's versioned, built nicely from source, and the source lives alongside GHC where the error definitions go. For that reason I think it would be a good proof-of-concept/low-hanging-fruit to start with a directory of GHC's error messages in the User's Guide. If that went well, we could think about how to re-present the same content in another place if that seemed useful.
Review comment:
There are pluses and minuses to using the user manual. (I've added a note later in the document that the manual might be a good place.)
Pluses:
Minuses:
Review comment:
I would have thought the source of truth would belong in a code repository -- either GHC itself or a dedicated repo. But of course it would be ideal if we could run a tool to extract docs for inclusion in the GHC manual, which could be incorporated into the GHC manual build pipeline.
Would this not answer to everyone's needs?
Review comment:
I think we may be agreeing. GHC's manual is in its code repository -- and in a format that can be extracted to the GHC manual website -- so that conforms to your suggestion. Or a separate website would have its content generated from some source repository ("a dedicated repo"), so that also conforms. But I think maybe there is something deeper you're suggesting, but I've missed. Have I captured your intent here?
Review comment:
I thought as much -- looks like we are all in violent agreement.