Skip to content

Add a parse recovery in array type syntax #81404

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
74 changes: 65 additions & 9 deletions compiler/rustc_parse/src/parser/ty.rs
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,27 @@ pub(super) enum RecoverReturnSign {
No,
}

/// Signals whether to recover from invalid separator in array syntax `[<ty>; <const>]` (the
/// semicolon)
///
/// We don't want this recovery when parsing pre-2018-edition trait method arguments, which don't
/// allow names or patterns in the arguments. Example:
///
/// ```
/// trait T {
/// fn foo([a, b]: [i32; 2]) {}
/// }
/// ```
///
/// Here `[a, b]` is parsed as a type and we need to fail without recovery/errors so that we can
/// check (after type parsing fails) whteher it looks like a pattern, and give a more helpful
/// message if it is. ui/issues/issue-50571.rs tests this.
#[derive(Copy, Clone, PartialEq)]
enum RecoverArrayType {
Yes,
No,
}

impl RecoverReturnSign {
/// [RecoverReturnSign::Yes] allows for recovering `fn foo() => u8` and `fn foo(): u8`,
/// [RecoverReturnSign::OnlyFatArrow] allows for recovering only `fn foo() => u8` (recovering
Expand Down Expand Up @@ -98,6 +119,7 @@ impl<'a> Parser<'a> {
AllowCVariadic::No,
RecoverQPath::Yes,
RecoverReturnSign::Yes,
RecoverArrayType::Yes,
)
}

Expand All @@ -110,6 +132,7 @@ impl<'a> Parser<'a> {
AllowCVariadic::Yes,
RecoverQPath::Yes,
RecoverReturnSign::Yes,
RecoverArrayType::No, // see type documentation
)
}

Expand All @@ -125,6 +148,7 @@ impl<'a> Parser<'a> {
AllowCVariadic::No,
RecoverQPath::Yes,
RecoverReturnSign::Yes,
RecoverArrayType::Yes,
)
}

Expand All @@ -135,6 +159,7 @@ impl<'a> Parser<'a> {
AllowCVariadic::Yes,
RecoverQPath::Yes,
RecoverReturnSign::OnlyFatArrow,
RecoverArrayType::Yes,
)
}

Expand All @@ -152,6 +177,7 @@ impl<'a> Parser<'a> {
AllowCVariadic::No,
recover_qpath,
recover_return_sign,
RecoverArrayType::Yes,
)?;
FnRetTy::Ty(ty)
} else if recover_return_sign.can_recover(&self.token.kind) {
Expand All @@ -171,6 +197,7 @@ impl<'a> Parser<'a> {
AllowCVariadic::No,
recover_qpath,
recover_return_sign,
RecoverArrayType::Yes,
)?;
FnRetTy::Ty(ty)
} else {
Expand All @@ -184,6 +211,7 @@ impl<'a> Parser<'a> {
allow_c_variadic: AllowCVariadic,
recover_qpath: RecoverQPath,
recover_return_sign: RecoverReturnSign,
recover_array_type: RecoverArrayType,
) -> PResult<'a, P<Ty>> {
let allow_qpath_recovery = recover_qpath == RecoverQPath::Yes;
maybe_recover_from_interpolated_ty_qpath!(self, allow_qpath_recovery);
Expand All @@ -199,7 +227,7 @@ impl<'a> Parser<'a> {
} else if self.eat(&token::BinOp(token::Star)) {
self.parse_ty_ptr()?
} else if self.eat(&token::OpenDelim(token::Bracket)) {
self.parse_array_or_slice_ty()?
self.parse_array_or_slice_ty(recover_array_type)?
} else if self.check(&token::BinOp(token::And)) || self.check(&token::AndAnd) {
// Reference
self.expect_and()?;
Expand Down Expand Up @@ -346,7 +374,10 @@ impl<'a> Parser<'a> {

/// Parses an array (`[TYPE; EXPR]`) or slice (`[TYPE]`) type.
/// The opening `[` bracket is already eaten.
fn parse_array_or_slice_ty(&mut self) -> PResult<'a, TyKind> {
fn parse_array_or_slice_ty(
&mut self,
recover_array_type: RecoverArrayType,
) -> PResult<'a, TyKind> {
let elt_ty = match self.parse_ty() {
Ok(ty) => ty,
Err(mut err)
Expand All @@ -360,13 +391,38 @@ impl<'a> Parser<'a> {
}
Err(err) => return Err(err),
};
let ty = if self.eat(&token::Semi) {
TyKind::Array(elt_ty, self.parse_anon_const_expr()?)
} else {
TyKind::Slice(elt_ty)
};
self.expect(&token::CloseDelim(token::Bracket))?;
Ok(ty)

let size_const =
if self.eat(&token::Semi) { Some(self.parse_anon_const_expr()?) } else { None };

match self.expect(&token::CloseDelim(token::Bracket)) {
Comment on lines +395 to +398
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that you could rewrite this as a single match on self.token.kind with four arms:

  • ;
  • , if following thing could be const (pretty sure there's a method for that
  • ]
  • _ case, where you call expect_any(&[CloseDelim(Bracket), Semi]) purely for the error.

At that point I think you can entirely avoid the snapshotting (but can still do if you find cases where it is worth it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of using a 4-way match here. I just implemented this and pushed.

I'm now using struct_span_errs to generate the errors. Good: The error message is now improved! When the recovery successful I now emit "expected ;, found ," with a help, instead of "expected ; or ], found ,". Bad: the error message "expected ... found ..." will probably diverge from the one expected_one_of_not_found in time. I could call expected_one_of_not_found, but I'd have to add an unreachable()! to Ok case of the return value, which I'm not a fan of..

I think we could move the error string generation of expected_one_of_not_found and reuse it.

, if following thing could be const (pretty sure there's a method for that

I looked for this function but couldn't find it. Assuming this function is similar to is_certainly_not_a_block, I'm also not a fan of this idea. The problem is these functions are usually overly conservative and break very easily, as shown in #82051.

It's not clear to me what the problem is with recovering as currently implemented. If it's simply confusing code or something like that, perhaps we can implement a helper function to encapsulate this pattern (save the parser state, restore parser state on error, continue on success).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could call expected_one_of_not_found, but I'd have to add an unreachable()! to Ok case of the return value, which I'm not a fan of..

I think doing that would be reasonable to do.

Ok(_) => Ok(match size_const {
Some(size_const) => TyKind::Array(elt_ty, size_const),
None => TyKind::Slice(elt_ty),
}),
Err(mut bracket_err)
if recover_array_type == RecoverArrayType::Yes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any case where we wouldn't want to do the lookahead you're doing here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not doing lookahead here, do you mean recovery? See the RecoverArrayType documentation for why we don't want to try to recover always. Also documented in the commit message. In short, it makes some error messages in pre-2018 editions worse.

I think that may be fine as we probably don't have a lot of users today that actively write pre-2018 code (and I'd expect those people to use an older version of rustc, without this patch). If you agree that this is fine, then the RecoverArrayType type and extra parameter can be removed.

&& self.token.kind == token::Comma
&& size_const.is_none() =>
{
// Possibly used ',' instead of ';' in slice type syntax, try to recover
self.bump();
let snapshot = self.clone();
match self.parse_anon_const_expr() {
Ok(array_size) => {
bracket_err.emit();
self.expect(&token::CloseDelim(token::Bracket))?;
Ok(TyKind::Array(elt_ty, array_size))
}
Err(mut const_expr_err) => {
const_expr_err.cancel();
*self = snapshot;
Err(bracket_err)
}
}
}
Err(bracket_err) => Err(bracket_err),
}
}

fn parse_borrowed_pointee(&mut self) -> PResult<'a, TyKind> {
Expand Down
3 changes: 3 additions & 0 deletions src/test/ui/parser/issue-81097.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
fn main() {
drop::<[(), 0]>([]); //~ ERROR: expected one of `;` or `]`, found `,`
}
8 changes: 8 additions & 0 deletions src/test/ui/parser/issue-81097.stderr
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
error: expected one of `;` or `]`, found `,`
--> $DIR/issue-81097.rs:2:15
|
LL | drop::<[(), 0]>([]);
| ^ expected one of `;` or `]`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the nightly release we currently also get the ^ expected one of ';' or ']' error, the first error we currently get is misleading though. But I would argue that a little thought easily solves this initial confusion in all cases. I don't want to sound too negative here and the rust community appreciates your effort, but I feel that the overhead of reading your additional code in the parser is not worth the benefit of having the misleading first error message removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that the overhead of reading your additional code in the parser is not worth the benefit of having the misleading first error message removed.

I'd be happy to implement this differently if anyone has any suggestions. I'm quite new here and I don't know the code in detail.

Regarding the extra code, I'd leave that decision to the active maintainers/developers of the parser.

In compiler development this kind of tradeoff exists all the time: you add code for user convenience but it makes maintainers' job more difficult. I don't know how rustc devs decides on what to do in these cases but I'm happy to close the PR if they think this code is not worth it. (I guess in that case the issue should be closed too?)

I made the mistake of using , instead of ; in array types in the past many times, but for me the second message is clear enough and the first one doesn't bother me too much. So I don't feel strongly either way.


error: aborting due to previous error