Skip to content

Try either when compare an intersection against an applied type in GADT mode #15175

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

Linyxus
Copy link
Contributor

@Linyxus Linyxus commented May 12, 2022

This PR fixes #11545 by trying either before approximating LHS with tryBaseType when comparing an intersection type against an applied type in GADT inference mode.

The unsoundness of #11545 comes from the following steps taken by TypeComparer:

// the following comparison happens when typing the pattern matching
==> isSubType Inv[A] & Y <:< Inv[String]?
  ==> isSubType Inv[A] <:< Inv[String] (left is approximated)?
    ==> isSubType String <:< A?  // approx is reset here

In the above trace, when comparing an intersection type Inv[A] & Y against Inv[String], the type comparer approximate LHS to Inv[A] directly and compare Inv[A] <:< Inv[String]. Since the approximation state gets reset when comparing the arguments of applied types, this comparison results in unsound GADT constraints.

In this PR, when comparing S ∧ T <:< F[A], instead of trying to approximate LHS type to the base type F, we will try both S <:< F[A] and T <:< F[A] and merge GADT constraints from the two branches. This fixes the unsoundness brought by approximating the intersection (and forgetting about the approximation afterwards as shown in the trace) by trying to inference GADT constraints in a more precise way.

@abgruszecki

@prolativ prolativ requested a review from abgruszecki May 16, 2022 13:23
@dwijnand dwijnand mentioned this pull request Jun 9, 2022
@dwijnand
Copy link
Member

dwijnand commented Jun 9, 2022

Nice, @Linyxus! I wonder what use cases require that approximation state reset, because it seems problematic to the dual use of TypeComparer and preserving GADT soundness...

@LPTK WDYT of this? Is this too narrowly snapping up intersection types before tryBaseType makes mistakes? Or are intersection types indeed the only types that need this?

@LPTK
Copy link
Contributor

LPTK commented Jun 10, 2022

I agree with @dwijnand that the whole idea of "resetting approximations" feels very fishy. Where else is such resetting potentially done while in GADT mode? These would likely be soundness holes too.

Regarding the PR, I don't like the looks of the fix. Looking at the diff, you're only adding a || tryGadtAnd1 clause to the comparison algorithm in order to change the branch being taken in this specific case. But will that branch be taken reliably in all problematic situations? Taking the branch relies on tp1 syntactically being an AndType. How do we know this is sufficient? If anything, this should be thoroughly explained in a comment, because it may no longer hold after the next refactoring of constraint solving (if it currently does at all).

IMO it would be much more logical to fix the unsoundness in tryBaseType as done in #12087. This could very well be done in addition to what is already proposed in this PR.

@@ -1198,6 +1198,14 @@ class TypeComparer(@constructorOnly initctx: Context) extends ConstraintHandling
case _ => false
} && { GADTused = true; true }

def tryGadtAnd1: Boolean =
ctx.mode.is(Mode.GadtConstraintInference) && {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ctx.mode.is(Mode.GadtConstraintInference) && {
necessaryConstraintsOnly && {

Perhaps is better?

@dwijnand
Copy link
Member

Thanks, @LPTK - I see what you're saying. After this use of tryBaseType the one other call is behind a typeParams.isEmpty check, so I think it might mean it's safe? Either way, I think it makes sense to bring tryGadtAnd1 in as a leading check of tryBaseType.

Also, isn't the fall-through to tryBaseType if tryGadtAnd1 fails unsafe for GADT reasoning? If both recur(tp11, tp2) and recur(tp12, tp2) fail, we're going to call tryBaseType.

@Linyxus
Copy link
Contributor Author

Linyxus commented Jun 10, 2022

Regarding the way to fix this issue, agree that based on the usage of tryBaseType, only modifying this callsite should be safe, but I found that Alex's original fix (#12087) may indeed be a better way to fix the problem. There are two reasons:

  1. The two PRs similarly jump to a different logic when the LHS is an intersection and we are in GADT inference mode. The key difference is that when the leading logic fails, we still call tryBaseType in this PR, but will not run the original logic of tryBaseType in Fix i11545 #12087 anymore. The logic in Fix i11545 #12087 is more reasonable because when neither tp11 <:< tp2 nor tp12 <:< tp2 succeed (tp1 is tp11 & tp12), try tryBaseType should not succeed too. Because the fact that both tp11 <:< tp2 and tp12 <:< tp2 fail means tp11 & tp12 <:< tp2 fails. tryBaseType will find a base type base being the supertype of (or equivalent to) tp11 & tp12, so base <:< tp2 will not succeed too.

  2. Fix i11545 #12087 uses fourthTry and our PR uses either(recur(tp11, tp2), recur(tp12, tp2)) directly. It seems that in fourthTry we have the more comprehensive logic to handle the intersection case.

Maybe we should merge #12087 instead of this one?

@Linyxus
Copy link
Contributor Author

Linyxus commented Jun 10, 2022

Regarding the soundness holes caused by resetting approx state: yes, we have more soundness holes caused by this. Each time isSubType(S, T) (instead of recur(S, T)) is called inside isSubType, the approx state is reset, and it will be a potential soundness hole. For example, I find another callsite of isSubType here, and we can make a counter-example showing the unsoundness brought by this:

enum SUB[-A, +B]:
  case Refl[C]() extends SUB[C, C]

trait Tag { type T }

def foo[A, B, X <: Tag{type T <: A}](e: SUB[X, Tag{type T <: B}], x: A): B = e match {
  case SUB.Refl() =>
    // unsound GADT constr because of approx state resetting: A <: B
    x
}

def bad(x: Int): String =
  foo[Int, String, Tag{type T = Nothing}](SUB.Refl(), x)  // cast Int to String

In this example, we are trying to extract necessary constraint from X <: Tag{type T <: B}, where X is known to be a subtype of Tag{type T <: A}. During this we will try upcasting LHS and compare Tag{type T <: A} <: Tag{type T <: B} with LHS approximated. Since when comparing the refinement info the approx state is reset, we derive the unsound constraint A <: B.

@Linyxus
Copy link
Contributor Author

Linyxus commented Jun 10, 2022

I am not sure how the reset of approx state will interact with the type comparison process. But it seems that the situation is very tricky (@abgruszecki and I have discussed this before): SR sometimes relies on the reset of approx state to derive the desired constraint. So solutions like freezing GADT constraints when we reset approx state will not work. Consider the following example:

enum SUB[-A, +B]:
  case Refl[X]() extends SUB[X, X]

trait E[+X]
case class L[A]() extends E[A]

def foo[A, B](e: SUB[L[A], E[B]], x: A): B = e match {
  case SUB.Refl() =>
    x
}

Here we are trying to extract necessary constraint from L[A] <:< E[B], where L is a subclass of E. Ideally, we should derive the constraint A <: B. The TypeComparer will call tryBaseType when checking the subtype relationship L[A] <:< E[B], and approximate LHS to E[A]. The approx state (approx low) is set when we compare E[A] <:< E[B] in the next step. If we do not reset approx state when comparing the applied arguments A <:< B, we will not be able to derive the desired constraint.

@Linyxus
Copy link
Contributor Author

Linyxus commented Jun 10, 2022

It seems that the approx state is not only unstable (may be reset), but also unprecise for the use of GADT reasoning (it works fine for type inference). For example, E[A] <: E[B] will be the only (necessary) way for L[A] <:< B[A] to hold. So upcasting E[A] to L[A] should not stop us from deriving the constraint. But the approx state will still be set in this case.

@dwijnand
Copy link
Member

SR sometimes relies on the reset of approx state to derive the desired constraint

SR?

Great thoughts. I think we should experiment with the impact of disabling the reset while inferring GADT constraints (or, more generally, necessaryConstraintsOnly?). It's the only way to be aware of possible unsoundness in GADT reasoning. Correct me if I'm wrong, but I believe we significantly bias to soundness rather than completeness for type inference/checking, so we should do the same for GADT reasoning.

I also think we should make it harder and/or more obvious that the approximation is reset: isSubType(S, T) vs recur(S, T) seems too subtle to me.

@abgruszecki
Copy link
Contributor

SR = Subtyping Reconstruction.

recur indicates that we're recursing on a subcomponent of the current type, while isSubType is recursing on a "new" type. I was planning to rewrite TypeComparer so that calling either would result in a frozen GADT constraint, unless a boolean flag was explicitly passed to the call. That way, we could "whitelist" recursion points which we know are sound when reconstructing subtyping / inferring GADT constraint.

Looking at the two PRs once again, I think #12087 might indeed be just better than this approach. The way it's implemented, we go from tryBaseType into fourthTry, which has specialised code for handling AndType. I suggested at one point to @Linyxus that the changes from this PR should be moved to tryBaseType, but I think simply going to fourthTry like I did is still preferrable.

@LPTK: the entire TypeComparer is written in an extremely order-dependent way. We can always rely that aliases etc. are handled very early, and we can rely on the type shapes being handled in a very particular way. Changing that would be fundamentally changing TypeComparer.

@ckipp01
Copy link
Member

ckipp01 commented May 10, 2023

Hello! Since it's been a while I wanted to check in on this @Linyxus. Do you plan on returning here, or are you waiting for more input, etc?

@Linyxus
Copy link
Contributor Author

Linyxus commented May 11, 2023

Hello @ckipp01! We will wait for a refactorization of GADT (#15928) before proceeding this feature. I will close this for now and go back to it after the refactorization is done.

@Linyxus Linyxus closed this May 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Overconstrained GADT bounds in presence of bad bounds
5 participants