Try `either` when compare an intersection against an applied type in GADT mode #15175

Linyxus · 2022-05-12T17:21:16Z

This PR fixes #11545 by trying either before approximating LHS with tryBaseType when comparing an intersection type against an applied type in GADT inference mode.

The unsoundness of #11545 comes from the following steps taken by TypeComparer:

// the following comparison happens when typing the pattern matching
==> isSubType Inv[A] & Y <:< Inv[String]?
  ==> isSubType Inv[A] <:< Inv[String] (left is approximated)?
    ==> isSubType String <:< A?  // approx is reset here

In the above trace, when comparing an intersection type Inv[A] & Y against Inv[String], the type comparer approximate LHS to Inv[A] directly and compare Inv[A] <:< Inv[String]. Since the approximation state gets reset when comparing the arguments of applied types, this comparison results in unsound GADT constraints.

In this PR, when comparing S ∧ T <:< F[A], instead of trying to approximate LHS type to the base type F, we will try both S <:< F[A] and T <:< F[A] and merge GADT constraints from the two branches. This fixes the unsoundness brought by approximating the intersection (and forgetting about the approximation afterwards as shown in the trace) by trying to inference GADT constraints in a more precise way.

@abgruszecki

…ADT mode

dwijnand · 2022-06-09T22:27:28Z

Nice, @Linyxus! I wonder what use cases require that approximation state reset, because it seems problematic to the dual use of TypeComparer and preserving GADT soundness...

@LPTK WDYT of this? Is this too narrowly snapping up intersection types before tryBaseType makes mistakes? Or are intersection types indeed the only types that need this?

LPTK · 2022-06-10T04:54:27Z

I agree with @dwijnand that the whole idea of "resetting approximations" feels very fishy. Where else is such resetting potentially done while in GADT mode? These would likely be soundness holes too.

Regarding the PR, I don't like the looks of the fix. Looking at the diff, you're only adding a || tryGadtAnd1 clause to the comparison algorithm in order to change the branch being taken in this specific case. But will that branch be taken reliably in all problematic situations? Taking the branch relies on tp1 syntactically being an AndType. How do we know this is sufficient? If anything, this should be thoroughly explained in a comment, because it may no longer hold after the next refactoring of constraint solving (if it currently does at all).

IMO it would be much more logical to fix the unsoundness in tryBaseType as done in #12087. This could very well be done in addition to what is already proposed in this PR.

dwijnand · 2022-06-10T09:11:03Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

@@ -1198,6 +1198,14 @@ class TypeComparer(@constructorOnly initctx: Context) extends ConstraintHandling
            case _ => false
        } && { GADTused = true; true }

+      def tryGadtAnd1: Boolean =
+        ctx.mode.is(Mode.GadtConstraintInference) && {


Suggested change

ctx.mode.is(Mode.GadtConstraintInference) && {

necessaryConstraintsOnly && {

Perhaps is better?

dwijnand · 2022-06-10T09:21:06Z

Thanks, @LPTK - I see what you're saying. After this use of tryBaseType the one other call is behind a typeParams.isEmpty check, so I think it might mean it's safe? Either way, I think it makes sense to bring tryGadtAnd1 in as a leading check of tryBaseType.

Also, isn't the fall-through to tryBaseType if tryGadtAnd1 fails unsafe for GADT reasoning? If both recur(tp11, tp2) and recur(tp12, tp2) fail, we're going to call tryBaseType.

Linyxus · 2022-06-10T18:05:30Z

Regarding the way to fix this issue, agree that based on the usage of tryBaseType, only modifying this callsite should be safe, but I found that Alex's original fix (#12087) may indeed be a better way to fix the problem. There are two reasons:

The two PRs similarly jump to a different logic when the LHS is an intersection and we are in GADT inference mode. The key difference is that when the leading logic fails, we still call tryBaseType in this PR, but will not run the original logic of tryBaseType in Fix i11545 #12087 anymore. The logic in Fix i11545 #12087 is more reasonable because when neither tp11 <:< tp2 nor tp12 <:< tp2 succeed (tp1 is tp11 & tp12), try tryBaseType should not succeed too. Because the fact that both tp11 <:< tp2 and tp12 <:< tp2 fail means tp11 & tp12 <:< tp2 fails. tryBaseType will find a base type base being the supertype of (or equivalent to) tp11 & tp12, so base <:< tp2 will not succeed too.
Fix i11545 #12087 uses fourthTry and our PR uses either(recur(tp11, tp2), recur(tp12, tp2)) directly. It seems that in fourthTry we have the more comprehensive logic to handle the intersection case.

Maybe we should merge #12087 instead of this one?

Linyxus · 2022-06-10T18:17:34Z

Regarding the soundness holes caused by resetting approx state: yes, we have more soundness holes caused by this. Each time isSubType(S, T) (instead of recur(S, T)) is called inside isSubType, the approx state is reset, and it will be a potential soundness hole. For example, I find another callsite of isSubType here, and we can make a counter-example showing the unsoundness brought by this:

enum SUB[-A, +B]:
  case Refl[C]() extends SUB[C, C]

trait Tag { type T }

def foo[A, B, X <: Tag{type T <: A}](e: SUB[X, Tag{type T <: B}], x: A): B = e match {
  case SUB.Refl() =>
    // unsound GADT constr because of approx state resetting: A <: B
    x
}

def bad(x: Int): String =
  foo[Int, String, Tag{type T = Nothing}](SUB.Refl(), x)  // cast Int to String

In this example, we are trying to extract necessary constraint from X <: Tag{type T <: B}, where X is known to be a subtype of Tag{type T <: A}. During this we will try upcasting LHS and compare Tag{type T <: A} <: Tag{type T <: B} with LHS approximated. Since when comparing the refinement info the approx state is reset, we derive the unsound constraint A <: B.

Linyxus · 2022-06-10T18:33:22Z

I am not sure how the reset of approx state will interact with the type comparison process. But it seems that the situation is very tricky (@abgruszecki and I have discussed this before): SR sometimes relies on the reset of approx state to derive the desired constraint. So solutions like freezing GADT constraints when we reset approx state will not work. Consider the following example:

enum SUB[-A, +B]:
  case Refl[X]() extends SUB[X, X]

trait E[+X]
case class L[A]() extends E[A]

def foo[A, B](e: SUB[L[A], E[B]], x: A): B = e match {
  case SUB.Refl() =>
    x
}

Here we are trying to extract necessary constraint from L[A] <:< E[B], where L is a subclass of E. Ideally, we should derive the constraint A <: B. The TypeComparer will call tryBaseType when checking the subtype relationship L[A] <:< E[B], and approximate LHS to E[A]. The approx state (approx low) is set when we compare E[A] <:< E[B] in the next step. If we do not reset approx state when comparing the applied arguments A <:< B, we will not be able to derive the desired constraint.

Linyxus · 2022-06-10T18:35:49Z

It seems that the approx state is not only unstable (may be reset), but also unprecise for the use of GADT reasoning (it works fine for type inference). For example, E[A] <: E[B] will be the only (necessary) way for L[A] <:< B[A] to hold. So upcasting E[A] to L[A] should not stop us from deriving the constraint. But the approx state will still be set in this case.

dwijnand · 2022-06-11T09:07:04Z

SR sometimes relies on the reset of approx state to derive the desired constraint

SR?

Great thoughts. I think we should experiment with the impact of disabling the reset while inferring GADT constraints (or, more generally, necessaryConstraintsOnly?). It's the only way to be aware of possible unsoundness in GADT reasoning. Correct me if I'm wrong, but I believe we significantly bias to soundness rather than completeness for type inference/checking, so we should do the same for GADT reasoning.

I also think we should make it harder and/or more obvious that the approximation is reset: isSubType(S, T) vs recur(S, T) seems too subtle to me.

abgruszecki · 2022-06-16T19:59:02Z

SR = Subtyping Reconstruction.

recur indicates that we're recursing on a subcomponent of the current type, while isSubType is recursing on a "new" type. I was planning to rewrite TypeComparer so that calling either would result in a frozen GADT constraint, unless a boolean flag was explicitly passed to the call. That way, we could "whitelist" recursion points which we know are sound when reconstructing subtyping / inferring GADT constraint.

Looking at the two PRs once again, I think #12087 might indeed be just better than this approach. The way it's implemented, we go from tryBaseType into fourthTry, which has specialised code for handling AndType. I suggested at one point to @Linyxus that the changes from this PR should be moved to tryBaseType, but I think simply going to fourthTry like I did is still preferrable.

@LPTK: the entire TypeComparer is written in an extremely order-dependent way. We can always rely that aliases etc. are handled very early, and we can rely on the type shapes being handled in a very particular way. Changing that would be fundamentally changing TypeComparer.

ckipp01 · 2023-05-10T07:36:24Z

Hello! Since it's been a while I wanted to check in on this @Linyxus. Do you plan on returning here, or are you waiting for more input, etc?

Linyxus · 2023-05-11T12:54:25Z

Hello @ckipp01! We will wait for a refactorization of GADT (#15928) before proceeding this feature. I will close this for now and go back to it after the refactorization is done.

try LHS and type before tryBaseType when comparing applied types in G…

92aabde

…ADT mode

prolativ requested a review from abgruszecki May 16, 2022 13:23

prolativ assigned abgruszecki May 16, 2022

dwijnand mentioned this pull request Jun 9, 2022

Fix i11545 #12087

Closed

dwijnand reviewed Jun 10, 2022

View reviewed changes

abgruszecki mentioned this pull request Jun 20, 2022

Comparing Refined Types can lead to invalid subtype reconstruction #15485

Open

Linyxus closed this May 11, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Try `either` when compare an intersection against an applied type in GADT mode #15175

Try `either` when compare an intersection against an applied type in GADT mode #15175

Linyxus commented May 12, 2022 •

edited

Loading

dwijnand commented Jun 9, 2022 •

edited

Loading

LPTK commented Jun 10, 2022

dwijnand Jun 10, 2022

dwijnand commented Jun 10, 2022

Linyxus commented Jun 10, 2022

Linyxus commented Jun 10, 2022 •

edited

Loading

Linyxus commented Jun 10, 2022

Linyxus commented Jun 10, 2022 •

edited

Loading

dwijnand commented Jun 11, 2022

abgruszecki commented Jun 16, 2022

ckipp01 commented May 10, 2023

Linyxus commented May 11, 2023

	ctx.mode.is(Mode.GadtConstraintInference) && {
	necessaryConstraintsOnly && {

Try either when compare an intersection against an applied type in GADT mode #15175

Try either when compare an intersection against an applied type in GADT mode #15175

Conversation

Linyxus commented May 12, 2022 • edited Loading

dwijnand commented Jun 9, 2022 • edited Loading

LPTK commented Jun 10, 2022

dwijnand Jun 10, 2022

Choose a reason for hiding this comment

dwijnand commented Jun 10, 2022

Linyxus commented Jun 10, 2022

Linyxus commented Jun 10, 2022 • edited Loading

Linyxus commented Jun 10, 2022

Linyxus commented Jun 10, 2022 • edited Loading

dwijnand commented Jun 11, 2022

abgruszecki commented Jun 16, 2022

ckipp01 commented May 10, 2023

Linyxus commented May 11, 2023

Try `either` when compare an intersection against an applied type in GADT mode #15175

Try `either` when compare an intersection against an applied type in GADT mode #15175

Linyxus commented May 12, 2022 •

edited

Loading

dwijnand commented Jun 9, 2022 •

edited

Loading

Linyxus commented Jun 10, 2022 •

edited

Loading

Linyxus commented Jun 10, 2022 •

edited

Loading