Skip to content

Plugin extensions to enable robust checking for unawaited coroutines in Trio programs #5650

Open
@njsmith

Description

@njsmith

Forking this off from #2499, as requested by Jukka. The discussion is about how Trio programs can use mypy to detect when they've accidentally forgotten an await, which is a fairly common problem. (See also the corresponding Trio bug, which has examples: python-trio/trio#671)

What makes Trio special here is that by design, it's extremely simple for a static analysis tool to detect when an await is missing: when you see a function that returns an Awaitable, then you should always see an await immediately there to consume the Awaitable; if you don't, that's a mistake. This is a much simpler check than what you need to detect this mistake in asyncio programs (which is what #2499 is about), but it's also of less universal applicability, so Jukka suggested it should be a plugin.

This check needs to look at each call site, and get two pieces of information: (1) does this call return an Awaitable?, (2) if so, is it the direct child of an Await node?

The existing function/method hooks only let you define one hook per call site, and are used to determine the call return type in complicated cases (e.g. open). So logically our new check ought to run after those hooks, and certainly doesn't want to replace them.

Jukka suggested adding a new type of "check" hook that runs at that stage, and can report issues but doesn't mutate types.

I was originally thinking that this check might actually want to mutate types. Rationale: If someone writes a = some_async_func(); a.some_method(), then there are potentially two errors you can report: (1) missing await in front of some_async_func; (2) coroutine object has no attribute some_method. Obviously the first error is the real one; the second one is just shrapnel from the first mistake, and maybe you don't even want to print it, because it's just going to distract the user from the real mistake. So maybe to get the highest quality error messages, after reporting the missing await you might want to coerce the call expression's type to make it act like the await was actually there, so that follow-up errors are more useful. OTOH, I realized that if for some reason you really did want to skip an await (e.g., because you're running this check over trio's internals, which do have to create bare coroutine objects in a few places), then you need some way to do that, and # type: ignore can suppress an error message but it can't suppress the type coercion. So maybe we should just live with the shrapnel and a pure "check" hook is good enough? Throwing this out there in case you experts have any thoughts :-).

Anyway, if we go with the check hook, there are two possible approaches that come to mind:

  • we could have local/specialized check_function/check_method hooks that run on function/method calls, similar to the existing hooks, and give them a custom Context that includes information about whether the parent node is an Await.

  • we could have a very generic check_file hook that runs after the file has been fully type-annotated (so I guess after checker phase 2?), and can walk the tree and do whatever it wants with what it sees.

The latter has the advantage that you can imagine other plugins taking advantage of this API besides just this one, and that you wouldn't have to hard-code strange things like parent-node-is-an-Await, because the check hook could compute this as it walked the tree. It might also be useful if we want to get really fancy and add trio-asyncio support: with trio-asyncio you can have parts of the file that are trio code, and other parts that are asyncio code, and a checker could potentially figure out which is which and only complain about missing awaits in the trio code... so that all seems pretty promising.

But what I'm not as sure of is whether there's actually a good moment to run a hook like this. Does mypy ever actually store all the types, or just compute them on the fly? (It looks like the checker's type_map attribute may store them?) Would it cause any problems with all the fancy incremental/caching stuff that mypy does these days?

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions