Scaffold IPC-based API #711

Merged
merged 71 commits into from
Apr 10, 2025

andrewbranch
Member

@andrewbranch andrewbranch commented Mar 25, 2025

Important

Until libsyncrpc is set up to publish to npm, this PR takes a git dependency on it, which will build the binary from source during npm install. You need Rust 1.85 or higher to have a successful npm install in typescript-go.

Note

Takeaways from design meeting:

  • Investigate some kind of query system for composing batched requests. Muffled screams of “Not GraphQL!” and “I want GraphQL!” could be heard on the other end of the line.
  • FFI via napi-go still on the table to investigate as an additional option (not replacing IPC), but may need a different API surface or refactoring since the one included here returns shallow serializable objects.
  • No consensus on whether remoting ASTs means we don’t need to provide a client-side parser. The performance of remoting is pretty good, but requires loading the large tsgo binary, and the perf maybe isn’t good enough for linters that need to remote thousands of ASTs. There are other JS parsers though, and not having to maintain one with identical behavior to the Go-based parser is seen as a real win.

This PR is the start of a JavaScript API client and Go API server that communicate over STDIO. Only a few methods are implemented; the aim of this PR is to be the basis for discussions around the general architecture, then additional functionality can be filled in.

Same backend, different clients

This PR includes a synchronous JavaScript client for Node.js. It uses libsyncrpc to block during IPC calls to the server. Relatively small changes to the client could produce an asynchronous variant without Node.js-specific native bindings that could work in Deno or Bun. I don’t want to make specific promises about WASM without doing those experiments, but using the same async client with an adapter for calling WASM exports seems possible. I’m imagining that eventually we’ll publish the Node.js-specific sync client as a standalone library for those who need a sync API, and an async version adaptable to other use cases, ideally codegen’d from the same source. The same backend is intended to be used with any out-of-process client.

Client structure

This PR creates two JavaScript packages, @typescript/ast and @typescript/api (which may make more sense as @typescript/api-sync or @typescript/api-node eventually). The former contains a copy of TS 5.9’s AST node definitions, related enums, and node tests (e.g. isIdentifier()), with the minor changes that TS 7 has made to those definitions applied. The latter contains the implementation of the Node.js API client. It currently takes a path to the tsgo executable and spawns it as a child process. (I imagine eventually, the TypeScript 7.0+ compiler npm package will be a peerDependency of the API client, and resolution of the executable can happen automatically.)

Backend structure

tsgo api starts the API server communicating over STDIO. The server initializes the api.API struct which is responsible for handling requests and managing state, like a stripped-down project.Service. In fact, it uses the other components of the project system, storing documents and projects the same way. (As the project service gets built out with things like file watchers and optimizations for find-all-references, it would get increasingly unwieldy to use directly as an API service, but a future refactor might extract the basic project and document storage to a shared component.)

The API already has methods that return projects, symbols, and types. These are returned as IDs plus bits of easily serializable info, like name and flags. When one of these objects is requested, the API server stores it with its ID so follow-up requests can be made against those IDs. This does create some memory management challenges, which I’ll discuss a bit later.

Implemented functionality

Here’s a selection of the API client type definitions that shows what methods exist as of this PR:

export interface APIOptions {
    tsserverPath: string;
    cwd?: string;
    logFile?: string;
    fs?: FileSystem;
}

export interface FileSystem {
    directoryExists?: (directoryName: string) => boolean | undefined;
    fileExists?: (fileName: string) => boolean | undefined;
    getAccessibleEntries?: (directoryName: string) => FileSystemEntries | undefined;
    readFile?: (fileName: string) => string | null | undefined;
    realpath?: (path: string) => string | undefined;
}

export declare class API {
    constructor(options: APIOptions);
    parseConfigFile(fileName: string): ConfigResponse;
    loadProject(configFileName: string): Project;
}

export interface ConfigResponse {
    options: Record<string, unknown>;
    fileNames: string[];
}

export declare class Project {
    configFileName: string;
    compilerOptions: Record<string, unknown>;
    rootFiles: readonly string[];

    reload(): void;
    getSourceFile(fileName: string): SourceFile | undefined;
    getSymbolAtLocation(node: Node): Symbol | undefined;
    getSymbolAtLocation(nodes: readonly Node[]): (Symbol | undefined)[];
    getSymbolAtPosition(fileName: string, position: number): Symbol | undefined;
    getSymbolAtPosition(fileName: string, positions: readonly number[]): (Symbol | undefined)[];
    getTypeOfSymbol(symbol: Symbol): Type | undefined;
    getTypeOfSymbol(symbols: readonly Symbol[]): (Type | undefined)[];
}

export interface Node {
    readonly id: number;
    readonly pos: number;
    readonly end: number;
    readonly kind: SyntaxKind;
    readonly parent: Node;
    forEachChild<T>(visitor: (node: Node) => T): T | undefined;
    getSourceFile(): SourceFile;
}

export interface SourceFile extends Node {
    readonly kind: SyntaxKind.SourceFile;
    // Node types are basically same as Strada, without additional methods
    readonly statements: NodeArray<Statement>;
    readonly text: string;
}

export declare class Symbol {
    id: string;
    name: string;
    flags: SymbolFlags;
    checkFlags: number;
}

export declare class Type {
    flags: TypeFlags;
}

Here’s some example usage from benchmarks:

import { API } from "@typescript/api";
import { SyntaxKind } from "@typescript/ast";

const api = new API({
    cwd: new URL("../../../", import.meta.url).pathname,
    tsserverPath: new URL("../../../built/local/tsgo", import.meta.url).pathname,
});

const project = api.loadProject("_submodules/TypeScript/src/compiler/tsconfig.json");
const file = project.getSourceFile("program.ts")!;

file.forEachChild(function visit(node) {
  if (node.kind === SyntaxKind.Identifier) {
    const symbol = project.getSymbolAtPosition("program.ts", node.pos);
    // ...
  }
  node.forEachChild(visit);
});
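
The traversal above makes one round trip per identifier. As a rough sketch of the batched alternative, the positions can be collected first and passed to the array overload of getSymbolAtPosition listed in the type definitions. NodeLike, ProjectLike, collectPositions, and getSymbolsForKind are hypothetical stand-ins so the snippet runs without the real packages; they are not part of @typescript/api.

```typescript
// Minimal structural stand-ins (assumptions) for the client's Node and Project.
interface NodeLike {
    kind: number;
    pos: number;
    forEachChild(visitor: (node: NodeLike) => void): void;
}

interface ProjectLike<S> {
    // Mirrors the array overload shown in the type definitions.
    getSymbolAtPosition(fileName: string, positions: readonly number[]): (S | undefined)[];
}

// Walk the tree once and record the start position of every node of `kind`.
function collectPositions(root: NodeLike, kind: number): number[] {
    const positions: number[] = [];
    root.forEachChild(function visit(node) {
        if (node.kind === kind) {
            positions.push(node.pos);
        }
        node.forEachChild(visit);
    });
    return positions;
}

// Resolve all collected positions with a single call to the project.
function getSymbolsForKind<S>(
    project: ProjectLike<S>,
    root: NodeLike,
    fileName: string,
    kind: number,
): (S | undefined)[] {
    return project.getSymbolAtPosition(fileName, collectPositions(root, kind));
}
```

With the real client, the call would presumably look like getSymbolsForKind(project, file, "program.ts", SyntaxKind.Identifier), assuming the real Node and Project types are structurally compatible with these stand-ins.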

Client-side virtual file systems are also supported. There’s a helper for making a very simple one from a record:

import { API } from "@typescript/api";
import { createVirtualFileSystem } from "@typescript/api/fs";
import { SyntaxKind } from "@typescript/ast";

const api = new API({
    cwd: new URL("../../../", import.meta.url).pathname,
    tsserverPath: new URL("../../../built/local/tsgo", import.meta.url).pathname,
    fs: createVirtualFileSystem({
        "/tsconfig.json": "{}",
        "/src/index.ts": `import { foo } from './foo';`,
        "/src/foo.ts": `export const foo = 42;`,
    }),
});
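
To make the FileSystem callbacks more concrete, here is a minimal sketch of a record-backed file system. It is not the real createVirtualFileSystem: the name createRecordFileSystem is hypothetical, the FileSystemEntries shape is an assumption (the excerpt above does not show it), and returning undefined for unknown paths is a guess at reasonable behavior rather than documented semantics.

```typescript
// Assumed shape: the type definitions above do not show FileSystemEntries.
interface FileSystemEntries {
    files: string[];
    directories: string[];
}

// Copied from the type definitions above.
interface FileSystem {
    directoryExists?: (directoryName: string) => boolean | undefined;
    fileExists?: (fileName: string) => boolean | undefined;
    getAccessibleEntries?: (directoryName: string) => FileSystemEntries | undefined;
    readFile?: (fileName: string) => string | null | undefined;
    realpath?: (path: string) => string | undefined;
}

// Hypothetical helper: builds every callback from a path -> contents record.
function createRecordFileSystem(files: Record<string, string>): Required<FileSystem> {
    const paths = Object.keys(files);
    const byPath = new Map<string, string>();
    paths.forEach(path => byPath.set(path, files[path]));
    const asPrefix = (dir: string): string => (dir.endsWith("/") ? dir : dir + "/");

    return {
        fileExists: fileName => byPath.has(fileName),
        // A directory exists if any file path lives underneath it.
        directoryExists: directoryName => {
            const prefix = asPrefix(directoryName);
            return paths.some(path => path.startsWith(prefix));
        },
        // List the immediate children of a directory, split into files and directories.
        getAccessibleEntries: directoryName => {
            const prefix = asPrefix(directoryName);
            const fileNames: string[] = [];
            const directoryNames: string[] = [];
            for (const path of paths) {
                if (!path.startsWith(prefix)) continue;
                const rest = path.slice(prefix.length);
                const slash = rest.indexOf("/");
                if (slash === -1) {
                    fileNames.push(rest);
                } else if (directoryNames.indexOf(rest.slice(0, slash)) === -1) {
                    directoryNames.push(rest.slice(0, slash));
                }
            }
            return { files: fileNames, directories: directoryNames };
        },
        readFile: fileName => byPath.get(fileName),
        // No symlinks in a flat record, so every path is already real.
        realpath: path => path,
    };
}
```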

Performance

These are the results of the included benchmarks on my M2 Mac. Note that IPC is very fast on Apple Silicon, and Windows seems to see significantly more overhead per call. Tasks prefixed TS - refer to the rough equivalent with the TypeScript 5.9 API. The getSymbolAtPosition tasks are operating on TypeScript’s program.ts, which has 10893 identifiers.

┌─────────┬─────────────────────────────────────────────────────┬─────────────────────┬───────────────────────────┬────────────────────────┬────────────────────────┬─────────┐
│ (index) │ Task name                                           │ Latency avg (ns)    │ Latency med (ns)          │ Throughput avg (ops/s) │ Throughput med (ops/s) │ Samples │
├─────────┼─────────────────────────────────────────────────────┼─────────────────────┼───────────────────────────┼────────────────────────┼────────────────────────┼─────────┤
│ 0       │ 'spawn API'                                         │ '3811417 ± 2.32%'   │ '3562750 ± 137292.00'     │ '268 ± 1.51%'          │ '281 ± 11'             │ 263     │
│ 1       │ 'echo (small string)'                               │ '10145 ± 0.30%'     │ '8792.0 ± 2375.00'        │ '116283 ± 0.24%'       │ '113740 ± 32663'       │ 98570   │
│ 2       │ 'echo (large string)'                               │ '802872 ± 1.27%'    │ '783375 ± 82458.00'       │ '1285 ± 0.86%'         │ '1277 ± 133'           │ 1246    │
│ 3       │ 'echo (small Uint8Array)'                           │ '11170 ± 0.40%'     │ '9750.0 ± 2458.00'        │ '104702 ± 0.24%'       │ '102564 ± 27325'       │ 89529   │
│ 4       │ 'echo (large Uint8Array)'                           │ '540871 ± 1.95%'    │ '498542 ± 71500.00'       │ '1989 ± 0.99%'         │ '2006 ± 285'           │ 1849    │
│ 5       │ 'load project'                                      │ '8275640 ± 19.35%'  │ '7099958 ± 337374.00'     │ '136 ± 2.83%'          │ '141 ± 7'              │ 121     │
│ 6       │ 'load project (client FS)'                          │ '95038294 ± 5.86%'  │ '93161042 ± 7466791.50'   │ '11 ± 3.38%'           │ '11 ± 1'               │ 64      │
│ 7       │ 'TS - load project'                                 │ '380236148 ± 1.55%' │ '375040917 ± 14878416.50' │ '3 ± 1.47%'            │ '3 ± 0'                │ 64      │
│ 8       │ 'transfer debug.ts'                                 │ '732232 ± 3.04%'    │ '681083 ± 12334.00'       │ '1434 ± 0.56%'         │ '1468 ± 27'            │ 1366    │
│ 9       │ 'transfer program.ts'                               │ '2660346 ± 4.04%'   │ '2431708 ± 53875.00'      │ '395 ± 1.33%'          │ '411 ± 9'              │ 377     │
│ 10      │ 'transfer checker.ts'                               │ '27890882 ± 2.28%'  │ '26820333 ± 433416.50'    │ '36 ± 1.92%'           │ '37 ± 1'               │ 64      │
│ 11      │ 'materialize program.ts'                            │ '1418572 ± 2.33%'   │ '1376500 ± 10417.00'      │ '715 ± 0.46%'          │ '726 ± 5'              │ 705     │
│ 12      │ 'materialize checker.ts'                            │ '23262525 ± 18.32%' │ '21435625 ± 1205000.00'   │ '47 ± 3.34%'           │ '47 ± 3'               │ 64      │
│ 13      │ 'getSymbolAtPosition - one location'                │ '11805 ± 1.84%'     │ '11042 ± 959.00'          │ '88571 ± 0.11%'        │ '90563 ± 8202'         │ 84712   │
│ 14      │ 'TS - getSymbolAtPosition - one location'           │ '918.37 ± 0.36%'    │ '917.00 ± 1.00'           │ '1098668 ± 0.01%'      │ '1090513 ± 1191'       │ 1088885 │
│ 15      │ 'getSymbolAtPosition - 10893 identifiers'           │ '140520504 ± 1.44%' │ '138652312 ± 2446271.00'  │ '7 ± 1.13%'            │ '7 ± 0'                │ 64      │
│ 16      │ 'getSymbolAtPosition - 10893 identifiers (batched)' │ '27321398 ± 4.48%'  │ '26289875 ± 672916.50'    │ '37 ± 2.24%'           │ '38 ± 1'               │ 64      │
│ 17      │ 'getSymbolAtLocation - 10893 identifiers'           │ '130045149 ± 1.26%' │ '128063583 ± 2391709.00'  │ '8 ± 1.09%'            │ '8 ± 0'                │ 64      │
│ 18      │ 'getSymbolAtLocation - 10893 identifiers (batched)' │ '20973507 ± 6.26%'  │ '19680167 ± 432749.50'    │ '49 ± 2.82%'           │ '51 ± 1'               │ 64      │
│ 19      │ 'TS - getSymbolAtLocation - 10893 identifiers'      │ '12942349 ± 27.49%' │ '11076708 ± 130521.00'    │ '89 ± 2.62%'           │ '90 ± 1'               │ 78      │
└─────────┴─────────────────────────────────────────────────────┴─────────────────────┴───────────────────────────┴────────────────────────┴────────────────────────┴─────────┘

To editorialize these numbers a bit: in absolute terms, this is pretty fast, even transferring large payloads like a binary-encoded checker.ts (10). On the order of tens, hundreds, or thousands of API calls, most applications probably wouldn’t notice a per-call regression over using the TypeScript 5.9 API, and may speed up if program creation / parsing multiple files is a significant portion of their API consumption today (5–7). However, the IPC overhead is pretty noticeable when looking at hundreds of thousands of back-to-back calls on an operation that would be essentially free in a native JavaScript API, like getting the symbol for every identifier in a large file (15, 17). For that reason, we’ll be very open to including bulk/batch/composite API methods that reduce the number of round trips needed to retrieve lots of information for common scenarios (16, 18).
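
As a back-of-the-envelope check on the round-trip cost, the whole-file averages from the table can be divided by the 10893 identifiers in program.ts. The helper names below are hypothetical; the numbers are copied from tasks 15 through 19.

```typescript
// Average latencies in nanoseconds, copied from the benchmark table above.
const IDENTIFIER_COUNT = 10893;
const avgLatencyNs = {
    positionUnbatched: 140520504, // task 15
    positionBatched: 27321398, // task 16
    locationUnbatched: 130045149, // task 17
    locationBatched: 20973507, // task 18
    tsLocation: 12942349, // task 19
};

// Average cost attributed to each identifier in a whole-file run.
function perIdentifierNs(totalNs: number): number {
    return totalNs / IDENTIFIER_COUNT;
}

// How many times faster the batched run is than the unbatched one.
function batchSpeedup(unbatchedNs: number, batchedNs: number): number {
    return unbatchedNs / batchedNs;
}

console.log("getSymbolAtPosition, ns per call:", Math.round(perIdentifierNs(avgLatencyNs.positionUnbatched)));
console.log("getSymbolAtPosition batched, ns per identifier:", Math.round(perIdentifierNs(avgLatencyNs.positionBatched)));
console.log("TS getSymbolAtLocation, ns per identifier:", Math.round(perIdentifierNs(avgLatencyNs.tsLocation)));
console.log("position speedup:", batchSpeedup(avgLatencyNs.positionUnbatched, avgLatencyNs.positionBatched).toFixed(1));
console.log("location speedup:", batchSpeedup(avgLatencyNs.locationUnbatched, avgLatencyNs.locationBatched).toFixed(1));
```

On these figures, an unbatched getSymbolAtPosition call costs roughly 12.9 µs, batching brings that to about 2.5 µs per identifier (around a 5x improvement, or about 6x for getSymbolAtLocation), and the TypeScript 5.9 equivalent sits near 1.2 µs per identifier.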

Memory management

The current API design uses opaque IDs for objects like symbols and types, so the client can receive a handle to one of these objects and then query for additional information about it. For example, implemented in this PR is getTypeOfSymbol, which takes a symbol ID. The server has to store the symbol in a map so it can be quickly retrieved when the client asks for its type. This client/server split presents two main challenges:

  1. When the client makes two calls that result in the same symbol or same type, the client should return the strict-equal same object, while allowing garbage collection to work on those objects.
  2. When one of those client objects goes out of scope, it should eventually be released from the server, so server memory doesn’t grow indefinitely.

To accomplish this, there is a client-side object registry that stores objects by their IDs. API users will need to explicitly dispose those objects to release them both from the client-side store and from the server. (Server objects may be automatically released in response to program updates, and making additional queries against them will result in an error.) This can be done with the .dispose() method:

{
  const symbol = project.getSymbolAtPosition("program.ts", 0);
  if (symbol) {
    // ...
  }
  symbol?.dispose();
}

or with explicit resource management:

{
  using symbol = project.getSymbolAtPosition("program.ts", 0);
  if (symbol) {
    // ...
  }
}
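
For illustration, here is a minimal sketch of how a client-side registry could give strict-equal objects per ID and release them on dispose. SymbolData, RemoteSymbol, and ObjectRegistry are hypothetical names, not the classes in this PR, and the releaseOnServer callback is an assumption standing in for whatever IPC request actually frees the server-side entry.

```typescript
// Hypothetical payload the server might send for a symbol.
interface SymbolData {
    id: string;
    name: string;
}

// Client-side handle; dispose() removes it from the registry and the server.
class RemoteSymbol {
    readonly id: string;
    readonly name: string;
    private readonly registry: ObjectRegistry;

    constructor(data: SymbolData, registry: ObjectRegistry) {
        this.id = data.id;
        this.name = data.name;
        this.registry = registry;
    }

    dispose(): void {
        this.registry.release(this.id);
    }
}

class ObjectRegistry {
    private readonly symbols = new Map<string, RemoteSymbol>();
    private readonly releaseOnServer: (id: string) => void;

    // releaseOnServer is an assumed hook for the server-side release request.
    constructor(releaseOnServer: (id: string) => void) {
        this.releaseOnServer = releaseOnServer;
    }

    // Return the existing object for an ID so repeated responses are ===.
    getOrCreate(data: SymbolData): RemoteSymbol {
        let symbol = this.symbols.get(data.id);
        if (!symbol) {
            symbol = new RemoteSymbol(data, this);
            this.symbols.set(data.id, symbol);
        }
        return symbol;
    }

    // Drop the client entry and notify the server once; later calls are no-ops.
    release(id: string): void {
        if (this.symbols.delete(id)) {
            this.releaseOnServer(id);
        }
    }
}
```

Because this sketch holds strong references, an object only becomes collectable after dispose() is called, which matches the explicit-disposal model described above.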

@acutmore

acutmore commented Apr 6, 2025

I was digging through this PR and didn't spot a transpileModule API.

Just to note that the existing transpileModule API is almost more of a convenience API, in that under the hood each call constructs a new Project that only contains the one file. So it looks like, while this PR doesn't explicitly add transpileModule, it does, I think, expose (almost) enough of an API to demonstrate that TS7 should be able to expose transpileModule in a similar manner, as there is import { createVirtualFileSystem } from "@typescript/api/fs"; to create a temporary in-memory project.

@andrewbranch
Member Author

transpileModule is not implemented here yet—this PR contains a very small fraction of methods we’ll want to expose eventually.

@jakebailey
Member

> From offline chats it sounds like ts-loader would probably not work with the new Go API in type checking mode, as ts-loader goes deep into the guts of the TypeScript APIs.

FWIW I think this is the opposite of what I was trying to say; I can't imagine us offering a public API that doesn't let one construct a program, get its errors, and get the outputs. But I don't know which "guts" you're speaking to exactly. Of course, downstream API consumers will require modification for sure, so "not work" is true from that perspective.

@andrewbranch
Member Author

John is referring to offline chats with me, because I do know the guts in question, and ts-loader’s API usage is extremely broad, low-level, and implementation-specific.

@jakebailey
Member

Ah, multiple private chats 😄

@andrewbranch andrewbranch enabled auto-merge April 10, 2025 22:43
@andrewbranch andrewbranch added this pull request to the merge queue Apr 10, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 10, 2025
@jakebailey
Member

Failed to merge because the merge queue CI workflow also needs Rust installed.

I should probably just delete that thing and instead use the regular CI job with a bunch of if statements, or something.

@andrewbranch andrewbranch enabled auto-merge April 10, 2025 23:11
@andrewbranch andrewbranch added this pull request to the merge queue Apr 10, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 10, 2025
@andrewbranch andrewbranch enabled auto-merge April 10, 2025 23:35
@andrewbranch andrewbranch added this pull request to the merge queue Apr 10, 2025
Merged via the queue into microsoft:main with commit a5454eb Apr 10, 2025
23 checks passed
@andrewbranch andrewbranch deleted the api branch April 10, 2025 23:51
shinichy pushed a commit to shinichy/typescript-go that referenced this pull request Apr 12, 2025
6 participants