Support for tool/function calling #7


Open
christianliebel opened this issue Jul 2, 2024 · 10 comments
Labels
ecosystem parity (A feature that other popular language model APIs offer), enhancement (New feature or request)

Comments

@christianliebel

I would like to request the addition of tool/function calling functionality to the Prompt API. This feature is available in some models and allows the model to invoke specific actions using a well-defined contract, typically in JSON format. This functionality is beneficial for various use cases that require outputs of a specific structure.

Examples:

domenic added the enhancement (New feature or request) label Jul 29, 2024
domenic added a commit that referenced this issue Aug 14, 2024
@KenjiBaheux

Thank you for the suggestion to add tool/function calling functionality.

To assess the feasibility of this feature, we would appreciate it if folks could provide more details on the typical context size required for defining the functions needed in actual use cases. This information will help us understand the potential impact on performance and resource requirements, especially for an on-device context.

@CakeCrusher

Function calling at this model size is not practical, although constrained generation would be a practical intermediate solution. You could delegate constrained generation to the user by giving them access to the output logits.

This is the most practical approach, IMO.

@ChristianWeyer

Function calling may indeed be too heavy for these models.
However, models fine-tuned for JSON data extraction would be really helpful. Then we could use established patterns such as those implemented in Instructor (https://js.useinstructor.com/).
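
To illustrate the kind of pattern meant here, a minimal userland sketch of schema-guided extraction on top of session.prompt(); the schema text, field names, and bare JSON.parse check are assumptions for illustration, not part of the Prompt API or of Instructor itself.

const schema = `{
  "title": string,            // a short headline
  "sentiment": "positive" | "neutral" | "negative"
}`;

async function extract(session, text) {
  const raw = await session.prompt(
    "Extract the following fields from the text and reply with JSON only, " +
    "matching this shape:\n" + schema + "\n\nText:\n" + text
  );
  try {
    // An Instructor-style helper would also validate the parsed object
    // against the schema; this sketch only checks that it is valid JSON.
    return JSON.parse(raw);
  } catch {
    throw new Error("Model did not return valid JSON: " + raw);
  }
}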

domenic added a commit that referenced this issue Oct 9, 2024
@schlessera

Function calling might be viable for specially tuned models even at lower sizes. Having a flexible API in place to allow for function execution would enable experimentation in that direction. There might be alternative ways to make this work with limited resources, such as dedicated helper logic in the browser to structure input and output or to extract and format arguments. And with advances in training smaller models, they could still improve drastically in this area. There could also be a pathway where two smaller models with separate responsibilities collaborate: one for understanding natural language and one for reasoning about the problem at hand outside the boundaries of human language.

If the model can, within a certain threshold of reliability, assess whether a function might be suited to solving an identified task, it can say so and pass the arguments back to the consumer code. The consumer code could then opt to either run that function directly on the browser thread or forward it to a service worker. It might even make sense to make service workers the default execution model, so that the in-browser API knows about the functions and the service worker that executes them, and the entire flow can run without requiring intermediate assistance from the main thread.

This could even allow browser extensions to provide a set of standard functions that augment the capabilities of an LLM in a way that is easy for end users (provided the security implications are handled correctly).
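
A purely illustrative sketch of that dispatch flow: the { name, arguments } tool-call shape, the tools map, and the service-worker message format below are all hypothetical, not part of any proposal.

// Hypothetical tool-call shape surfaced to consumer code: { name, arguments }.
async function handleToolCall(call, tools, { useServiceWorker = false } = {}) {
  const tool = tools[call.name];
  if (!tool) throw new Error("Unknown tool: " + call.name);

  if (useServiceWorker && navigator.serviceWorker?.controller) {
    // Forward the call to a service worker so the main thread stays free.
    return new Promise((resolve) => {
      const channel = new MessageChannel();
      channel.port1.onmessage = (event) => resolve(event.data);
      navigator.serviceWorker.controller.postMessage(
        { type: "tool-call", name: call.name, arguments: call.arguments },
        [channel.port2]
      );
    });
  }

  // Otherwise run the function directly on the page's thread.
  return tool(call.arguments);
}

Defaulting to the service-worker path would match the "no main-thread assistance" idea above, at the cost of requiring a registered worker that understands the message format.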

@domenic
Collaborator

domenic commented Jun 5, 2025

We think this would be an exciting next step.

Looking at popular APIs today (OpenAI HTTP docs, OpenAI TypeScript Agents SDK docs, Anthropic docs, Gemini docs, Vercel AI SDK docs) it seems like the common info for each tool is:

  • Name
  • Description (natural language)
  • Input arguments JSON schema

For a JavaScript (instead of HTTP) API, we can also provide the functions directly.

It's interesting to compare these docs to the MCP docs, which add annotations. I think those are probably less important for our use case? And we can always add them later. So, I think the following is a good starting point:

const result = await session.prompt(
  "What is the weather like in San Francisco?",
  {
    tools: [
      {
        name: 'weather',
        description: 'Get the weather in a location',
        inputSchema: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city and state, e.g. San Francisco, CA"
            }
          },
          required: ["location"]
        },
        async execute({ location }) {
          const res = await fetch("https://weatherapi.example/?location=" + encodeURIComponent(location));

          // We could allow returning JS objects directly and stringifying them, but that
          // seems risky...
          return JSON.stringify(await res.json());
        }
      }
    ]
  }
);

Some minor points to bikeshed:

  • I like the Vercel AI SDK's use of object literals instead of an array of named items, e.g. I think tools: { weather: { description, parameters, execute } } would be nicer than the above. However, nobody else does that. And in the future, if we wanted to offer built-in tools, the above design is more flexible, since the built-in tools could be entries in the array such as "builtin:toolname" or LanguageModel.ToolName. So, I am inclined to stick with the array version.

  • There's a split between input_schema (Anthropic API), inputSchema (MCP), and parameters (everyone else). I find "input schema" to be a good bit clearer, so I went with that, but I could be persuaded to align with the majority.

  • The name execute() has several possible alternatives (run() being one), but I've found at least two places (OpenAI Agents SDK and Vercel AI SDK) using execute() so far.

  • It's not clear whether the name is necessary for our JavaScript API use case. In the HTTP APIs, it's used so that in the HTTP response, the model can tell the developer which tool it's calling. In our case, that whole process is hidden from the developer since they just provide a JavaScript function. So in theory it's not necessary. But I do suspect that models might get better performance with named tools. If we auto-generated tool names behind the scenes (like tool0, tool1, etc.) I wonder if the models might mess up more, compared to developer-supplied semantic names. We should probably test this!

@tomayac
Contributor

tomayac commented Jun 5, 2025

How do you envision error handling working (Vercel example)?

@domenic
Collaborator

domenic commented Jun 5, 2025

Good question.

  • Rejected promises from execute() should just cause the prompt to fail, bubbling the error to the return value of prompt().
  • I guess we'll have to vend specific errors for cases where the model screws up, like Vercel's InvalidToolArgumentsError and NoSuchToolError?
  • In all cases, errors should wipe all the related messages from the session.
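
A rough sketch of what those semantics could look like from the calling side, assuming the per-prompt tools option from the example above; the error names in the comment are placeholders borrowed from Vercel, not settled API.

try {
  const result = await session.prompt(
    "What is the weather like in San Francisco?",
    { tools: [weatherTool] } // the tool object from the example above; its execute() may reject
  );
  console.log(result);
} catch (err) {
  // A rejection from execute() bubbles out of prompt() unchanged, while
  // model-side mistakes would surface as dedicated error types
  // (placeholder names: NoSuchToolError, InvalidToolArgumentsError).
  // Either way, the related messages are wiped from the session.
  console.error(err.name, err.message);
}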

@sushraja-msft
Contributor

Thanks for taking this forward, domenic. I think the tool declaration needs to be on LanguageModel.create and be a session attribute:

  • Having the developer provide available tools on each turn would be redundant and also confusing for the model if the available tools change with each turn.
  • Some models (Phi4) expect the tools to be declared in the system prompt.

@domenic
Collaborator

domenic commented Jun 6, 2025

Interesting. That seems like a reasonable restriction to me. Creating sessions shouldn't be that hard for developers.

So, concretely, we'd place them in LanguageModelCreateCoreOptions.
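
A sketch of what that could look like, reusing the tool shape from the earlier example; the exact contents of LanguageModelCreateCoreOptions are still open, so the option name and placement here are assumptions.

const session = await LanguageModel.create({
  tools: [
    {
      name: "weather",
      description: "Get the weather in a location",
      inputSchema: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "The city and state, e.g. San Francisco, CA"
          }
        },
        required: ["location"]
      },
      async execute({ location }) {
        const res = await fetch(
          "https://weatherapi.example/?location=" + encodeURIComponent(location)
        );
        return JSON.stringify(await res.json());
      }
    }
  ]
});

// The declared tools now apply to every turn in the session.
const result = await session.prompt("What is the weather like in San Francisco?");

Declaring tools at creation time also fits the point about models such as Phi4 that expect the tool list in the system prompt, since the implementation can render it there once per session.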

@nico-martin

I have created quite a lot of demos for tool calling with small LLMs.
Most of them don't support tool calling directly, so I built my own tooling around the LLM, where the tools are described in the system prompt and I then parse the response for potential tool calls.

Here are some of my findings:

  1. Yes, even small LLMs (like Gemini Nano, Gemma2 2B, Qwen3 1.7B) are actually not that bad at tool calling. It always depends on how much context is already in the conversation, how many functions you have, and how well your functions are described, but for simple use cases it does work.
  2. JSON does not work very well. If you force the LLM in the system prompt to always return the same structure, it often forgets about it by the second or third answer, or it just outputs invalid JSON (a semicolon here, a comma there, etc.). That's why I am using XML for the tool calling.

I just added Prompt API support to my LLM tool-calling demo. Feel free to try it yourself:
https://llm-tool-calling.nico.dev/ (click on the model name below the input; in the Settings modal, under Model, search for "Prompt API")

It works quite nicely, even with two tool calls in one prompt:

[Screenshot: demo output showing two tool calls handled in one prompt]
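
For anyone who wants to reproduce a similar userland setup, a minimal sketch of the describe-in-the-system-prompt-and-parse approach; the <tool_call> XML convention is hypothetical and may differ from what the demo actually uses.

const systemPrompt =
  "You can call tools by replying with\n" +
  '<tool_call name="TOOL_NAME"><arg name="ARG_NAME">VALUE</arg></tool_call>\n' +
  "Available tools: weather(location), calculator(expression).";

// Scan a model reply for tool calls and return them as { name, arguments } objects.
function parseToolCalls(reply) {
  const calls = [];
  const callRe = /<tool_call name="([^"]+)">([\s\S]*?)<\/tool_call>/g;
  const argRe = /<arg name="([^"]+)">([\s\S]*?)<\/arg>/g;
  for (const [, name, body] of reply.matchAll(callRe)) {
    const args = {};
    for (const [, argName, value] of body.matchAll(argRe)) {
      args[argName] = value.trim();
    }
    calls.push({ name, arguments: args });
  }
  return calls;
}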
