tool use without auto execution by jingyun19 · Pull Request #162 · webmachinelearning/prompt-api

jingyun19 · 2025-11-19T18:02:24Z

Update explainer and spec to support tool use functionalities without automatic execution.

Explainer: added an example and explained how to make tool calls
Spec: reflect IDL changes in https://chromium-review.googlesource.com/c/chromium/src/+/7092943

Preview | Diff

Added detailed explanations for tool use modes, including examples for open loop and closed loop execution.

Updated README to clarify tool-call and tool-result usage.

Added new types and enums for tool calls and responses.

jingyun19 · 2025-11-19T18:06:50Z

cc @FrankLi-MSFT @sushraja-msft @reillyeon @michaelwasserman

tomayac

Tried to make the code samples more readable and correct. Maybe consider running them all through a tool like prettier, which catches typos like missing commas or parentheses.

As general feedback, could the explainer outline why developers would choose closed vs. open?

tomayac · 2025-11-26T10:18:08Z

+await session.append([
+    {role: "user", content: "What is the weather in Seattle?"},
+    {role: "tool-call", content: {type: "tool-call", value: {callID:" get_weather_1", name: "get_weather", arguments: {location:"Seattle"}}},
+    {role: "tool-result", content:  {type: "tool-response", value: {callID: "get_weather_1", name: "get_weather", result: [{type:"object", value: {temperature: "55F", humidity: "67%"}}]}},
+    {role: "assistant", content: "The temperature in Seattle is 55F and humidity is 67%"},
+]);


Suggested change

await session.append([

{role: "user", content: "What is the weather in Seattle?"},

{role: "tool-call", content: {type: "tool-call", value: {callID:" get_weather_1", name: "get_weather", arguments: {location:"Seattle"}}},

{role: "tool-result", content: {type: "tool-response", value: {callID: "get_weather_1", name: "get_weather", result: [{type:"object", value: {temperature: "55F", humidity: "67%"}}]}},

{role: "assistant", content: "The temperature in Seattle is 55F and humidity is 67%"},

]);

await session.append([

{ role: "user", content: "What is the weather in Seattle?" },

{

role: "tool-call",

content: {

type: "tool-call",

value: {

callID: " get_weather_1",

name: "get_weather",

arguments: { location: "Seattle" },

},

},

},

{

role: "tool-result",

content: {

type: "tool-response",

value: {

callID: "get_weather_1",

name: "get_weather",

result: [

{ type: "object", value: { temperature: "55F", humidity: "67%" } },

],

},

},

},

{

role: "assistant",

content: "The temperature in Seattle is 55F and humidity is 67%",

},

]);

tomayac · 2025-11-26T10:18:40Z

+]);
+```
+
+Note that "role" and "type" now supports "tool-call" and "tool-result". 


Suggested change

Note that "role" and "type" now supports "tool-call" and "tool-result".

Note that `"role"` and `"type"` now support `"tool-call"` and `"tool-result"`.

tomayac · 2025-11-26T10:19:47Z

+sessionOptions = structuredClone(options);
+sessionOptions.expectedOutputs.push(["tool-call"]);
+session = await LanguageModel.create(sessionOptions);
+
+var result = await session.prompt("What is the weather in Seattle?");
+if (result.type=="tool-call") {
+  if (result.name == "get_weather") {
+    const tool_result = getWeather(result.arguments.location);
+    result = session.prompt([{role:"tool-result", content: {type: "tool-result", value: {callId: result.callID, name: result.name, result: [{type:"object", value: tool_result}]}}}])
+  }
+} else{
+  console.log(result)
+}


Suggested change

sessionOptions = structuredClone(options);

sessionOptions.expectedOutputs.push(["tool-call"]);

session = await LanguageModel.create(sessionOptions);

var result = await session.prompt("What is the weather in Seattle?");

if (result.type=="tool-call") {

if (result.name == "get_weather") {

const tool_result = getWeather(result.arguments.location);

result = session.prompt([{role:"tool-result", content: {type: "tool-result", value: {callId: result.callID, name: result.name, result: [{type:"object", value: tool_result}]}}}])

}

} else{

console.log(result)

}

sessionOptions = structuredClone(options);

sessionOptions.expectedOutputs.push(["tool-call"]);

session = await LanguageModel.create(sessionOptions);

var result = await session.prompt("What is the weather in Seattle?");

if (result.type == "tool-call") {

if (result.name == "get_weather") {

const tool_result = getWeather(result.arguments.location);

result = session.prompt([

{

role: "tool-result",

content: {

type: "tool-result",

value: {

callId: result.callID,

name: result.name,

result: [{ type: "object", value: tool_result }],

},

},

},

]);

}

} else {

console.log(result);

}

tomayac · 2025-11-26T10:20:13Z

+
+#### Closed Loop:
+
+To enable automatic execution, add a `execute` function for each tool's implementation, and add a `toolUseConfig` to indicate that execution is enabled and pose a max number of tool calls invoked in a single session generation:


Suggested change

To enable automatic execution, add a `execute` function for each tool's implementation, and add a `toolUseConfig` to indicate that execution is enabled and pose a max number of tool calls invoked in a single session generation:

To enable automatic execution, add an `execute` function for each tool's implementation, and add a `toolUseConfig` to indicate that execution is enabled and pose a max number of tool calls invoked in a single session generation:

tomayac · 2025-11-26T10:22:08Z

+sessionOptions.expectedOutputs.push(["tool-call"]);
+session = await LanguageModel.create(sessionOptions);
+
+var result = await session.prompt("What is the weather in Seattle?");


Suggested change

var result = await session.prompt("What is the weather in Seattle?");

let result = await session.prompt("What is the weather in Seattle?");

tomayac · 2025-11-26T10:22:25Z

+Example:
+
+```js
+sessionOptions = structuredClone(options);


Suggested change

sessionOptions = structuredClone(options);

const sessionOptions = structuredClone(options);

tomayac · 2025-11-26T10:22:47Z

+```js
+sessionOptions = structuredClone(options);
+sessionOptions.expectedOutputs.push(["tool-call"]);
+session = await LanguageModel.create(sessionOptions);


Suggested change

session = await LanguageModel.create(sessionOptions);

const session = await LanguageModel.create(sessionOptions);

Co-authored-by: Thomas Steiner <tomac@google.com>

Added explanation about automatic execution and constraints in planner loop.

jingyun19 · 2025-12-01T17:30:30Z

I added a new section to describe use cases where open loop is preferred. cc @tomayac

reillyeon · 2025-12-11T01:33:53Z

I hadn't previously considered the context compression use case. That's interesting and motivating to enable developers to manipulate the conversation at this low level.

nico-martin

I think this implementation has a few weaknesses when it comes to distinguishing between a Message and a Message.Content element:
Message: Can have a specific role (whether it comes from the user, the assistant, or a tool call); it essentially describes the sender.
Message.Content: There can be multiple instances per Message; it describes the type of content.
I tried to make this concrete with a couple of comments.

nico-martin · 2026-06-23T06:39:41Z

+enum LanguageModelMessageRole { "system", "user", "assistant", "tool-call", "tool-response" };

-enum LanguageModelMessageType { "text", "image", "audio" };
+enum LanguageModelMessageType { "text", "image", "audio","tool-call", "tool-response"  };


In my opinion, the MessageType should describe the type, or in other words "the modality".
MessageRole: Where does the message come from (who is the sender?)
MessageType: What is the content type of the message (text, image, audio)

Is there a specific reason why I should return a message where the LanguageModelMessageRole = tool-call AND the LanguageModelMessageType = tool-call?

I remembered we created "tool-call" for LanguageModelMessageRole so that different browsers can easily format the role for its model-specific implementation. E.g, empty string, or "" or some different control tokens.

If only MessageType supports "tool-call" type and we assumed the role to be "assistant", then the implementation might becomes more complicated than declaring it anyhow and let implementation omit it.

It's also not without precedence that separate roles are defined for tool call and tool response: https://huggingface.co/Trelis/openchat_3.5-function-calling-v3

nico-martin · 2026-06-23T06:43:58Z

+// The definitions of `LanguageModelToolCall` and `LanguageModelToolResponse` values
+enum LanguageModelToolResultType { "text", "image", "audio", "object" };
+
+dictionary LanguageModelToolResultContent {


In the end, the result of a tool call is nothing else than a new message in the conversation. But now the message is not coming from the assistant or the user, but from a tool. So I think we should align this with the LanguageModelMessageContent.

nico-martin · 2026-06-23T06:58:57Z

+```js
+await session.append([
+    {role: "user", content: "What is the weather in Seattle?"},
+    {role: "tool-call", content: {type: "tool-call", value: {callID:" get_weather_1", name: "get_weather", arguments: {location:"Seattle"}}},


From a technical point of view, tool-call is not a message in the conversation. assistant would be the role and the content could then include tool-call tokens.
So in my opinion the assistant message should always return the actual generated tokens, but for convenience (and cross-model compatibility) also the parsed toolcalls (btw. thats also the same for thinking)

{ role: "assistant", content: "...", toolCalls: [ { callId: "get_weather_1", name: "get_weather", arguments: { location: "Seattle" } } ] }

It's not clear what the client can do with the actual tokens if they have the parsed tool calls anyways. Also, the actual tokens and parsers for function calling are browser- and model-specific implementations, so I'm not sure if all browsers want to expose to client.

(How we expose thinking is a different topic that worth its separate discussion)

nico-martin · 2026-06-23T07:05:08Z

+await session.append([
+    {role: "user", content: "What is the weather in Seattle?"},
+    {role: "tool-call", content: {type: "tool-call", value: {callID:" get_weather_1", name: "get_weather", arguments: {location:"Seattle"}}},
+    {role: "tool-result", content:  {type: "tool-response", value: {callID: "get_weather_1", name: "get_weather", result: [{type:"object", value: {temperature: "55F", humidity: "67%"}}]}},


As mentioned above, role: "tool-result"+ conten.type: "tool-response"seems unnecessary.
Whit this structure we have a role tool-result and then, just like all the other messages we have a content array where each element can have a different type, depending on what the tools wants to return.

{ role: "tool-result", content: [ { callId: "get_weather_1", type: "object", value: [ { type: "object", value: { temperature: "55F", humidity: "67%" } } ] } ] }

nico-martin · 2026-06-23T07:13:45Z

+
+#### Do I need auto execution?
+
+In general, automatic execution is suitable for use cases where the model quality is good enough via prompt tuning. That can either mean you are tolerable for certain mistakes that the model makes when making tool calls, or the task is simple enough for the model to handle (e.g, just a few distinct tools, short and clean tool output, short context window, etc)


Not sure if I get this right. For the user, both, the closed and the open loop, are executed automatically. The only difference is that in an open loop, the developer has to execute the tools and start the next generation, while in the closed loop the loop will run without any extra steps.
Also if I dont want to have "automatic execution" as a developer, I could always intercept in the execute function. I would even argue for the wohle LLM conversation it is better to intercept a tool execution inside the execute function. Because then it allows you to return a reason why the tool was not executed intead of letting the model generate the tool call and then it does not know why it was not executed.

jingyun19 added 3 commits November 18, 2025 14:16

Enhance README with tool use modes and examples

14d59fc

Added detailed explanations for tool use modes, including examples for open loop and closed loop execution.

Clarify tool-call and tool-result in README

5379da1

Updated README to clarify tool-call and tool-result usage.

Enhance LanguageModel with tool call definitions

8e076f7

Added new types and enums for tool calls and responses.

tomayac reviewed Nov 26, 2025

View reviewed changes

jingyun19 and others added 5 commits November 26, 2025 09:08

Update README.md

ed78c12

Co-authored-by: Thomas Steiner <tomac@google.com>

Update README.md

5b11a1b

Co-authored-by: Thomas Steiner <tomac@google.com>

Merge branch 'webmachinelearning:main' into patch-1

cb8d807

add open vs. loop use case diff

5a0cbef

Enhance README with planner loop constraints details

81e75af

Added explanation about automatic execution and constraints in planner loop.

franzellekirk01-del approved these changes Jun 2, 2026

View reviewed changes

tomayac mentioned this pull request Jun 23, 2026

Tool calling: return types? #138

Open

nico-martin reviewed Jun 23, 2026

View reviewed changes

	Note that "role" and "type" now supports "tool-call" and "tool-result".
	Note that `"role"` and `"type"` now support `"tool-call"` and `"tool-result"`.


		#### Closed Loop:

		To enable automatic execution, add a `execute` function for each tool's implementation, and add a `toolUseConfig` to indicate that execution is enabled and pose a max number of tool calls invoked in a single session generation:

	To enable automatic execution, add a `execute` function for each tool's implementation, and add a `toolUseConfig` to indicate that execution is enabled and pose a max number of tool calls invoked in a single session generation:
	To enable automatic execution, add an `execute` function for each tool's implementation, and add a `toolUseConfig` to indicate that execution is enabled and pose a max number of tool calls invoked in a single session generation:

	var result = await session.prompt("What is the weather in Seattle?");
	let result = await session.prompt("What is the weather in Seattle?");

	sessionOptions = structuredClone(options);
	const sessionOptions = structuredClone(options);

	session = await LanguageModel.create(sessionOptions);
	const session = await LanguageModel.create(sessionOptions);


		#### Do I need auto execution?

		In general, automatic execution is suitable for use cases where the model quality is good enough via prompt tuning. That can either mean you are tolerable for certain mistakes that the model makes when making tool calls, or the task is simple enough for the model to handle (e.g, just a few distinct tools, short and clean tool output, short context window, etc)

Uh oh!

Conversation

jingyun19 commented Nov 19, 2025 • edited by pr-preview Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jingyun19 commented Nov 19, 2025

Uh oh!

tomayac left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jingyun19 commented Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

reillyeon commented Dec 11, 2025

Uh oh!

nico-martin left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

jingyun19 commented Nov 19, 2025 •

edited by pr-preview Bot

Loading

jingyun19 commented Dec 1, 2025 •

edited

Loading