Thread

Vadim Larin 1:36 PM
Hey! I noticed that when a tool result policy is set to blocked, the agent can confuse result blocking with call blocking. I could not find this mentioned in GH; looks like a bug.

17 replies
Vadim Larin 1:37 PM
I also checked the code, and it seems the model-facing tool result is replaced with [Content blocked by policy: ...], without preserving that the tool call was actually executed
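(The behavior being reported can be sketched roughly as follows. This is a hypothetical illustration, not Archestra's actual code; the names `ToolOutcome` and `apply_result_policy` are made up, and only the `[Content blocked by policy: ...]` string comes from the thread.)

```python
# Hypothetical sketch of the reported bug: when a result policy blocks a tool
# result, the replacement text drops the fact that the call actually ran, so
# the model cannot distinguish "call blocked" from "result blocked after success".
from dataclasses import dataclass

@dataclass
class ToolOutcome:
    executed: bool   # whether the tool call actually ran
    result: str      # raw tool result
    blocked: bool    # result-policy verdict on the result

def apply_result_policy(outcome: ToolOutcome) -> str:
    """What the model sees in place of the tool result (bug as reported)."""
    if outcome.blocked:
        # Execution status is lost here -- this is the confusing part.
        return "[Content blocked by policy: ...]"
    return outcome.result

def apply_result_policy_fixed(outcome: ToolOutcome) -> str:
    """One possible fix: preserve the execution status in the placeholder."""
    if outcome.blocked:
        status = ("the tool call DID execute" if outcome.executed
                  else "the tool call was not executed")
        return f"[Result blocked by policy; {status}]"
    return outcome.result
```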
Matvey Kukuy (archestra team) 1:59 PM
If the tool call is blocked, it shouldn't be executed. If that's what's happening, it's a critical bug!
Matvey Kukuy (archestra team) 1:59 PM
Could you please double-check?
Vadim Larin 2:08 PM
The call required approval, as it was shown on screen.
Vadim Larin 2:22 PM
to clarify
Policy - issue_write
Call policy - Allow or require approval
Call result policy - Blocked
The call was completed and the issue was created, but the agent replied that the issue action was blocked by policy instead of saying that only the result was blocked. So the user may never understand that the tool call actually succeeded, and the response is misleading.
joey (archestra team) 2:24 PM
hi 👋 out of curiosity, what model are you using here in that chat session?
It looks like the tool call succeeded - if you visit the URL in the result, does the GitHub issue exist?
Vadim Larin 2:26 PM
I used gpt-5.4-mini
Yes, the GH issue exists. That is exactly the confusing part, the tool call succeeded and created the issue but the final agent response says it was blocked by policy. So from the user side it looks like the action did not happen, while only the result was blocked after execution.
joey (archestra team) 2:27 PM
I think there might be a few concepts getting mixed up here.
The "sensitive context below" red line is related to "tool result policies" (see docs here). The default policy for tools is:
Sensitive: The result is treated as sensitive or risky context for later decisions
Once the context has "become sensitive", it can impact subsequent tool calls (depending on the tool call policies (docs) of the tool being used).
does that clarify things a bit?
joey (archestra team) 2:29 PM
For example - let's say you have an agent with two tools:
read_salaries
◦ tool result policy = sensitive
send_emails
◦ tool call policy = require approval
if the agent were to use the read_salaries tool, that would mark the agent's context as "sensitive" - which would now mean the send_emails tool requires human approval to execute (you could also configure it to be blocked, or to allow it only in certain situations, e.g. it can only send emails to @mycompany.com)
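(The escalation described in this example can be sketched as follows. This is an illustrative model of the described behavior, not Archestra's implementation; the policy names and the `run_session` helper are invented for the sketch, while the tool names `read_salaries` and `send_emails` come from the message above.)

```python
# Hypothetical sketch: a "sensitive" tool result taints the session context,
# which escalates a later tool call from "allow" to "require approval".
RESULT_POLICIES = {"read_salaries": "sensitive"}
CALL_POLICIES = {"send_emails": "require_approval_if_sensitive"}

def run_session(calls: list[str]) -> list[tuple[str, str]]:
    """Return the (tool, decision) made for each call, in order."""
    context_sensitive = False
    decisions = []
    for tool in calls:
        policy = CALL_POLICIES.get(tool, "allow")
        if policy == "require_approval_if_sensitive" and context_sensitive:
            decisions.append((tool, "require_approval"))
        else:
            decisions.append((tool, "allow"))
        # After the call, its result policy may mark the context sensitive.
        if RESULT_POLICIES.get(tool) == "sensitive":
            context_sensitive = True
    return decisions
```

With this sketch, `run_session(["send_emails"])` allows the email, while `run_session(["read_salaries", "send_emails"])` requires approval for the second call, because the first call's sensitive result tainted the context.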
joey (archestra team) 2:31 PM
what happens if you try that same prompt, with the same agent, in a fresh session using gpt-5.4?
I think the -mini model is maybe getting confused by the presence of the tool call args.body literally saying:
issue_write - tool call policy - require approval\ntool result policy - blocked
no tool calls were actually blocked (these would be represented very explicitly in the chat UI)
Vadim Larin 2:31 PM
Yes, I understand that.
My point is narrower: in this case the GitHub issue was already created, but the final response says it was blocked by policy. That makes it look like the action did not happen, while only the result/context was blocked after execution.
joey (archestra team) 2:32 PM
right, but this is the model's literal output
joey (archestra team) 2:33 PM
try again with gpt-5.4 and let me know what you see 🙂
Vadim Larin 2:33 PM
Right, but the system knows the real tool call status. If the call succeeded and only the result was blocked, the UI/response should make that explicit, otherwise the model output is misleading.
Vadim Larin 2:34 PM
with gpt-5.4 it's the same