Hey! I noticed that when a tool result policy is blocked, the agent can confuse result blocking with call blocking. I could not find this mentioned on GitHub; it looks like a bug
I also checked the code, and it seems the model-facing tool result is replaced with `[Content blocked by policy: ...]`, without preserving that the tool call was actually executed
Matvey Kukuy (archestra team) —
If the tool call is blocked, it shouldn't be executed. If it's so, it's a critical bug!
Matvey Kukuy (archestra team) —
Could you please double-check?
Vadim Larin —
The call was set to require approval, as shown on the screen
Vadim Larin —
to clarify
Policy - issue_write
Call policy - Allow or require approval
Call result policy - Blocked
The call was completed and the issue was created, but the agent replied that the issue action was blocked by policy instead of saying that only the result was blocked. So the user may never understand that the tool call actually succeeded, and the response is misleading
joey (archestra team) —
hi 👋 out of curiosity, what model are you using here in that chat session?
The tool call succeeded it looks like - if you visit `url` in the result, does the github issue exist?
Vadim Larin —
I used gpt-5.4-mini
Yes, the GH issue exists. That is exactly the confusing part, the tool call succeeded and created the issue but the final agent response says it was blocked by policy. So from the user side it looks like the action did not happen, while only the result was blocked after execution.
joey (archestra team) —
I think there might be a few concepts getting mixed up here.
The "sensitive context below" red line is related to "tool result policies" (see docs here). The _default_ policy for tools is:
> *Sensitive*: The result is treated as sensitive or risky context for later decisions
Once the context has "become sensitive", it _can_ impact subsequent tool calls (depending on the tool call policies (docs) of the tool being used).
does that clarify things a bit?
joey (archestra team) —
For example - let's say you have an agent with two tools:
• `read_salaries`
◦ tool result policy = sensitive
• `send_emails`
◦ tool call policy = require approval
If the agent were to use the `read_salaries` tool, that would mark the agent's context as "sensitive", which means the `send_emails` tool would then require human approval to execute. (You could also configure it to be blocked, or to allow it only in certain situations, i.e. it can only send emails to `@mycompany.com`.)
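To make the interaction concrete, here is a minimal sketch of that escalation logic. All type and function names here are illustrative assumptions, not Archestra's actual API:

```typescript
// Hypothetical policy shapes -- illustrative only, not Archestra's real types.
type ToolResultPolicy = "allow" | "sensitive" | "blocked";
type ToolCallPolicy = "allow" | "require_approval" | "blocked";

interface ToolPolicies {
  resultPolicy?: ToolResultPolicy; // how the tool's *output* is treated
  callPolicy?: ToolCallPolicy;     // whether the tool may *execute*
}

const agentTools: Record<string, ToolPolicies> = {
  read_salaries: { resultPolicy: "sensitive" },
  send_emails: { callPolicy: "require_approval" },
};

// Once a "sensitive" result has entered the context, tool calls that would
// otherwise be allowed get escalated to require human approval.
function effectiveCallPolicy(
  tool: string,
  contextIsSensitive: boolean
): ToolCallPolicy {
  const configured = agentTools[tool]?.callPolicy ?? "allow";
  if (contextIsSensitive && configured === "allow") {
    return "require_approval";
  }
  return configured;
}
```

So `send_emails` always requires approval here, and even a tool with no explicit call policy would be escalated once the context turns sensitive.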
joey (archestra team) —
what happens if you try that same prompt, with the same agent, in a fresh session using `gpt-5.4`?
I think the `-mini` model is maybe getting confused by the presence of the tool call `args.body` literally saying:
> issue_write - tool call policy - require approval\ntool result policy - blocked
no tool calls were actually blocked (these would be very explicitly represented in the chat ui)
Vadim Larin —
Yes, I understand that.
My point is narrower: in this case the GitHub issue was already created, but the final response says it was blocked by policy. That makes it look like the action did not happen, while only the result/context was blocked after execution.
joey (archestra team) —
right but this is the model's literal output
joey (archestra team) —
try again with `gpt-5.4` and let me know what you see 🙂
Vadim Larin —
Right, but the system knows the real tool call status. If the call succeeded and only the result was blocked, the UI/response should make that explicit, otherwise the model output is misleading.
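For illustration, one way to keep that status explicit is to redact the blocked result while preserving an execution flag. This is only a sketch; the message shape and field names are my assumptions, not Archestra's actual types:

```typescript
// Hypothetical model-facing message shape -- field names are assumptions.
interface ToolResultMessage {
  toolName: string;
  executed: boolean;        // did the call actually run?
  content: string;          // model-facing result text
  blockedByPolicy?: string; // reason, if the *result* was blocked
}

// Redact a blocked result without erasing the fact that the call succeeded,
// so neither the model nor the user concludes the action never happened.
function redactBlockedResult(
  msg: ToolResultMessage,
  reason: string
): ToolResultMessage {
  return {
    ...msg,
    content: `[Tool call executed successfully; result withheld by policy: ${reason}]`,
    blockedByPolicy: reason,
  };
}
```

The key point is that `executed` stays `true` and the replacement text says the call ran, so only the result content is withheld.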
My name is Abdellah Bouarguan. I've been diving into Archestra recently and find the project and what you're building incredibly interesting!
I would love to work on the UI state desync and concurrency vulnerability (Issue #4030).
---
About me
Computer Science engineering student at ENSA Tétouan
Massive Linux enthusiast (been using it since I was 9)
~6 years of self-taught full-stack development (pre-AI era)
---
Proposed approach
I’m looking to tackle this issue not just for the bounty, but as a practical architecture assignment for my engineering program.
My current plan to fix the double-execution race condition is:
Implement a Postgres-backed distributed lock
- Using an `ON CONFLICT` write-lock
- Located in `chat-mcp-client.ts`
Add a state-cleansing step in `normalizeChatMessages.ts`
- Flip abandoned `approval-requested` states → `output-denied`
- Before DB persistence
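To sketch what I have in mind (all names and shapes below are my assumptions, not the actual types in the repo):

```typescript
// Step 1 sketch: win-or-lose lock acquisition via INSERT ... ON CONFLICT.
// A returned row means this worker won the lock; no row means another
// worker already holds it. Table name is hypothetical.
const ACQUIRE_LOCK_SQL = `
  INSERT INTO tool_call_locks (tool_call_id, locked_at)
  VALUES ($1, now())
  ON CONFLICT (tool_call_id) DO NOTHING
  RETURNING tool_call_id;
`;

// Step 2 sketch: state cleansing before persistence. The real message
// types in normalizeChatMessages.ts will differ.
interface ChatMessage {
  id: string;
  toolState?: "approval-requested" | "output-available" | "output-denied";
}

// Flip abandoned approval requests to "output-denied" so a reloaded
// session cannot re-trigger (double-execute) the tool call.
function cleanseAbandonedApprovals(messages: ChatMessage[]): ChatMessage[] {
  return messages.map((m) =>
    m.toolState === "approval-requested"
      ? { ...m, toolState: "output-denied" as const }
      : m
  );
}
```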
---
Since the contributing docs highly recommend syncing with the core team first, I wanted to drop in and say hi!
Do you have any specific guides, architectural design choices, or feedback on this approach before I officially post my "/attempt" claim and spin up a PR?
Hey everyone! Just joined and set up Archestra locally — really impressive project.
I've been going through the open issues and noticed the swap_agent bug in Slack/MS Teams (#4011). That caught my eye because I recently built a multi-channel AI bot with RAG and chatops integrations (Telegram + Discord), so I'm comfortable in that space.
Are there areas, especially around agent triggers or chatops, where the community could use more testing or bug reports? Happy to dig in and contribute properly.