Thread

Vadim Larin 3:33 PM
found a bypass for Tool Result Policy = Blocked

4 replies
Vadim Larin 3:33 PM
Repro: tool read_issue with Result Policy = Blocked. If the agent setting "Treat context as sensitive from the start of chat" is set to true, the raw tool call result gets into the model-facing LLM request even though Result Policy = Blocked.
In code, at guardrails/trusted-data.ts line 68, the method returns early when this flag is set and skips the actual Tool Result Policy evaluation:
// If agent configured to consider context untrusted from the beginning,
// mark context as untrusted immediately and skip evaluation
if (considerContextUntrusted) {
  logger.debug(
    { agentId },
    "[trustedData] evaluateIfContextIsTrusted: context marked untrusted by agent config",
  );
  return {
    toolResultUpdates: {},
    contextIsTrusted: false,
    usedDualLlm: false,
    dualLlmAnalyses: [],
    unsafeContextBoundary: {
      kind: "preexisting_untrusted",
      reason:
        initialUntrustedReason ??
        UNSAFE_CONTEXT_BOUNDARY_REASON.agentConfiguredUntrusted,
    },
  };
}
Without this flag, evaluation reaches line 185, where the result is replaced:
if (isBlocked) {
  // Tool result is blocked - replace with blocked message
  toolResultUpdates[toolCallId] =
    `[Content blocked by policy${reason ? `: ${reason}` : ""}]`;
  toolResultIsTrusted = false;
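To make the ordering problem concrete, here is a minimal sketch of one possible fix direction: apply Blocked policies first, and let the flag only affect the trust decision. This is a simplified model, not the code in guardrails/trusted-data.ts. The names `ToolResult`, `Evaluation`, `blockedMessage`, and `evaluateSketch` are hypothetical, and the real function returns more fields than shown here.

```typescript
// Hypothetical, simplified model of the evaluation described above.
// None of these names come from guardrails/trusted-data.ts.
type ResultPolicy = "allowed" | "blocked";

interface ToolResult {
  toolCallId: string;
  content: string;
  policy: ResultPolicy;
  reason?: string;
}

interface Evaluation {
  toolResultUpdates: Record<string, string>;
  contextIsTrusted: boolean;
}

// Same message shape as the replacement at line 185 in the thread.
function blockedMessage(reason?: string): string {
  return `[Content blocked by policy${reason ? `: ${reason}` : ""}]`;
}

function evaluateSketch(
  results: ToolResult[],
  considerContextUntrusted: boolean,
): Evaluation {
  const toolResultUpdates: Record<string, string> = {};

  // The flag only decides the starting trust state. It no longer
  // short-circuits the policy loop below.
  let contextIsTrusted = !considerContextUntrusted;

  for (const result of results) {
    if (result.policy === "blocked") {
      // Blocked results are always replaced, regardless of the flag.
      toolResultUpdates[result.toolCallId] = blockedMessage(result.reason);
      contextIsTrusted = false;
    }
  }

  return { toolResultUpdates, contextIsTrusted };
}
```

With this ordering, calling `evaluateSketch` with the flag set to true still replaces the read_issue result with the blocked message, so the raw content never reaches the model-facing request.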
joey (archestra team) 4:29 PM
I saw that you opened a PR, will take a look tomorrow 🙏
👍1
Ildar Iskhakov (archestra team) 5:26 PM
thank you Vadim!
❤️1