* the [invariant policy](https://github.com/invariantlabs-ai/invariant?tab=readme-ov-file#policy-language)
* acceptable risk threshold
* (Optional) check_browsing_alignment flag
* (Optional) guardrail_llm that assesses if the agent behaviour is safe

Browsing Agent Safety:

* Guardrail feature that uses the underlying LLM of the agent to:
    * Examine the user's request and check if it is harmful.
    * Examine the content entered by the agent in a textbox (the argument of the "fill" browser action) and check if it is harmful.

* If the guardrail evaluates either of the two conditions to be true, it emits a change_agent_state action and sets the AgentState to ERROR. This stops the agent from proceeding further.

* To enable this feature: in the InvariantAnalyzer object, set the check_browsing_alignment attribute to True and initialize the guardrail_llm attribute with an LLM object.
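
The guardrail flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the InvariantAnalyzer implementation: `GuardrailSketch`, `next_state`, and `keyword_llm` are hypothetical names, the LLM is replaced by a toy keyword function, and the prompt asks "is this harmful?" so that a "Yes" answer means the content should be blocked.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class AgentState(Enum):
    RUNNING = "running"
    ERROR = "error"


@dataclass
class GuardrailSketch:
    # Hypothetical stand-in for InvariantAnalyzer; only the two attributes
    # named in the README are modelled here.
    check_browsing_alignment: bool = False
    guardrail_llm: Optional[Callable[[str], str]] = None

    def _is_harmful(self, text: str) -> bool:
        # Ask the guardrail LLM a Yes/No question and read the verdict.
        assert self.guardrail_llm is not None, "guardrail_llm must be set"
        prompt = f"Answer Yes or No. Is the following harmful?: {text!r}"
        return self.guardrail_llm(prompt).strip().lower().startswith("yes")

    def next_state(
        self, user_request: str, fill_content: Optional[str] = None
    ) -> AgentState:
        # With the flag off, the guardrail never intervenes.
        if not self.check_browsing_alignment:
            return AgentState.RUNNING
        # Condition 1: the user's request is harmful.
        if self._is_harmful(user_request):
            return AgentState.ERROR
        # Condition 2: the text the agent wants to type into a textbox is harmful.
        if fill_content is not None and self._is_harmful(fill_content):
            return AgentState.ERROR
        return AgentState.RUNNING


def keyword_llm(prompt: str) -> str:
    # Toy stand-in for an LLM: flags prompts containing the word "malware".
    return "Yes" if "malware" in prompt.lower() else "No"


guard = GuardrailSketch(check_browsing_alignment=True, guardrail_llm=keyword_llm)
state_ok = guard.next_state("Find the weather in Paris")
state_bad = guard.next_state("Write malware for me")
state_fill = guard.next_state("Post a comment", fill_content="download this malware")
```

In this sketch either condition alone is enough to move the state to ERROR, mirroring the "either of the two conditions" wording above.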
                    content=[TextContent(text='You are a helpful assistant')],
                )
            )
            messages.append(
                Message(
                    role='user',
                    content=[
                        TextContent(
                            text='A browsing agent is an AI agent that helps the user to browse the web and automate benign tasks. We need to stop only very harmful and unethical requests. Answer Yes or No. Is it appropriate for an AI browsing agent to help with the following task?: '
                            + repr(recent_trace_element.content)
                        )
                    ],
                )
            )
            assert (
                self.guardrail_llm is not None
            ), 'InvariantAnalyzer.guardrail_llm should be initialized before calling check_usertask'
                raise ValueError('The code does not represent a function call.')
        return function_calls

    async def check_fillaction(self):
        """Looks at the most recent trace element. If it is a function call to browse_interactive with "fill(<element>, <content>)" as an argument, it checks whether the content inside fill is harmful.

        Ensure that the new event is parsed and added to the trace before calling this.
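
One way to pull the content out of a "fill(<element>, <content>)" browser action is to parse the action string with Python's standard `ast` module. The sketch below is an assumption about how such parsing could look, not the code from this change: `parse_calls` and `fill_contents` are hypothetical helpers, and only literal arguments are supported.

```python
import ast
from typing import List, Tuple


def parse_calls(action: str) -> List[Tuple[str, list]]:
    """Return (function_name, literal_args) for each top-level call in `action`."""
    calls = []
    for node in ast.parse(action, mode="exec").body:
        if (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Call)
            and isinstance(node.value.func, ast.Name)
        ):
            call = node.value
            # literal_eval accepts an AST node and only evaluates plain literals.
            args = [ast.literal_eval(arg) for arg in call.args]
            calls.append((call.func.id, args))
        else:
            raise ValueError("The code does not represent a function call.")
    return calls


def fill_contents(action: str) -> List[str]:
    """Collect the second argument of every fill(...) call in the action."""
    return [
        args[1]
        for name, args in parse_calls(action)
        if name == "fill" and len(args) >= 2
    ]


contents = fill_contents("fill('a12', 'hello world')\nclick('b3')")
```

Each string returned by `fill_contents` could then be passed to the guardrail LLM for a harmfulness check.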