Agent Runtime
Updated 2026-01-24 - Runtime verification support for agent loops with assertion predicates and task completion tracking.
Think Jest for AI web agents. AgentRuntime provides Jest-style semantic assertions so agents can verify what changed on the page instead of guessing when they're "done".
Related topics
AgentRuntime stays small by design. These pages cover the runtime’s surrounding capabilities.
- Choose your path — sidecar Debugger vs SDK-driven loops
- Verifications — expanded .eventually() + scroll verification
- Lifecycle hooks — observe step boundaries
- Tabs — multi-tab workflows
- Downloads — capture + verify downloads
- Permissions — avoid permission bubbles; grant/clear mid-run
- Tool Registry — typed, auditable tools
- Filesystem tools — sandboxed file I/O for agents
- Evaluate JS — bounded page introspection
- CAPTCHA handling — detection + handler workflows with page-control hook
- Domain policies — allow/deny navigation guardrails
- Structured extraction — validated JSON from read()
- Predicate Debugger — attach verification to existing agents
Assertions at a Glance
Predicate provides Jest-style assertions for AI web agents.
Instead of trusting that an agent probably clicked the right thing, Predicate verifies outcomes using structured snapshots from a live, rendered browser (post-SPA hydration).
Predicate does not parse static HTML and does not rely on vision by default.
Rendered DOM, not HTML parsing
Predicate snapshots the rendered DOM + layout from a real browser after SPA hydration. This works reliably on JS-heavy applications where static HTML scraping fails.
What's included
Deterministic assertions
- State-aware checks: enabled, checked, value, expanded
- Semantic selectors (no brittle CSS/XPath)
- Clear failure reasons + nearest-match suggestions
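To make the idea of semantic selectors concrete, here is a minimal sketch of matching a query such as `role=button text~'Pay'` against snapshot elements. The `Element` class, the helper names, and the operator meanings (`=` for exact match, `~` for case-insensitive substring) are assumptions made for illustration, not the SDK's actual implementation.

```python
import re
import shlex
from dataclasses import dataclass

# One query term: field name, operator, value.
# Assumed semantics: "=" is an exact match, "~" is a case-insensitive substring match.
TERM = re.compile(r"(\w+)([=~])(.*)", re.DOTALL)


@dataclass
class Element:
    """Hypothetical stand-in for one element in a structured snapshot."""
    role: str
    text: str


def parse_query(query: str) -> list[tuple[str, str, str]]:
    """Split a query such as "role=button text~'Pay'" into (field, op, value) terms."""
    terms = []
    for token in shlex.split(query):  # shlex removes the quotes around values
        match = TERM.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized query term: {token!r}")
        terms.append((match.group(1), match.group(2), match.group(3)))
    return terms


def matches(element: Element, terms: list[tuple[str, str, str]]) -> bool:
    """Return True only if the element satisfies every term."""
    for field_name, op, value in terms:
        actual = str(getattr(element, field_name, ""))
        if op == "=" and actual != value:
            return False
        if op == "~" and value.lower() not in actual.lower():
            return False
    return True


def find_all(elements: list[Element], query: str) -> list[Element]:
    terms = parse_query(query)
    return [el for el in elements if matches(el, terms)]


elements = [
    Element("button", "Pay now"),
    Element("button", "Cancel"),
    Element("link", "Pay later"),
]
found = find_all(elements, "role=button text~'Pay'")
print(found)
```

Matching on role and visible text rather than on CSS paths is what keeps such a query stable when the page markup changes.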
Reliability & recovery
- .eventually() retries with bounded backoff
- Snapshot confidence + exhaustion detection
- Optional vision fallback (last resort)
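The retry behavior in the list above can be sketched as a bounded backoff loop. This is only an illustration of the technique: the function name `eventually`, its parameters, and the doubling schedule are assumptions, not the signature of the SDK's `.eventually()`.

```python
import time
from typing import Callable


def eventually(
    check: Callable[[], bool],
    max_attempts: int = 5,
    base_delay: float = 0.2,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry `check` until it passes or the attempt budget is spent."""
    delay = base_delay
    for attempt in range(max_attempts):
        if check():
            return True
        if attempt < max_attempts - 1:  # no sleep after the final attempt
            sleep(delay)
            delay = min(delay * 2, max_delay)  # double the wait, capped at max_delay
    return False


# Usage: a check that passes on its third call; sleeps are recorded, not waited.
calls = {"n": 0}
delays: list[float] = []


def button_is_enabled() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


ok = eventually(button_is_enabled, sleep=delays.append)
print(ok, delays)
```

Both the attempt count and the delay are capped, so a check that never passes fails in bounded time instead of hanging the agent loop.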
Local-first by default
With structured snapshots, 3B–14B local models are viable. Larger models improve planning and recovery — not DOM parsing.
Debugging Agent Failures
When assertions fail, the Failure Artifact Buffer automatically captures video clips, snapshots, and metadata for post-mortem analysis. Review artifacts locally or upload to Predicate Studio for visual debugging.
CAPTCHA Handling
Predicate detects CAPTCHAs but does not solve them, by design. The SDK provides hooks to integrate your preferred resolution strategy: human-in-the-loop, external solvers, or vision-only verification. Learn how to configure CAPTCHA policies and handlers.
How it works (high level)
- Snapshot the rendered DOM + layout from a live browser
- Execute actions deterministically
- Verify outcomes with pass/fail assertions
- Retry or escalate only when the page is unstable
Vision models are used only if structural signals are exhausted — never by default.
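The flow above can be sketched as a small verification routine. The `Snapshot` fields, the confidence threshold, and the function names here are hypothetical. The point is that the same assertion runs whether the snapshot came from page structure or from the vision fallback.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Snapshot:
    url: str
    confidence: float  # hypothetical stability score in [0, 1]


def verify_step(
    take_snapshot: Callable[[], Snapshot],
    vision_snapshot: Callable[[], Snapshot],
    assertion: Callable[[Snapshot], bool],
    max_retries: int = 3,
    min_confidence: float = 0.8,
) -> tuple[bool, str]:
    """Return (passed, perception source) for one step."""
    for _ in range(max_retries):
        snap = take_snapshot()
        if snap.confidence >= min_confidence:
            # Page is stable: a pass/fail answer is final, no retry.
            return assertion(snap), "structure"
    # Structural signals exhausted: escalate, but keep the assertion unchanged.
    return assertion(vision_snapshot()), "vision"


def on_checkout(snap: Snapshot) -> bool:
    return "checkout" in snap.url


unstable_then_stable = iter([
    Snapshot("https://example.com/checkout", 0.3),
    Snapshot("https://example.com/checkout", 0.95),
])
result = verify_step(
    lambda: next(unstable_then_stable),
    lambda: Snapshot("https://example.com/checkout", 1.0),
    on_checkout,
)
print(result)
```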
Vision is not the primary signal
Vision is used only after snapshot confidence is exhausted. Assertions remain invariant; only the perception layer changes.
Overview
The AgentRuntime class provides a thin runtime wrapper that combines:
- Browser session management via the BrowserBackend protocol
- Snapshot/query helpers
- Tracer for event emission
- Assertion/verification methods
It's designed for agent verification loops where you need to repeatedly take snapshots, execute actions, and verify results.
New in 2026-01-24: AgentRuntime is now framework-agnostic and accepts any browser implementing the BrowserBackend protocol. This allows integration with browser-use, Playwright, or any CDP-based browser through a single backend parameter.
Quick Start
With browser-use (Recommended)
from browser_use import BrowserSession, BrowserProfile
from predicate import get_extension_dir
from predicate.backends import BrowserUseAdapter
from predicate.agent_runtime import AgentRuntime
from predicate.tracing import Tracer, JsonlTraceSink
# 1. Setup browser-use with Predicate extension
With Builder/Pro/Teams/Enterprise Tier (Gateway Refinement)
# Same setup as above, but with API key for smart element ranking
runtime = AgentRuntime(
    backend=backend,
    tracer=tracer,
    predicate_api_key="sk_pro_xxxxx",  # Enables Gateway refinement
)
# Snapshots now use server-side ML ranking/filtering
await runtime.snapshot()
Integration with browser-use
If you're using browser-use, you have two common integration patterns:
- Backend integration (AgentRuntime-driven): use BrowserUseAdapter to turn a browser-use session into a Predicate BrowserBackend (shown above in Quick Start).
- Agent integration (browser-use-driven): use browser-use’s PredicateAgent when you want an agent loop with verification “wired in”.
PredicateAgent Integration
This runs step assertions and a done assertion, and emits verification data into traces.
from browser_use.integrations.sentience import PredicateAgent
from predicate.verification import url_contains, exists, all_of
# Define per-step assertions
step_assertions = [
    {
        "predicate": url_contains("example.com"),
        "label": "on_target_site",
        "required": True,
    },
    {
        "predicate": exists("role=button"),
        "label": "has_buttons",
    },
]
# Define task completion assertion
done_assertion = all_of(
    url_contains("/success"),
    exists("text~'Complete'"),
)
agent = PredicateAgent(
    task="Complete the checkout flow",
    llm=llm,
    browser_session=session,
    enable_verification=True,
    step_assertions=step_assertions,
    done_assertion=done_assertion,
    trace_dir="traces",
)
result = await agent.run()
print(result.get("verification"))
PredicateContext (Token‑Slasher)
PredicateContext builds compact, ranked DOM context blocks for LLMs (semantic geometry instead of full DOM + screenshots). Pair it with AgentRuntime when you want:
- Token-efficient “what the agent sees” for planning
- Deterministic pass/fail verification for progress tracking
For the full browser-use integration guide, see: Browser‑use Integration.
from predicate.backends import PredicateContext
ctx = PredicateContext(
    max_elements=60,
    show_overlay=True,
    top_element_selector={
        "by_importance": 60,
        "from_dominant_group": 15
AgentRuntime Class
Constructor
from predicate.agent_runtime import AgentRuntime
# New: Using backend parameter (recommended)
runtime = AgentRuntime(
    backend=backend,  # Any BrowserBackend implementation
    tracer=tracer,  # Tracer for event emission
    predicate_api_key="sk_...",  # Optional: Pro/Enterprise Gateway refinement
)
Parameters:
- backend - Any browser implementing the BrowserBackend protocol (browser-use, Playwright, CDP-based)
- tracer - Tracer for emitting verification events
- predicate_api_key (optional) - API key for Pro/Enterprise tier Gateway refinement
Backward Compatibility
For existing AsyncPredicateBrowser users, use the factory method:
from predicate import AsyncPredicateBrowser
from predicate.agent_runtime import AgentRuntime
async with AsyncPredicateBrowser() as browser:
    page = await browser.new_page()
    await page.goto
Properties
| Property | Type | Description |
|---|---|---|
| step_id / stepId | string \| null | Current step identifier |
| step_index / stepIndex | number | Current step index (0-based) |
| last_snapshot / lastSnapshot | Snapshot \| null | Most recent snapshot |
| is_task_done / isTaskDone | boolean | Whether task is complete |
BrowserBackend Protocol
The BrowserBackend protocol defines the minimal interface required for browser integration. Any browser framework can work with AgentRuntime by implementing this protocol.
Protocol Methods
| Method | Description |
|---|---|
| eval(expression) | Execute JavaScript in page context |
| call(fn, args) | Call JavaScript function with arguments |
| get_url() | Get current page URL |
| screenshot_png() | Capture viewport screenshot |
| mouse_click() | Perform mouse click action |
| mouse_move() | Move mouse to coordinates |
| wheel() | Scroll using mouse wheel |
| type_text() | Send keyboard input |
| wait_ready_state() | Wait for document ready state |
| refresh_page_info() | Get viewport and scroll info |
| get_layout_metrics() | Get page layout metrics |
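As an illustration of what implementing such a protocol can look like, the sketch below declares a structural interface with `typing.Protocol` and satisfies it with an in-memory fake. Only three of the methods from the table are included. The class names `MinimalBackend` and `FakeBackend`, the `async` signatures, and the return types are assumptions, not the SDK's definitions.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class MinimalBackend(Protocol):
    """Hypothetical subset of a backend interface (signatures are assumed)."""

    async def eval(self, expression: str) -> Any: ...
    async def get_url(self) -> str: ...
    async def screenshot_png(self) -> bytes: ...


class FakeBackend:
    """In-memory stand-in: no inheritance needed, only matching methods."""

    def __init__(self, url: str) -> None:
        self._url = url

    async def eval(self, expression: str) -> Any:
        return None  # a real backend would run the expression in the page

    async def get_url(self) -> str:
        return self._url

    async def screenshot_png(self) -> bytes:
        return b""  # a real backend would return PNG bytes


backend = FakeBackend("https://example.com/checkout")
current_url = asyncio.run(backend.get_url())
print(isinstance(backend, MinimalBackend), current_url)
```

Because the check is structural, any browser wrapper that exposes the expected methods qualifies without subclassing. A fake like this one is also handy for unit-testing assertion logic without launching a browser.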
Backend Implementations
The SDK provides built-in backend implementations:
| Backend | Use Case |
|---|---|
| BrowserUseAdapter | For browser-use integration via CDPBackend |
| PlaywrightBackend | For direct Playwright usage |
| CDPBackend | Low-level CDP-based browser control |
Core Methods
snapshot() - Take Page Snapshot
Takes a snapshot of the current page state. Updates lastSnapshot which is used as context for assertions.
# Take snapshot (required before element assertions)
snap = await runtime.snapshot()
print(f"Found {len(snap.elements)} elements")
Returns: Snapshot - Current page state
begin_step() / beginStep() - Start New Step
Begins a new verification step. Generates a new step ID, clears previous assertions, and increments step index.
# Begin a new step
step_id = runtime.begin_step("Navigate to checkout")
print(f"Step ID: {step_id}")
# Or with explicit step index
step_id = runtime.begin_step("Verify cart", step_index=
Parameters:
- goal (string) - Description of what this step aims to achieve
- step_index / stepIndex (number, optional) - Explicit step index (auto-increments if omitted)
Returns: string - Generated step ID
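The step bookkeeping described above (new ID, cleared assertions, incremented index) can be sketched with a small state object. `StepState` is a hypothetical class. The UUID-based step ID and the choice that the first auto-incremented step gets index 0 are assumptions for illustration only.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StepState:
    """Hypothetical model of per-step bookkeeping."""
    step_id: Optional[str] = None
    step_index: int = -1  # assumed start so the first step gets index 0
    goal: str = ""
    assertions: list[dict] = field(default_factory=list)

    def begin_step(self, goal: str, step_index: Optional[int] = None) -> str:
        self.assertions.clear()  # results from the previous step do not carry over
        if step_index is None:
            self.step_index += 1  # auto-increment when no explicit index is given
        else:
            self.step_index = step_index
        self.goal = goal
        self.step_id = str(uuid.uuid4())  # assumed ID format
        return self.step_id


state = StepState()
first_id = state.begin_step("Navigate to checkout")
state.assertions.append({"label": "on_checkout_page", "passed": True})
second_id = state.begin_step("Verify cart", step_index=5)
print(state.step_index, state.assertions)
```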
assert_() / assert() - Evaluate Assertion
Evaluates an assertion predicate against the current snapshot state. Results are accumulated for the step and emitted as verification events.
# URL assertion
url_ok = runtime.assert_(url_contains("checkout"), "on_checkout_page")
# Element assertion
has_btn = runtime.assert_(exists("role=button text~'Pay'"), "has_pay_button")
Parameters:
- predicate (Predicate) - Predicate function to evaluate
- label (string) - Human-readable label for this assertion
- required (boolean, optional) - If true, gates step success (default: false)
Returns: boolean - True if assertion passed
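To show how accumulated results and the `required` flag can interact, here is a minimal sketch. `AssertionLog` and its methods are hypothetical names. The sketch only models the gating idea: a failed optional assertion is recorded but does not block the step, while a failed required one does.

```python
from dataclasses import dataclass, field


@dataclass
class AssertionLog:
    """Hypothetical per-step accumulator for assertion results."""
    records: list[dict] = field(default_factory=list)

    def record(self, passed: bool, label: str, required: bool = False) -> bool:
        self.records.append({"label": label, "passed": passed, "required": required})
        return passed

    def all_passed(self) -> bool:
        return all(r["passed"] for r in self.records)

    def required_passed(self) -> bool:
        # Only assertions marked required gate step success.
        return all(r["passed"] for r in self.records if r["required"])


log = AssertionLog()
log.record(True, "on_checkout_page", required=True)
log.record(False, "has_promo_banner")  # optional: recorded, but not gating
print(log.all_passed(), log.required_passed())
```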
assert_done() / assertDone() - Assert Task Completion
Asserts task completion with a required assertion. When passed, marks the task as done.
# Check if task goal is achieved
if runtime.assert_done(exists("text~'Order Confirmed'"), "order_placed"):
    print("Order successfully placed!")
    # runtime.is_task_done is now True
Parameters:
- predicate (Predicate) - Predicate function to evaluate
- label (string) - Human-readable label for this assertion
Returns: boolean - True if task is complete
Predicate Helpers
URL Predicates
from predicate import url_matches, url_contains
# Regex match on URL
runtime.assert_(url_matches(r"https://.*\.example\.com"), "is_https")
# Substring match on URL
runtime.assert_(url_contains("checkout"), "on_checkout")
Element Predicates
from predicate import exists, not_exists, element_count
# Element exists (using query syntax)
runtime.assert_(exists("role=button text~'Submit'"), "has_submit")
# Element does not exist
runtime.assert_(not_exists("text~'Error'"),
Combinators
from predicate import all_of, any_of
# All conditions must pass
runtime.assert_(
    all_of(
        url_contains("checkout"),
        exists("role=button text~'Pay'"),
        not_exists("text~'Error'")
    ),
Custom Predicates
from predicate import custom
from predicate.verification import AssertContext, AssertOutcome
def my_predicate(ctx: AssertContext) -> AssertOutcome:
    # Custom logic using ctx.snapshot and ctx.url
    has_items = len(ctx
Assertion DSL (E, expect) — Quick Overview
For expressive, Jest-like assertions, Predicate also ships an Assertion DSL (E, expect, dominant-list queries). DSL expressions compile to predicates — you still pass them into assert/assert_ (or into .check(...).eventually()).
For the full DSL guide and examples, see: Jest‑Style Assertions.
from predicate.asserts import E, expect, in_dominant_list
runtime.assert_(
    expect(E(role="button", text_contains="Submit")).to_be_visible(),
    label="submit_visible"
)
Assertion Status Methods
Check Assertion Results
# Check if all assertions in current step passed
if runtime.all_assertions_passed():
    print("All assertions passed!")
# Check if all required assertions passed
if runtime.required_assertions_passed():
    print("All required assertions passed!")
Get Assertions for Step End
Retrieve accumulated assertions for inclusion in trace step_end events:
# Get assertions data for step_end event
assertions_data = runtime.get_assertions_for_step_end()
print(f"Assertions: {assertions_data['assertions']}")
print(f"Task done: {assertions_data.
Reset Task State
For multi-task runs, reset the task completion state:
# Reset task_done state for next task
runtime.reset_task_done()
Trace Integration
Assertions are automatically emitted as verification events to the tracer, making them visible in Studio timeline.
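As a rough illustration of event emission, the sketch below writes one verification event per line (JSONL) to any file-like sink. The function `emit_verification` is hypothetical, and the field layout is assumed to follow the verification event examples in this section. The SDK's tracer API may differ.

```python
import io
import json
from typing import Any, Optional, TextIO


def emit_verification(
    sink: TextIO,
    *,
    step_id: str,
    label: str,
    passed: bool,
    required: bool = False,
    reason: str = "",
    details: Optional[dict[str, Any]] = None,
    kind: str = "assert",
) -> dict[str, Any]:
    """Serialize one verification event as a single JSON line."""
    event = {
        "type": "verification",
        "data": {
            "kind": kind,
            "label": label,
            "passed": passed,
            "required": required,
            "reason": reason,
            "details": details or {},
        },
        "step_id": step_id,
    }
    sink.write(json.dumps(event) + "\n")
    return event


buffer = io.StringIO()  # stands in for a trace file
emit_verification(
    buffer,
    step_id="abc-123",
    label="on_checkout_page",
    passed=True,
    reason="URL contains 'checkout'",
    details={"url": "https://example.com/checkout"},
)
lines = buffer.getvalue().splitlines()
print(lines[0])
```

With one JSON object per line, a trace can be appended to during a run and read back line by line.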
Verification Event Schema
{
  "type": "verification",
  "data": {
    "kind": "assert",
    "label": "on_checkout_page",
    "passed": true,
    "required": false,
    "reason": "URL contains 'checkout'",
    "details": { "url": "https://example.com/checkout" }
  },
  "step_id": "abc-123"
}
Task Done Event
When assert_done() passes:
{
  "type": "verification",
  "data": {
    "kind": "task_done",
    "label": "order_placed",
    "passed": true
  },
  "step_id": "abc-123"
}
Complete Example
from predicate import (
    AgentRuntime,
    PredicateBrowser,
    all_of,
    exists,
    not_exists,
    url_contains,
    url_matches,
)
from predicate.tracer_factory import create_tracer
def
API Reference
Predicate Functions
| Predicate | Description |
|---|---|
| url_matches(pattern) / urlMatches(pattern) | URL matches regex pattern |
| url_contains(substring) / urlContains(substring) | URL contains substring |
| exists(query) | Element matching query exists in snapshot |
| not_exists(query) / notExists(query) | No element matching query exists |
| element_count(query, min, max) / elementCount(query, opts) | Element count within range |
| all_of(...predicates) / allOf(...predicates) | All predicates must pass |
| any_of(...predicates) / anyOf(...predicates) | Any predicate must pass |
| custom(fn) | Custom predicate function |
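To illustrate how combinators such as `all_of` and `any_of` can compose smaller predicates, here is a self-contained sketch. The functions below are local stand-ins that reuse the names from the table. They take only a URL string for simplicity and are not the SDK's implementations.

```python
from typing import Callable, NamedTuple


class Outcome(NamedTuple):
    passed: bool
    reason: str


Check = Callable[[str], Outcome]  # simplified: a check sees only the current URL


def url_contains(substring: str) -> Check:
    def check(url: str) -> Outcome:
        ok = substring in url
        verb = "contains" if ok else "does not contain"
        return Outcome(ok, f"URL {verb} {substring!r}")
    return check


def all_of(*checks: Check) -> Check:
    def check(url: str) -> Outcome:
        failures = [o.reason for o in (c(url) for c in checks) if not o.passed]
        return Outcome(not failures, "; ".join(failures) or "all passed")
    return check


def any_of(*checks: Check) -> Check:
    def check(url: str) -> Outcome:
        outcomes = [c(url) for c in checks]
        ok = any(o.passed for o in outcomes)
        reason = "at least one passed" if ok else "; ".join(o.reason for o in outcomes)
        return Outcome(ok, reason)
    return check


on_checkout = all_of(url_contains("example.com"), url_contains("/checkout"))
finished = any_of(url_contains("/success"), url_contains("/thank-you"))
print(on_checkout("https://example.com/cart"))
print(finished("https://example.com/success"))
```

Since each combinator returns another check, combinators nest freely, and a failed `all_of` can report exactly which inner check failed.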
AssertOutcome
| Property | Type | Description |
|---|---|---|
| passed | boolean | Whether assertion passed |
| reason | string | Human-readable explanation |
| details | object | Additional context data |
AssertContext
| Property | Type | Description |
|---|---|---|
| snapshot | Snapshot \| null | Current page snapshot |
| url | string \| null | Current page URL |
| step_id / stepId | string \| null | Current step identifier |
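The two tables above suggest the shape of a custom predicate: a function from a context to an outcome. The sketch below mirrors those fields with local dataclasses (`Context` and `Outcome` are stand-ins, and the snapshot is simplified to a list of element dictionaries). It also handles the case where no snapshot has been taken yet, since `snapshot` may be null.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Outcome:
    passed: bool
    reason: str
    details: dict[str, Any] = field(default_factory=dict)


@dataclass
class Context:
    snapshot: Optional[list[dict[str, Any]]]  # simplified snapshot: element dicts
    url: Optional[str]
    step_id: Optional[str] = None


def has_cart_items(ctx: Context) -> Outcome:
    """Hypothetical custom predicate: at least one list item is present."""
    if ctx.snapshot is None:
        return Outcome(False, "no snapshot taken yet")
    count = sum(1 for el in ctx.snapshot if el.get("role") == "listitem")
    return Outcome(count > 0, f"found {count} list item(s)", {"count": count, "url": ctx.url})


ctx = Context(
    snapshot=[{"role": "listitem", "text": "Blue mug"}, {"role": "button", "text": "Pay"}],
    url="https://example.com/cart",
)
outcome = has_cart_items(ctx)
print(outcome)
```

Returning a reason and details alongside the boolean is what lets a failed assertion explain itself in a trace.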
FAQ: Jest/Playwright already has toBeEnabled() — what's different?
Short answer
Yes, Jest/Playwright can verify state given a selector. Predicate handles the harder parts that come before verification.
What Predicate adds:
- Semantic element identity — Choose the right element via ordinality, grouping, and semantic matching (not brittle CSS/XPath)
- Stability-aware verification — Wait for the page to settle (snapshot_confidence) before asserting
- Explainable failures — Reason codes (element_not_found, dom_unstable) + nearest-match suggestions
- Bounded retries + escalation — .eventually() with deterministic backoff; optional vision fallback only when structure fails
- Works inside Jest — Predicate can run under Jest as the harness; it doesn't replace Jest
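As a sketch of what an explainable failure can look like, the example below pairs a reason code with nearest-match suggestions computed by the standard library's `difflib`. The function `explain_missing`, the result layout, and the similarity cutoff are assumptions. Only the `element_not_found` reason code is taken from the list above.

```python
import difflib
from typing import Any


def explain_missing(
    wanted_text: str,
    visible_texts: list[str],
    max_suggestions: int = 3,
) -> dict[str, Any]:
    """Report whether a label is present and, if not, which labels are closest."""
    if wanted_text in visible_texts:
        return {"passed": True, "reason_code": None, "nearest_matches": []}
    nearest = difflib.get_close_matches(
        wanted_text, visible_texts, n=max_suggestions, cutoff=0.5
    )
    return {
        "passed": False,
        "reason_code": "element_not_found",
        "nearest_matches": nearest,
    }


report = explain_missing("Submit order", ["Submit Order", "Cancel", "Help"])
print(report)
```

A suggestion like this turns "element not found" into something an agent or a developer can act on, for example by spotting a casing or wording mismatch.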
Predicate ≠ Jest replacement
Jest is the test runner. Predicate provides the perception + verification layer that makes assertions meaningful for AI agents operating on dynamic web pages.
Mapping to Jest/Playwright Matchers
Jest/Playwright matchers like expect(locator).toBeEnabled() check state at a point in time, given a locator you already have.
Predicate assertions like runtime.assert_(is_enabled(selector)) are stability-aware and include element selection semantics + retry logic.
Conceptual mapping:
- Predicate is_enabled(...) ↔ Jest toBeEnabled() — but Predicate finds the element semantically first
- Predicate is_checked(...) ↔ Jest toBeChecked()
- Predicate text_contains(...) ↔ Jest toContainText()
- Predicate exists(...) ↔ Jest toBeVisible() / toBeAttached() — but with snapshot context
- Predicate .eventually(...) ↔ Jest waitFor / retry loops — but Predicate retries with confidence gating
- Predicate nearest matches + reason codes ↔ Jest doesn't provide this by default
Key difference
Jest verifies what you already found. Predicate helps you find the right element in the first place — then verifies it with full context.
Note: Predicate can be used under Jest as the test runner. Jest is the harness; Predicate is the perception + verification layer.
Related: AI-Driven QA for Enterprises
If you're interested in using Predicate assertions for enterprise QA workflows — pre-release validation, regression testing, and monitoring critical user flows — see AI-Driven QA with Predicate.