Snapshot API
The snapshot() function captures the current rendered page state and returns a ranked, token-bounded set of interactive elements, plus runtime signals you can use for Jest-style verification: layout/ordinality (including dominant_group_key), modal/overlay detection (modal_detected, modal_grids), and diagnostics (stability/confidence, reason codes, and best-effort CAPTCHA / “requires vision” signals).
Basic Usage
from predicate import snapshot, SnapshotOptions, SnapshotFilter
# Basic snapshot (uses default options)
snap = snapshot(browser)  # browser is an existing PredicateBrowser instance
Important Notes
Credit Consumption:
- When api_key is provided, this calls the server-side /v1/snapshot endpoint, which consumes 1 credit per call (metered billing).
- Use use_api=False for local processing (no credits; ranking is best-effort/local-only).
Gateway timeout (API mode):
- When use_api=true, the SDK sends raw_elements to the Gateway (POST /v1/snapshot) and waits for a refined response.
- By default, the SDK uses a 30s timeout for this Gateway round trip.
- If you see client-side timeouts on large/heavy pages (e.g. ReadTimeout), increase the Gateway timeout:
  - Python: SnapshotOptions.gateway_timeout_s (seconds)
  - TypeScript: SnapshotOptions.gatewayTimeoutMs (milliseconds)
Payload Size Limit:
- The snapshot payload sent to the server is capped at 10MB to ensure reliable API performance.
- If your page has more elements than fit in 10MB, use the limit option to reduce the number of elements, or use use_api=False for local processing (no size limit).
Screenshots:
- If you pass screenshot: true, the screenshot is captured locally by the extension.
- Even in use_api=true mode, the SDK does not receive screenshots from the server; it merges the server-ranked elements with the locally captured screenshot.
Parameters
Python:
- browser (PredicateBrowser): Browser instance
- options (SnapshotOptions, optional): Snapshot configuration options
SnapshotOptions fields:
- screenshot (bool | ScreenshotConfig, optional): Capture screenshot. True for PNG, or {"format": "jpeg", "quality": 80}. Default: False.
- limit (int, optional): Maximum number of elements to return. Default: 50. Range: 1-500 (SDK). In API mode, the server caps this value (default cap: 100).
- filter (SnapshotFilter | dict, optional): Filter options:
  - min_area: Minimum element area in pixels
  - allowed_roles: List of roles to include (e.g., ["button", "link"])
  - min_z_index: Minimum z-index value
- use_api (bool, optional): Force server API (True) or local extension (False). Auto-detects if None.
- gateway_timeout_s (float, optional): Gateway snapshot timeout in seconds (only relevant when use_api=True). Default: 30.
- show_overlay (bool, optional): Display visual overlay in browser highlighting detected elements. Default: False.
- goal (str, optional): Optional goal/task description for ML reranking.
TypeScript:
- browser (PredicateBrowser): Browser instance
- options (object, optional):
  - screenshot (boolean | object): Capture screenshot
  - limit (number): Maximum elements to return
  - filter (object): Filter options
  - use_api (boolean): Force server API or local extension
  - gatewayTimeoutMs (number): Gateway snapshot timeout in milliseconds (only relevant when use_api=true). Default: 30000.
  - show_overlay (boolean): Display visual overlay (default: false)
  - goal (string, optional): Optional goal/task description for ML reranking
Example: increase Gateway timeout
from predicate import snapshot, SnapshotOptions
# Large pages can take longer to refine server-side.
snap = snapshot(
    browser,
    SnapshotOptions(use_api=True, gateway_timeout_s=60),
)
Returns
Snapshot object with:
- elements: List of Element objects (sorted by importance)
- url: Current page URL
- viewport: Viewport dimensions
- timestamp: Snapshot timestamp
- screenshot: Base64-encoded image (if requested)
- dominant_group_key: Geometric group key for the main content area (may be null)
- diagnostics: Stability/debug diagnostics (may be null)
- modal_detected: True if a modal/overlay grid was detected (may be null)
- modal_grids: Detected modal grids (may be null)
- ml_rerank: ML reranking metadata (may be null)
Diagnostics (snapshot.diagnostics)
diagnostics is best-effort runtime evidence about page stability and “how trustworthy the snapshot is right now”.
Use it to:
- decide whether to retry (.eventually() / bounded retries)
- explain failures (add the reason codes to artifacts/logs)
- detect non-DOM blockers (CAPTCHA signals)
- decide whether to escalate to a different executor when structure is insufficient
Diagnostics Fields
| Field | Type | How to use it |
|---|---|---|
| confidence | number \| null | A 0..1 stability score. Low confidence typically means the page is still moving (navigation, hydration, modals, DOM churn). Use it as a signal to retry snapshots before acting. |
| reasons | string[] | Machine-readable reason codes explaining low confidence. Log these and include them in artifacts; this is often the fastest way to debug flaky runs. |
| metrics | object \| null | Best-effort browser-side metrics used to compute confidence. Useful for diagnosing “why was this unstable?” and for telemetry dashboards. |
| captcha | object \| null | Detection-only CAPTCHA signal (no solving). Use it to branch to your CAPTCHA handling strategy or fail fast with a clear reason. |
| requires_vision | boolean \| null | Best-effort recommendation that structure may be insufficient for this page state (e.g., heavy canvas / non-semantic UI). Use it as an escalation signal. |
| requires_vision_reason | string \| null | Human-readable explanation for why structure is likely insufficient. Include it in traces/artifacts to make failures explainable. |
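The retry guidance above can be sketched as a bounded retry loop. This is a minimal sketch, not part of the SDK: snapshot_when_stable, the 0.7 threshold, and the "dom_churn" reason code are hypothetical names chosen for illustration, and the stand-in objects only mimic the diagnostics fields documented in the table.

```python
import time
from types import SimpleNamespace


def snapshot_when_stable(take_snapshot, min_confidence=0.7, max_attempts=3, delay_s=0.5):
    """Call take_snapshot() until diagnostics.confidence is acceptable.

    Returns (last_snapshot, reasons_seen), where reasons_seen holds the reason
    codes of every low-confidence attempt so they can be logged.
    """
    reasons_seen = []
    snap = None
    for attempt in range(max_attempts):
        snap = take_snapshot()
        diag = getattr(snap, "diagnostics", None)
        confidence = getattr(diag, "confidence", None) if diag is not None else None
        # Diagnostics are best-effort: if they are missing, do not block on them.
        if confidence is None or confidence >= min_confidence:
            return snap, reasons_seen
        reasons_seen.append(list(getattr(diag, "reasons", None) or []))
        if attempt < max_attempts - 1:
            time.sleep(delay_s)
    return snap, reasons_seen


# Stand-in snapshots for illustration; "dom_churn" is a made-up reason code.
fake_snaps = iter([
    SimpleNamespace(diagnostics=SimpleNamespace(confidence=0.2, reasons=["dom_churn"])),
    SimpleNamespace(diagnostics=SimpleNamespace(confidence=0.9, reasons=[])),
])
stable_snap, low_confidence_reasons = snapshot_when_stable(lambda: next(fake_snaps), delay_s=0)
print(stable_snap.diagnostics.confidence, low_confidence_reasons)
```

With the real SDK, the callable would presumably be something like lambda: snapshot(browser); the threshold and attempt count are tuning choices, not documented defaults.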
Diagnostics Metrics (diagnostics.metrics)
| Metric | Meaning |
|---|---|
| ready_state | Document readyState (e.g., "loading", "interactive", "complete"). |
| quiet_ms | How long the page has been “quiet” (no major DOM churn), in milliseconds (best-effort). |
| node_count | Approximate DOM node count (best-effort). Useful for “page exploded” diagnostics. |
| interactive_count | How many interactive candidates were detected (best-effort). |
| raw_elements_count | How many raw elements were captured before filtering (best-effort). |
CAPTCHA Diagnostics (diagnostics.captcha)
CAPTCHA diagnostics are detection-only signals:
| Field | Meaning |
|---|---|
| detected | True if a CAPTCHA-like pattern was detected. |
| provider_hint | Best-effort provider hint (may be null). |
| confidence | 0..1 confidence of detection. |
| evidence | Best-effort evidence hits (text/selector/iframe/url) to make detections explainable. |
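One way to act on these signals is to classify a snapshot before doing anything else. The sketch below is an assumption-laden illustration: classify_blocker and the 0.5 detection threshold are hypothetical, and the stand-in objects only carry the fields listed in the diagnostics tables.

```python
from types import SimpleNamespace


def classify_blocker(snap, captcha_min_confidence=0.5):
    """Return "captcha", "requires_vision", or None for a snapshot-like object."""
    diag = getattr(snap, "diagnostics", None)
    if diag is None:
        return None
    captcha = getattr(diag, "captcha", None)
    if captcha is not None and getattr(captcha, "detected", False):
        confidence = getattr(captcha, "confidence", None)
        # Treat a missing confidence as a positive detection (best-effort data).
        if confidence is None or confidence >= captcha_min_confidence:
            return "captcha"
    if getattr(diag, "requires_vision", None):
        return "requires_vision"
    return None


# Stand-in snapshots for illustration only.
captcha_snap = SimpleNamespace(diagnostics=SimpleNamespace(
    captcha=SimpleNamespace(detected=True, provider_hint=None, confidence=0.8, evidence=[]),
    requires_vision=False,
))
canvas_snap = SimpleNamespace(diagnostics=SimpleNamespace(captcha=None, requires_vision=True))
plain_snap = SimpleNamespace(diagnostics=None)

for candidate in (captcha_snap, canvas_snap, plain_snap):
    print(classify_blocker(candidate))
```

Checking the CAPTCHA signal first is a design choice: a CAPTCHA blocks every executor, so escalating to a different executor would not help.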
Element Properties
Each element in snapshot.elements has the following properties:
Core Properties
| Property | Type | Description |
|---|---|---|
| id | int | Unique identifier for clicking/interacting |
| role | str | Semantic role (button, link, textbox, heading, etc.) |
| text | str \| None | Visible text content |
| importance | int | AI importance score (0-1000, higher = more important) |
| bbox | BBox | Bounding box with x, y, width, height |
| visual_cues | VisualCues | Visual analysis (is_primary, is_clickable, background_color_name) |
| in_viewport | bool | Whether element is visible in current viewport |
| is_occluded | bool | Whether element is covered by another element |
| z_index | int | CSS z-index value (default: 0) |
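As an illustration of how the core properties combine, the sketch below keeps only elements that an agent could plausibly click right now. The helper name actionable is hypothetical, and the stand-in objects carry just the fields from the table above rather than real Element instances.

```python
from types import SimpleNamespace


def actionable(elements, roles=("button", "link")):
    """Return in-viewport, unoccluded elements with a matching role, best first."""
    hits = [
        e for e in elements
        if e.role in roles and e.in_viewport and not e.is_occluded
    ]
    return sorted(hits, key=lambda e: e.importance, reverse=True)


# Stand-in elements for illustration; real ones come from snapshot.elements.
demo_elements = [
    SimpleNamespace(id=1, role="link", importance=300, in_viewport=True, is_occluded=False),
    SimpleNamespace(id=2, role="button", importance=900, in_viewport=True, is_occluded=False),
    SimpleNamespace(id=3, role="button", importance=950, in_viewport=True, is_occluded=True),
    SimpleNamespace(id=4, role="heading", importance=700, in_viewport=True, is_occluded=False),
]
print([e.id for e in actionable(demo_elements)])
```

Element 3 has the highest importance but is dropped because it is occluded, which is the kind of case where filtering before picking the top element matters.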
ML Reranking Properties (Optional)
These fields are present when goal is provided in SnapshotOptions:
| Property | Type | Description |
|---|---|---|
| fused_rank_index | int \| None | 0-based rank after sorting by importance_fused |
| heuristic_index | int \| None | 0-based rank before ML reranking (original heuristic position) |
| ml_probability | float \| None | Confidence score from ONNX model (0.0 - 1.0) |
| ml_score | float \| None | Raw logit score from ONNX model (for debugging) |
Ordinal / Layout Properties (Optional)
These fields support position-based selection ("first result", "top item"):
| Property | Type | Description |
|---|---|---|
| center_x | float \| None | X coordinate of element center (viewport coords) |
| center_y | float \| None | Y coordinate of element center (viewport coords) |
| doc_y | float \| None | Y coordinate in document (center_y + scroll_y) |
| group_key | str \| None | Geometric bucket key for ordinal grouping |
| group_index | int \| None | Position within group (0-indexed, sorted by doc_y) |
| in_dominant_group | bool \| None | Whether element is in the dominant group (main content area) |
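A position-based selection such as "first result" could be sketched as follows. This assumes the ordinal fields are populated; nth_in_dominant_group is a hypothetical helper, and the stand-in elements only mimic the fields in the table above.

```python
from types import SimpleNamespace


def nth_in_dominant_group(elements, n=0):
    """Return the n-th element (0-based) of the dominant group, or None."""
    candidates = [
        e for e in elements
        if getattr(e, "in_dominant_group", None)
        and getattr(e, "group_index", None) is not None
    ]
    # Sort by group_index rather than matching it exactly, so gaps left by
    # filtering or the element limit do not break "first" / "second" lookups.
    candidates.sort(key=lambda e: e.group_index)
    return candidates[n] if 0 <= n < len(candidates) else None


# Stand-in elements for illustration only.
results = [
    SimpleNamespace(id=7, in_dominant_group=True, group_index=2),
    SimpleNamespace(id=5, in_dominant_group=True, group_index=0),
    SimpleNamespace(id=9, in_dominant_group=False, group_index=0),
    SimpleNamespace(id=6, in_dominant_group=True, group_index=1),
]
first_result = nth_in_dominant_group(results, 0)
print(first_result.id)
```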
State-Aware Assertion Properties (Optional)
These fields enable Jest-style assertions for form controls:
| Property | Type | Description |
|---|---|---|
| name | str \| None | Accessible name/label for controls (distinct from visible text) |
| value | str \| None | Current value for inputs/textarea/select (may be redacted for PII) |
| input_type | str \| None | Input type (e.g., "text", "email", "password") |
| value_redacted | bool \| None | Whether value was redacted for privacy (password/email/tel) |
| checked | bool \| None | Normalized checked state for checkboxes/radios |
| disabled | bool \| None | Normalized disabled state |
| expanded | bool \| None | Normalized expanded state for dropdowns/accordions |
| aria_checked | str \| None | Raw ARIA checked string (tri-state: "true"/"false"/"mixed") |
| aria_disabled | str \| None | Raw ARIA disabled string |
| aria_expanded | str \| None | Raw ARIA expanded string |
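A state-aware check over these fields might look like the sketch below. It is not the SDK's assertion API: check_control is a hypothetical helper that returns mismatch messages, and it deliberately refuses to compare a value that was redacted.

```python
from types import SimpleNamespace


def check_control(elements, name, **expected):
    """Compare a named control against expected field values.

    Returns a list of problem messages; an empty list means every expectation held.
    """
    control = next(
        (e for e in elements if (getattr(e, "name", None) or "").lower() == name.lower()),
        None,
    )
    if control is None:
        return [f"no control named {name!r}"]
    problems = []
    for field, want in expected.items():
        # A redacted value cannot be meaningfully compared, so report that instead.
        if field == "value" and getattr(control, "value_redacted", None):
            problems.append("value is redacted; cannot compare")
            continue
        got = getattr(control, field, None)
        if got != want:
            problems.append(f"{field}: expected {want!r}, got {got!r}")
    return problems


# Stand-in form controls for illustration only.
form = [
    SimpleNamespace(name="Remember me", checked=True, disabled=False, value=None, value_redacted=False),
    SimpleNamespace(name="Password", checked=None, disabled=False, value=None, value_redacted=True),
]
print(check_control(form, "remember me", checked=True, disabled=False))
print(check_control(form, "Password", value="hunter2"))
```

Returning messages instead of raising keeps the helper usable both inside a test assertion and in a retry loop that logs why a state check failed.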
Additional Properties (Optional)
| Property | Type | Description |
|---|---|---|
| href | str \| None | Hyperlink URL (for link elements) |
| nearby_text | str \| None | Nearby static text (best-effort, usually for top-ranked elements) |
| diff_status | str \| None | Diff status: "ADDED", "REMOVED", "MODIFIED", "MOVED" (for diff overlay) |
Visual Overlay Feature
When show_overlay=True, Predicate displays a visual overlay in the browser highlighting all detected elements:
Color Coding:
- Red: Target element (when specified in agent actions)
- Blue: Primary elements (is_primary=true)
- Green: Regular interactive elements
Visual Indicators:
- Border thickness and opacity scale with importance score
- Semi-transparent fill for better visibility
- Importance badges showing scores
- Star icon for primary elements
- Target emoji for the target element
- Auto-clear: Overlay automatically disappears after 5 seconds
Use Cases:
- Debugging: Visualize what elements Predicate detects on the page
- Learning: Understand how importance scoring works
- Validation: Verify that critical buttons/links are being detected
- Analysis: See which elements rank highest for your use case
# Example: Debug why a button isn't being clicked
import time
from predicate import snapshot, SnapshotOptions
browser.goto("https://example.com")
snap = snapshot(browser, SnapshotOptions(show_overlay=True))  # See what's detected
time.sleep(6)  # Wait until the overlay has auto-cleared (after 5 seconds)
ML Reranking (Goal-Based Optimization)
When you provide a goal parameter in SnapshotOptions, the server uses an ONNX-based machine learning model to rerank elements by relevance to your goal. The reranking is intended to improve element selection accuracy for agent tasks.
ML Rerank Metadata (snapshot.ml_rerank)
When ML reranking is enabled, snapshot.ml_rerank provides best-effort metadata about what happened in the server-side rerank pass.
| Field | Type | Meaning |
|---|---|---|
| enabled | boolean | Whether ML reranking was enabled for this snapshot. |
| applied | boolean | Whether reranking actually ran (may be false if conditions were not met). |
| reason | string \| null | Why reranking was applied or skipped (best-effort). |
| candidate_count | number | How many elements were considered for reranking. |
| top_probability | number \| null | Confidence of the top-ranked element (0..1). |
| min_confidence | number \| null | Confidence threshold used (if any). |
| is_high_confidence | boolean \| null | True if top probability meets the high-confidence threshold. |
| tags | string[] | Internal labels for debugging and analysis. |
| error | string \| null | Error message if reranking failed (best-effort). |
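For logging, the metadata can be condensed into a single line. The sketch below is illustrative only: rerank_summary and its message wording are hypothetical, and it reads nothing beyond the fields in the table above.

```python
from types import SimpleNamespace


def rerank_summary(ml_rerank):
    """Summarize snapshot.ml_rerank in one log-friendly string."""
    if ml_rerank is None:
        return "no rerank metadata"
    if not ml_rerank.enabled:
        return "rerank disabled"
    if getattr(ml_rerank, "error", None):
        return f"rerank failed: {ml_rerank.error}"
    if not ml_rerank.applied:
        return f"rerank skipped: {getattr(ml_rerank, 'reason', None) or 'unknown reason'}"
    level = "high" if getattr(ml_rerank, "is_high_confidence", None) else "low"
    return f"rerank applied to {ml_rerank.candidate_count} candidates ({level} confidence)"


# Stand-in metadata for illustration only.
applied_meta = SimpleNamespace(
    enabled=True, applied=True, reason=None, candidate_count=42,
    top_probability=0.91, min_confidence=0.5, is_high_confidence=True,
    tags=[], error=None,
)
print(rerank_summary(applied_meta))
print(rerank_summary(None))
```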
# Trigger ML reranking by providing a goal
snap = snapshot(browser, SnapshotOptions(
    goal="Click the login button",
    limit=50,
))
# Elements are now sorted by ML relevance, not just heuristic importance
for element in snap.elements[:5]:
    # Inspect the top five elements
    print(element.id, element.role, element.text, element.ml_probability)
When ML fields are present:
- Present: when goal is provided in SnapshotOptions
- Present: when using agent methods like agent.act() (goals are passed automatically)
- Not present: when goal is not specified (elements are ranked by heuristic importance only)
What the fields mean:
- fused_rank_index: Final position after ML + heuristic fusion (0 = most relevant to goal)
- heuristic_index: Original position before ML (shows how much ML changed the ranking)
- ml_probability: Model's confidence that this element is relevant (0.0-1.0)
- ml_score: Raw logit score before softmax (useful for debugging model behavior)
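To see how much reranking moved each element, the two rank fields can be compared directly. This is a small sketch with a hypothetical helper name (rank_shifts) and stand-in elements; elements without ML fields are skipped.

```python
from types import SimpleNamespace


def rank_shifts(elements):
    """Return (element id, shift) pairs; a positive shift means ML promoted the element."""
    shifts = []
    for e in elements:
        before = getattr(e, "heuristic_index", None)
        after = getattr(e, "fused_rank_index", None)
        if before is None or after is None:
            continue  # ML fields absent, e.g. no goal was provided
        shifts.append((e.id, before - after))
    return shifts


# Stand-in elements for illustration only.
ranked = [
    SimpleNamespace(id=10, heuristic_index=4, fused_rank_index=0),
    SimpleNamespace(id=11, heuristic_index=0, fused_rank_index=1),
    SimpleNamespace(id=12, heuristic_index=None, fused_rank_index=None),
]
print(rank_shifts(ranked))
```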