POST /v1/observe Legacy

Main endpoint for web content extraction and element coordinate mapping. Routes requests to Reader or Geometry services based on mode.

Overview

Prefer the SDK: For AI agents, use the SDK Quick Start instead. The SDK provides better abstraction, action execution, and session management.

Endpoint URL: https://api.sentienceapi.com/v1/observe

Authentication: Required - Bearer token in Authorization header

Request Parameters

Required Parameters

url (string) - The URL of the website to process. Must be a valid HTTP or HTTPS URL.

Optional Parameters

mode (string) - Specifies which service to use. Defaults to "read".

  • "read" - Extract clean Markdown content (1 credit)
  • "map" - Get element coordinates for AI agents (2 credits default, 10 with precision)
  • "visual" - Get screenshot with coordinates (10 credits - always uses Precision Engine)

format (string) - Output format for read mode. Defaults to "markdown".

  • "markdown" (default) - Clean Markdown with formatting preserved
  • "text" - Plain text extraction without Markdown syntax

options (object) - Additional options for fine-tuning request behavior:

  • render_quality (string) - Control rendering engine for map mode:

    • "performance" (default) - Fast rendering, 2 credits
    • "precision" - Pixel-perfect accuracy, 10 credits
  • contentLimit (number) - Maximum characters to return. Default: 50,000

    • 💡 Limit to 15,000 for shorter articles to reduce token costs
  • limit (number) - Maximum number of elements to return (for map and visual modes)

    • 💡 Limit to 100 elements on complex sites to reduce tokens by 95%
  • filter (object) - Filter elements before smart selection:

    • min_area (number) - Minimum element area (width × height). Hides small icons
    • allowed_tags (string[]) - Whitelist of HTML tags. Example: ["button", "input", "a"]
    • allowed_roles (string[]) - Whitelist of ARIA roles. Example: ["button", "textbox", "link"]
  • include_visual_cues (boolean) - Include visual styling hints (color names, cursor type, prominence). Adds ~17 tokens per element

  • screenshot_delivery (string) - How to deliver screenshots in visual mode:

    • "url" (default) - Pre-signed S3 URL (24h expiration)
    • "base64" - Inline Base64 data
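The parameter rules above can be checked client-side before a request is sent. The following is a minimal sketch, not part of any official client: the helper name build_observe_payload is hypothetical, and it only enforces the constraints stated in this section (HTTP/HTTPS URL, the three modes, the two render_quality values).

```python
from typing import Optional

VALID_MODES = {"read", "map", "visual"}
VALID_RENDER_QUALITIES = {"performance", "precision"}


def build_observe_payload(url: str, mode: str = "read",
                          options: Optional[dict] = None) -> dict:
    """Assemble a request body from the documented parameters (hypothetical helper)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an HTTP or HTTPS URL")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")

    options = dict(options or {})
    quality = options.get("render_quality")
    if quality is not None and quality not in VALID_RENDER_QUALITIES:
        raise ValueError("render_quality must be 'performance' or 'precision'")

    payload = {"url": url, "mode": mode}
    if options:
        # Only attach "options" when the caller supplied something.
        payload["options"] = options
    return payload


payload = build_observe_payload(
    "https://news.ycombinator.com",
    mode="map",
    options={"render_quality": "precision", "limit": 100},
)
```

Validating locally avoids spending a round trip on a request that would presumably be rejected with "Invalid mode" or "Invalid render_quality".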

Request Examples

Read Mode (Markdown Extraction)

curl -X POST https://api.sentienceapi.com/v1/observe \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.ycombinator.com",
    "mode": "read",
    "format": "markdown",
    "options": {
      "contentLimit": 50000
    }
  }'

Map Mode (Performance - Default)

curl -X POST https://api.sentienceapi.com/v1/observe \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.ycombinator.com",
    "mode": "map"
  }'

Map Mode (Precision - Pixel Perfect)

curl -X POST https://api.sentienceapi.com/v1/observe \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.ycombinator.com",
    "mode": "map",
    "options": {
      "render_quality": "precision"
    }
  }'

Smart Filtering (95% Token Savings)

Reduce token costs dramatically on complex sites like Amazon, Slickdeals, or Reddit by intelligently selecting only the most important UI elements:

curl -X POST https://api.sentienceapi.com/v1/observe \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.amazon.com/s?k=laptop",
    "mode": "map",
    "options": {
      "limit": 100,
      "filter": {
        "min_area": 100,
        "allowed_roles": ["button", "textbox", "link"]
      }
    }
  }'

Result: Returns only the 100 most important interactive elements, reducing tokens from ~50,000 to ~2,500 (95% savings).
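The 95% figure follows directly from the two token counts quoted above. As a quick arithmetic check (the function name token_savings is illustrative only):

```python
def token_savings(tokens_before: int, tokens_after: int) -> float:
    """Fraction of tokens saved by filtering, between 0.0 and 1.0."""
    if tokens_before <= 0:
        raise ValueError("tokens_before must be positive")
    return 1 - tokens_after / tokens_before


# Figures taken from the example above: ~50,000 tokens unfiltered, ~2,500 filtered.
savings = token_savings(50_000, 2_500)
print(f"{savings:.0%}")  # prints "95%"
```

Actual savings depend on how many elements the page has, so treat 95% as the result for this particular example rather than a guarantee.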

Visual Mode with Screenshot

curl -X POST https://api.sentienceapi.com/v1/observe \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/login",
    "mode": "visual",
    "options": {
      "screenshot_delivery": "url",
      "limit": 50
    }
  }'
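For callers who prefer Python to curl, the same request can be assembled with the standard library. This sketch only builds the request object and does not send it; make_observe_request is a hypothetical helper name, and the API key is the same placeholder used in the curl examples.

```python
import json
import urllib.request

OBSERVE_URL = "https://api.sentienceapi.com/v1/observe"


def make_observe_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl examples."""
    return urllib.request.Request(
        OBSERVE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = make_observe_request(
    "sk_live_...",  # placeholder key, as in the curl examples
    {
        "url": "https://example.com/login",
        "mode": "visual",
        "options": {"screenshot_delivery": "url", "limit": 50},
    },
)
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```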

Response Format

Read Mode Response

{
  "status": "success",
  "url": "https://news.ycombinator.com",
  "content": "# Hacker News\n\nStories:\n- AI breakthrough in reasoning\n- New framework release\n...",
  "title": "Hacker News",
  "description": "Hacker News new | past | comments | ask | show | jobs | submit",
  "credits_used": 1
}

Map Mode Response

{
  "status": "success",
  "url": "https://example.com",
  "viewport": { "width": 1920, "height": 1080 },
  "interactable_elements": [
    {
      "id": 0,
      "role": "button",
      "text": "Sign in",
      "bbox": { "x": 100, "y": 200, "width": 150, "height": 40 },
      "in_viewport": true,
      "is_occluded": false,
      "importance": 85,
      "visual_cues": {
        "background_color_name": "blue",
        "cursor_type": "pointer",
        "is_prominent": true
      }
    }
  ],
  "credits_used": 2
}

Visual Mode Response

{
  "status": "success",
  "url": "https://example.com/login",
  "viewport": { "width": 1920, "height": 1080 },
  "screenshot": {
    "type": "url",
    "url": "https://sentience-screenshots.s3.amazonaws.com/abc123.png",
    "format": "png",
    "size_bytes": 158277,
    "expires_at": "2025-12-20T05:10:45Z"
  },
  "interactable_elements": [
    {
      "id": 0,
      "role": "textbox",
      "text": "",
      "bbox": { "x": 500, "y": 300, "width": 300, "height": 40 },
      "attributes": { "placeholder": "Email" },
      "in_viewport": true,
      "is_occluded": false
    }
  ],
  "credits_used": 10
}

Response Fields

Common Fields (All Modes)

  • status (string) - "success" or "error"
  • url (string) - The URL that was processed
  • credits_used (number) - Credits consumed by this request

Read Mode Fields

  • content (string) - Extracted Markdown or plain text
  • title (string) - Page title
  • description (string) - Page meta description

Map/Visual Mode Fields

  • viewport (object) - Viewport dimensions used (width, height)
  • interactable_elements (array) - Array of interactive elements with:
    • id (number) - Element identifier
    • role (string) - ARIA role (button, textbox, link, etc.)
    • text (string) - Visible text content
    • bbox (object) - Bounding box (x, y, width, height)
    • in_viewport (boolean) - Whether the element lies within the viewport
    • is_occluded (boolean) - Whether the element is covered by another element
    • importance (number) - Importance score (0-100)
    • visual_cues (object, optional) - Visual styling hints
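An agent typically turns these fields into click coordinates. The sketch below skips elements that are off-screen or occluded, orders the rest by importance, and returns the center of each bounding box. It assumes bbox.x and bbox.y are the top-left corner in pixels, which this page does not state explicitly; click_targets is a hypothetical helper.

```python
def click_targets(response: dict) -> list:
    """Return (id, center_x, center_y) for visible elements, most important first."""
    elements = sorted(
        response.get("interactable_elements", []),
        key=lambda el: el.get("importance", 0),  # importance may be absent
        reverse=True,
    )
    targets = []
    for el in elements:
        if not el.get("in_viewport") or el.get("is_occluded"):
            continue  # not clickable right now
        box = el["bbox"]
        # Assumption: (x, y) is the top-left corner of the box.
        targets.append((el["id"],
                        box["x"] + box["width"] / 2,
                        box["y"] + box["height"] / 2))
    return targets


sample = {
    "interactable_elements": [
        {"id": 0, "role": "button", "text": "Sign in",
         "bbox": {"x": 100, "y": 200, "width": 150, "height": 40},
         "in_viewport": True, "is_occluded": False, "importance": 85},
        {"id": 1, "role": "link", "text": "Help",
         "bbox": {"x": 0, "y": 2000, "width": 50, "height": 20},
         "in_viewport": False, "is_occluded": False, "importance": 40},
    ]
}
print(click_targets(sample))  # [(0, 175.0, 220.0)]
```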

Visual Mode Exclusive Fields

  • screenshot (object) - Screenshot information:
    • type (string) - "url" or "base64"
    • url (string) - Pre-signed S3 URL (if type="url")
    • data (string) - Base64 data (if type="base64")
    • format (string) - Image format (png, jpeg)
    • size_bytes (number) - File size
    • expires_at (string) - URL expiration timestamp
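Because the screenshot object has two shapes, client code usually branches on type. The following sketch decodes inline data and checks whether a pre-signed URL has expired. It assumes data is plain Base64 without a data-URI prefix and that expires_at is an ISO 8601 UTC timestamp like the one in the example response; both helper names are hypothetical.

```python
import base64
from datetime import datetime, timezone
from typing import Optional


def screenshot_bytes(screenshot: dict) -> Optional[bytes]:
    """Return decoded image bytes for inline delivery, or None for URL delivery."""
    if screenshot.get("type") == "base64":
        # Assumption: "data" is raw Base64 with no "data:image/..." prefix.
        return base64.b64decode(screenshot["data"])
    return None


def is_expired(screenshot: dict, now: Optional[datetime] = None) -> bool:
    """True if the pre-signed URL's expires_at timestamp has passed."""
    expires = datetime.fromisoformat(screenshot["expires_at"].replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return now >= expires


shot = {"type": "url",
        "url": "https://sentience-screenshots.s3.amazonaws.com/abc123.png",
        "expires_at": "2025-12-20T05:10:45Z"}
```

Download URL-delivered screenshots before expires_at if they need to be kept longer than the 24-hour window.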

Common ARIA Roles

Elements are ranked by importance. Higher priority roles receive higher importance scores:

High Priority (+1000 pts) - Form inputs:

  • textbox, input, searchbox, combobox, spinbutton, slider

Medium Priority (+500 pts) - Interactive controls:

  • button, submit, link, checkbox, radio, switch, tab, menuitem

Low Priority (+100 pts) - Content navigation:

  • heading, list, listitem, article, section, navigation, menu

Informational (0 pts) - Static content:

  • img, figure, status, alert, dialog, banner, contentinfo
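The tiers above can be expressed as a lookup table, for example to pre-sort roles client-side. This is only a sketch of the listed point values: how the service combines them with other signals into the final importance score is not described here, and treating unlisted roles as 0 points is an assumption.

```python
ROLE_TIERS = {
    1000: ["textbox", "input", "searchbox", "combobox", "spinbutton", "slider"],
    500: ["button", "submit", "link", "checkbox", "radio", "switch", "tab", "menuitem"],
    100: ["heading", "list", "listitem", "article", "section", "navigation", "menu"],
    0: ["img", "figure", "status", "alert", "dialog", "banner", "contentinfo"],
}

# Flatten the tiers into a role -> points mapping.
ROLE_POINTS = {role: points for points, roles in ROLE_TIERS.items() for role in roles}


def role_points(role: str) -> int:
    """Points for a role; unlisted roles default to 0 (an assumption)."""
    return ROLE_POINTS.get(role, 0)


def sort_by_role_priority(roles: list) -> list:
    """Order roles from highest to lowest tier, keeping ties in input order."""
    return sorted(roles, key=role_points, reverse=True)


print(sort_by_role_priority(["img", "link", "textbox", "heading"]))
# ['textbox', 'link', 'heading', 'img']
```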

Best Practices

1. Use Smart Filtering on Complex Sites

For sites with hundreds of elements (Amazon, Reddit, news sites), use limit and filter:

{
  "options": {
    "limit": 100,
    "filter": {
      "min_area": 100,
      "allowed_roles": ["button", "textbox", "link"]
    }
  }
}

Result: roughly 95% fewer tokens on element-heavy pages, while keeping the highest-ranked interactive elements.

2. Choose the Right Render Quality

  • Performance mode (default) - Use for standard websites, SPAs built with React/Vue
  • Precision mode - Use for complex SPAs, pixel-perfect layouts, or when performance mode misses elements

3. Limit Content Length

For read mode, set contentLimit based on your needs. For example, 15,000 characters is roughly 3,750 tokens:

{
  "options": {
    "contentLimit": 15000
  }
}
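The ~3,750-token figure implies about 4 characters per token. A rough estimator under that assumption (real tokenizers vary by model and language, so this is only a budgeting aid):

```python
CHARS_PER_TOKEN = 4  # assumption implied by 15,000 characters ~ 3,750 tokens


def estimated_tokens(content_limit: int) -> int:
    """Approximate upper bound on tokens returned for a given contentLimit."""
    return content_limit // CHARS_PER_TOKEN


for limit in (15_000, 50_000):
    print(limit, "chars ->", estimated_tokens(limit), "tokens (approx.)")
# 15000 chars -> 3750 tokens (approx.)
# 50000 chars -> 12500 tokens (approx.)
```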

4. Use Visual Cues Selectively

Enable include_visual_cues only when you need color/styling hints for icon-heavy UIs. It adds ~17 tokens per element.
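To judge whether the extra tokens are worth it, multiply the per-element figure by the number of elements requested. A small sketch using the approximate 17-token figure from above:

```python
TOKENS_PER_ELEMENT = 17  # approximate per-element overhead quoted above


def visual_cue_overhead(element_count: int) -> int:
    """Approximate extra tokens added by include_visual_cues."""
    return TOKENS_PER_ELEMENT * element_count


print(visual_cue_overhead(100))  # 1700
```

With limit set to 100, visual cues add roughly 1,700 tokens, which is a sizable share of the ~2,500-token filtered response from the Smart Filtering example.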

Error Responses

{
  "status": "error",
  "error": "Invalid URL",
  "message": "The provided URL is not valid or cannot be accessed"
}

Common errors:

  • Invalid URL - Malformed or inaccessible URL
  • Invalid mode - Mode must be "read", "map", or "visual"
  • Invalid render_quality - Must be "performance" or "precision"
  • Rate limit exceeded - Too many requests, slow down or upgrade
  • Insufficient credits - Add more credits to your account
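Client code can turn the error body into an exception so failures are not silently treated as results. This is a sketch under assumptions: ObserveError and raise_for_error are hypothetical names, and matching on the exact string "Rate limit exceeded" to decide whether to retry is a guess about the error field's contents.

```python
class ObserveError(Exception):
    """Raised when a response body has status "error"."""

    # Assumption: the "error" field carries exactly this string for rate limits.
    RETRYABLE = {"Rate limit exceeded"}

    def __init__(self, error: str, message: str):
        super().__init__(f"{error}: {message}")
        self.error = error
        self.message = message
        self.retryable = error in self.RETRYABLE


def raise_for_error(body: dict) -> dict:
    """Return the body unchanged on success, raise ObserveError otherwise."""
    if body.get("status") == "error":
        raise ObserveError(body.get("error", "Unknown error"), body.get("message", ""))
    return body


try:
    raise_for_error({
        "status": "error",
        "error": "Invalid URL",
        "message": "The provided URL is not valid or cannot be accessed",
    })
except ObserveError as exc:
    caught = exc  # exc.retryable is False here, so do not retry
```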

Credit Costs

Mode     Engine               Credits
Read     Performance          1
Map      Performance          2
Map      Precision            10
Visual   Precision (always)   10
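The table can be encoded as a small function for estimating spend before sending a batch of requests. This is a sketch that mirrors the listed costs; credits_for is a hypothetical helper, and actual billing is whatever credits_used reports in each response.

```python
def credits_for(mode: str, render_quality: str = "performance") -> int:
    """Expected credit cost for one request, per the table above."""
    if mode == "read":
        return 1
    if mode == "map":
        return 10 if render_quality == "precision" else 2
    if mode == "visual":
        return 10  # visual mode always uses the Precision Engine
    raise ValueError('mode must be "read", "map", or "visual"')


batch = [("read", "performance"), ("map", "performance"),
         ("map", "precision"), ("visual", "performance")]
total = sum(credits_for(mode, quality) for mode, quality in batch)
print(total)  # 23
```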

Migration to SDK

This endpoint is considered legacy. For new projects, use the Predicate SDK:

Benefits:

  • ✅ Automatic snapshot collection
  • ✅ Built-in action execution (click, type, wait)
  • ✅ Session management and error handling
  • ✅ Tracing and debugging with Predicate Studio

Quick comparison:

# Legacy API approach
import requests

response = requests.post(
    "https://api.sentienceapi.com/v1/observe",
    headers={"Authorization": "Bearer sk_..."},
    json={"url": "https://example.com", "mode": "map"},  # "map" chosen for illustration; "read" and "visual" also work
)
data = response.json()

Next Steps