Async API (Python SDK)
NEW in v0.90.17: Complete async API implementation with async versions of all SDK functions including core utilities, supporting utilities, agent layer, and developer tools. All async functions are now organized in their respective modules, with async_api serving as a convenient re-export point.
Why Use Async API?
The async API enables you to build high-performance automation with Python's asyncio framework:
- Concurrent operations: Run multiple browser tasks in parallel
- Better performance: Non-blocking I/O for faster automation
- Framework integration: Works seamlessly with FastAPI, aiohttp, asyncio
- Modern Python: Leverage async/await syntax
- No breaking changes: Coexists with sync API - use what fits your needs
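To make the concurrency benefit concrete, here is a minimal sketch that uses only the standard library. The fake_browser_task coroutine is a hypothetical stand-in for a browser task (it only sleeps) and is not part of the SDK; the point is the difference between awaiting tasks one after another and awaiting them together with asyncio.gather.

```python
import asyncio
import time


async def fake_browser_task(name: str, delay: float) -> str:
    """Hypothetical stand-in for a browser task; sleeps instead of doing real I/O."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_sequentially(delay: float) -> list[str]:
    # Each task starts only after the previous one has finished.
    return [await fake_browser_task(f"task-{i}", delay) for i in range(3)]


async def run_concurrently(delay: float) -> list[str]:
    # All three tasks wait at the same time; gather keeps the input order.
    return await asyncio.gather(
        *(fake_browser_task(f"task-{i}", delay) for i in range(3))
    )


async def timed(coro) -> tuple[list[str], float]:
    """Await `coro` and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = await coro
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, sequential = asyncio.run(timed(run_sequentially(0.1)))
    _, concurrent = asyncio.run(timed(run_concurrently(0.1)))
    print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

With real browser work the speed-up depends on how much of each task is spent waiting on I/O, so treat the timing here as illustrative only.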
AsyncPredicateBrowser
The AsyncPredicateBrowser class provides async context manager support and all the features of the sync PredicateBrowser.
Basic Usage
from predicate.async_api import AsyncPredicateBrowser

# Async context manager (recommended)
async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Browser automatically closes when done

# Manual lifecycle
async def manual():
    browser = AsyncPredicateBrowser()
    await browser.start()
    await browser.goto("https://example.com")
    await browser.close()
With API Key
from predicate.async_api import AsyncPredicateBrowser

async def main():
    async with AsyncPredicateBrowser(api_key="sk_...") as browser:
        await browser.goto("https://example.com")
        # Use Pro/Enterprise features
Custom Viewport
from predicate.async_api import AsyncPredicateBrowser

async def main():
    # Custom viewport size
    async with AsyncPredicateBrowser(
        viewport={"width": 1920, "height": 1080}
    ) as browser:
        await browser.goto("https://example.com")
From Existing Playwright Context
from predicate.async_api import AsyncPredicateBrowser
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        # Create Playwright context
        context = await p.chromium.launch_persistent_context(
            "./user_data",
            headless=False
        )
        # Convert to AsyncPredicateBrowser
        browser = AsyncPredicateBrowser.from_existing(context)
        # Use all Predicate features
        await browser.page.goto("https://example.com")
From Existing Page
from predicate.async_api import AsyncPredicateBrowser
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        context = await browser.new_context()
        page = await context.new_page()
        # Navigate first
        await page.goto("https://example.com")
        # Convert to AsyncPredicateBrowser
        predicate_browser = AsyncPredicateBrowser.from_page(page)
Async Functions
Every core SDK function has an async version, named with an _async suffix for clarity.
snapshot_async()
Capture page snapshot asynchronously:
from predicate.async_api import AsyncPredicateBrowser, snapshot_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Capture snapshot
        snap = await snapshot_async(browser)
        print(f"Found {len(snap.elements)} elements")
        print(f"Page title: {snap.title}")
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- screenshot (bool, optional): Include screenshot. Defaults to True
- limit (int, optional): Max elements to return
- goal (str, optional): ML reranking goal
click_async()
Click an element asynchronously:
from predicate.async_api import AsyncPredicateBrowser, snapshot_async, click_async, find

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Find and click
        snap = await snapshot_async(browser)
        button = find(snap, "role=button text~'Submit'")
        if button:
            await click_async(browser, button.id)
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- element_id (str): Element ID from snapshot
type_text_async()
Type text into an input field asynchronously:
from predicate.async_api import AsyncPredicateBrowser, snapshot_async, type_text_async, find

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        snap = await snapshot_async(browser)
        email_input = find(snap, "role=textbox text~'email'")
        if email_input:
            await type_text_async(browser, email_input.id, "user@example.com")
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- element_id (str): Element ID from snapshot
- text (str): Text to type
press_async()
Press keyboard keys asynchronously:
from predicate.async_api import AsyncPredicateBrowser, press_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Press Enter
        await press_async(browser, "Enter")
        # Press Escape
        await press_async(browser, "Escape")
        # Keyboard shortcut
        await press_async(browser, "Control+A")
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- key (str): Key name (e.g., "Enter", "Escape", "Control+A")
click_rect_async()
Click at specific coordinates asynchronously:
from predicate.async_api import AsyncPredicateBrowser, click_rect_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Click at coordinates
        await click_rect_async(
            browser,
            x=100,
            y=200,
            width=50,
            height=30
        )
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- x (int): X coordinate
- y (int): Y coordinate
- width (int): Click area width
- height (int): Click area height
Phase 2A: Core Utilities
NEW in v0.90.17: Async versions of core utility functions for semantic waiting, screenshots, and text search.
wait_for_async()
Wait for an element to appear using semantic queries:
from predicate.async_api import AsyncPredicateBrowser, wait_for_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Wait for element with timeout
        result = await wait_for_async(browser, "role=button", timeout=5.0)
        if result:
            print(f"Element found: {result.id}")
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- query (str): Semantic query string
- timeout (float, optional): Maximum wait time in seconds. Defaults to 10.0
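For readers unfamiliar with timeout-based waiting, the general pattern can be sketched with the standard library alone. This is not the SDK's implementation of wait_for_async(), whose internals are not described here; poll_until and its interval parameter are hypothetical names used only for illustration. The sketch returns None on timeout, which matches how the example above treats a falsy result as "not found".

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def poll_until(
    check: Callable[[], Awaitable[Optional[T]]],
    timeout: float = 10.0,
    interval: float = 0.25,
) -> Optional[T]:
    """Call `check` repeatedly until it returns a non-None value or `timeout` elapses.

    Hypothetical helper: a generic polling loop, not part of the SDK.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = await check()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            # Give up and signal "not found" to the caller.
            return None
        # Yield to the event loop so other tasks can run between attempts.
        await asyncio.sleep(interval)
```

Because the loop sleeps with asyncio.sleep rather than time.sleep, other coroutines keep running while one task is waiting.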
screenshot_async()
Capture screenshot asynchronously in PNG or JPEG format:
import base64

from predicate.async_api import AsyncPredicateBrowser, screenshot_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Capture screenshot as JPEG
        data_url = await screenshot_async(browser, format="jpeg", quality=80)
        # Strip the data-URL prefix, decode the base64 payload, and save to file
        image_data = base64.b64decode(data_url.split(',')[1])
        with open("screenshot.jpg", "wb") as f:
            f.write(image_data)
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- format (str, optional): Image format - "png" or "jpeg". Defaults to "png"
- quality (int, optional): JPEG quality (1-100). Only used when format is "jpeg". Defaults to 90
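The example above decodes the result with data_url.split(',')[1], which implies that the function returns a base64 data URL. Assuming the usual data:<mime>;base64,<payload> shape, a slightly more defensive decoder might look like the sketch below; data_url_to_bytes is a hypothetical helper, not an SDK function.

```python
import base64


def data_url_to_bytes(data_url: str) -> tuple[str, bytes]:
    """Split a base64 data URL into (mime_type, raw_bytes).

    Hypothetical helper; assumes the form 'data:<mime>;base64,<payload>'.
    """
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


if __name__ == "__main__":
    sample = "data:image/png;base64," + base64.b64encode(b"hello").decode("ascii")
    mime, raw = data_url_to_bytes(sample)
    print(mime, len(raw))
```

Returning the MIME type alongside the bytes lets the caller pick a matching file extension instead of hard-coding one.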
find_text_rect_async()
Find text on the page and return pixel coordinates:
from predicate.async_api import AsyncPredicateBrowser, find_text_rect_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Find text and get coordinates
        text_result = await find_text_rect_async(browser, "Sign In")
        if text_result:
            print(f"Text found at: x={text_result.x}, y={text_result.y}")
            print(f"Size: {text_result.width}x{text_result.height}")
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- text (str): Text to search for
Returns: Object with x, y, width, height properties, or None if not found
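A common next step is turning the returned rectangle into a single point, for example to log or click the middle of the matched text. The sketch below assumes that x and y are the top-left corner of the rectangle, which this page does not state explicitly; Rect is a hypothetical stand-in for the returned object, used so the example runs without the SDK.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    """Hypothetical stand-in for the object returned by find_text_rect_async()."""
    x: float
    y: float
    width: float
    height: float


def center(rect: Rect) -> tuple[float, float]:
    """Return the midpoint of `rect`, assuming (x, y) is its top-left corner."""
    return rect.x + rect.width / 2, rect.y + rect.height / 2


if __name__ == "__main__":
    print(center(Rect(x=100, y=200, width=50, height=30)))  # (125.0, 215.0)
```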
Phase 2B: Supporting Utilities
NEW in v0.90.17: Async versions of supporting functions for content reading, visual overlays, and assertions.
read_async()
Read page content in various formats:
from predicate.async_api import AsyncPredicateBrowser, read_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Read as markdown
        markdown = await read_async(browser, output_format="markdown")
        print(markdown)
        # Read as plain text
        text = await read_async(browser, output_format="text")
        # Read raw HTML
        html = await read_async(browser, output_format="html")
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- output_format (str, optional): Output format - "html", "text", or "markdown". Defaults to "text"
show_overlay_async() / clear_overlay_async()
Manage visual overlays for debugging:
from predicate.async_api import AsyncPredicateBrowser, snapshot_async, show_overlay_async, clear_overlay_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Take snapshot
        snap = await snapshot_async(browser)
        # Show overlay on specific element
        await show_overlay_async(browser, snap, target_element_id=42)
        # Clear overlay
        await clear_overlay_async(browser)
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- snapshot (Snapshot): Snapshot object from snapshot_async()
- target_element_id (int): Element ID to highlight
expect_async() / ExpectationAsync
Async assertion helpers with fluent API:
from predicate.async_api import AsyncPredicateBrowser, expect_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Assert element is visible
        element = await expect_async(browser, "role=button").to_be_visible()
        # Assert element exists
        await expect_async(browser, "role=link").to_exist()
        # Assert element contains text
        await expect_async(browser, "role=heading").to_have_text("Welcome")
        # Assert query returns N elements
        await expect_async(browser, "role=link").to_have_count(5)
Available Methods:
- .to_be_visible() - Assert element is visible
- .to_exist() - Assert element exists
- .to_have_text(text) - Assert element contains text
- .to_have_count(count) - Assert query returns N elements
Phase 2C: Agent Layer
NEW in v0.90.17: Full async implementation of the agent layer for natural language automation.
PredicateAgentAsync
Async agent with observe-think-act loop:
from predicate.async_api import AsyncPredicateBrowser, PredicateAgentAsync
from predicate.llm_provider import OpenAIProvider

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Initialize LLM provider
        llm = OpenAIProvider(api_key="your_key", model="gpt-4o")
        # Create async agent
        agent = PredicateAgentAsync(browser, llm)
        # Natural language automation
        result = await agent.act("Click the login button")
        result = await agent.act("Type 'user@example.com' into the email field")
        # Get token usage statistics
        stats = agent.get_token_stats()
        print(f"Tokens used: {stats['total_tokens']}")
Features:
- Natural language automation with LLM
- Token usage tracking
- Observe-think-act loop
- Full async/await support
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- llm_provider: LLM provider instance (OpenAIProvider, etc.)
Phase 2D: Developer Tools
NEW in v0.90.17: Async versions of developer tools for recording and inspection.
RecorderAsync / record_async()
Record actions and generate traces:
from predicate.async_api import AsyncPredicateBrowser, RecorderAsync, snapshot_async, find

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Look up an element so there is an ID to record against
        snap = await snapshot_async(browser)
        element = find(snap, "role=textbox")
        if element is None:
            return
        # Record actions
        async with RecorderAsync(browser, capture_snapshots=True) as recorder:
            await recorder.record_click(element.id)
            await recorder.record_type(element.id, "text")
        # Save trace
        recorder.save("trace.json")
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
- capture_snapshots (bool, optional): Whether to capture snapshots. Defaults to True
InspectorAsync / inspect_async()
Inspect elements and debug interactively:
from predicate.async_api import AsyncPredicateBrowser, InspectorAsync

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Interactive inspection
        async with InspectorAsync(browser) as inspector:
            # Hover elements to see info in console
            # Click elements to see full details
            pass
Parameters:
- browser (AsyncPredicateBrowser): Browser instance
Pure Functions (No Async Needed)
These functions are pure (no I/O) and don't need async versions:
from predicate.async_api import find, query

# snap is a Snapshot returned by snapshot_async()
# find() - Returns single element
button = find(snap, "role=button text~'Submit'")
# query() - Returns list of elements
links = query(snap, "role=link")
Complete Example
Here's a full example combining the core async functions (snapshot, find, type, and click):
import asyncio
from predicate.async_api import (
    AsyncPredicateBrowser,
    snapshot_async,
    find,
    click_async,
    type_text_async,
    press_async
)

async def login_example():
    """Complete login automation example"""
    async with AsyncPredicateBrowser() as browser:
        # Navigate to login page
        await browser.goto("https://example.com/login")
        # Take snapshot
        snap = await snapshot_async(browser)
        # Find email input
        email_input = find(snap, "role=textbox text~'email'")
        if email_input:
            await type_text_async(browser, email_input.id, "user@example.com")
        # Find password input
        snap = await snapshot_async(browser)
        password_input = find(snap, "role=textbox text~'password'")
        if password_input:
            await type_text_async(browser, password_input.id, "mypassword")
        # Click submit button
        snap = await snapshot_async(browser)
        submit_btn = find(snap, "role=button text~'log in'")
        if submit_btn:
            await click_async(browser, submit_btn.id)
        # Wait for page load
        await asyncio.sleep(2)
        # Verify login success
        snap = await snapshot_async(browser)
        print(f"Page title after login: {snap.title}")

# Run the async function
if __name__ == "__main__":
    asyncio.run(login_example())
Complete Phase 2A-2D Example
NEW in v0.90.17: Here's a comprehensive example using all the new async features:
import asyncio
from predicate.async_api import (
    AsyncPredicateBrowser,
    snapshot_async,
    wait_for_async,
    screenshot_async,
    find_text_rect_async,
    read_async,
    show_overlay_async,
    expect_async,
    PredicateAgentAsync
)
from predicate.llm_provider import OpenAIProvider

async def comprehensive_example():
    """Example using all Phase 2A-2D features"""
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Phase 2A: Core Utilities
        # Wait for element
        result = await wait_for_async(browser, "role=button", timeout=5.0)
        # Capture screenshot
        data_url = await screenshot_async(browser, format="jpeg", quality=80)
        # Find text on page
        text_result = await find_text_rect_async(browser, "Sign In")
        # Phase 2B: Supporting Utilities
        # Read page content
        markdown = await read_async(browser, output_format="markdown")
        # Show visual overlay
        snap = await snapshot_async(browser)
        await show_overlay_async(browser, snap, target_element_id=42)
        # Assertions
        element = await expect_async(browser, "role=button").to_be_visible()
        await expect_async(browser, "role=link").to_have_count(5)
        # Phase 2C: Agent Layer
        llm = OpenAIProvider(api_key="your_key", model="gpt-4o")
        agent = PredicateAgentAsync(browser, llm)
        # Natural language automation
        result = await agent.act("Click the login button")
        result = await agent.act("Type 'user@example.com' into the email field")
        # Token tracking
        stats = agent.get_token_stats()
        print(f"Tokens used: {stats['total_tokens']}")

if __name__ == "__main__":
    asyncio.run(comprehensive_example())
Concurrent Operations
Run multiple browser tasks in parallel:
import asyncio
from predicate.async_api import AsyncPredicateBrowser, snapshot_async

async def scrape_page(url: str):
    """Scrape a single page"""
    async with AsyncPredicateBrowser() as browser:
        await browser.goto(url)
        snap = await snapshot_async(browser)
        return {
            "url": url,
            "title": snap.title,
            "element_count": len(snap.elements)
        }

async def scrape_multiple_pages():
    """Scrape multiple pages concurrently"""
    urls = [
        "https://example.com",
        "https://example.org",
        "https://example.net"
    ]
    # Run all scrapes concurrently
    tasks = [scrape_page(url) for url in urls]
    results = await asyncio.gather(*tasks)
    for result in results:
        print(f"{result['url']}: {result['title']} ({result['element_count']} elements)")

# Run
if __name__ == "__main__":
    asyncio.run(scrape_multiple_pages())
Integration with Async Frameworks
FastAPI
from fastapi import FastAPI
from predicate.async_api import AsyncPredicateBrowser, snapshot_async

app = FastAPI()

@app.get("/scrape")
async def scrape_endpoint(url: str):
    """API endpoint that scrapes a URL"""
    async with AsyncPredicateBrowser() as browser:
        await browser.goto(url)
        snap = await snapshot_async(browser)
        return {
            "url": url,
            "title": snap.title,
            "element_count": len(snap.elements)
        }
aiohttp
from aiohttp import web
from predicate.async_api import AsyncPredicateBrowser, snapshot_async

async def handle_scrape(request):
    """Handle scrape request"""
    url = request.query.get('url')
    if not url:
        # Reject requests that omit the required query parameter
        return web.json_response({"error": "missing 'url' query parameter"}, status=400)
    async with AsyncPredicateBrowser() as browser:
        await browser.goto(url)
        snap = await snapshot_async(browser)
        return web.json_response({
            "url": url,
            "title": snap.title,
            "element_count": len(snap.elements)
        })

app = web.Application()
app.router.add_get('/scrape', handle_scrape)

if __name__ == "__main__":
    web.run_app(app)
API Organization
v0.90.17 Refactoring:
- All async functions are now organized in their respective modules alongside sync versions:
  - AsyncPredicateBrowser → browser.py
  - snapshot_async() → snapshot.py
  - click_async(), type_text_async(), press_async(), click_rect_async() → actions.py
  - wait_for_async(), screenshot_async(), find_text_rect_async() → wait.py, screenshot.py, find_text_rect.py
  - read_async(), show_overlay_async(), expect_async() → read.py, overlay.py, expect.py
  - PredicateAgentAsync → agent.py
  - RecorderAsync, InspectorAsync → recorder.py, inspector.py
- async_api.py serves as a convenient re-export module: all async APIs are available from a single import point
- Full backward compatibility - existing imports continue to work
- Better code organization - async functions co-located with sync versions
Benefits
Complete Coverage:
- ✅ All sync functions now have async counterparts
- ✅ Core utilities, supporting utilities, agent layer, and developer tools
- ✅ Single import point from predicate.async_api
Performance:
- Run multiple browser instances concurrently
- Non-blocking I/O for faster automation
- Better resource utilization
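Each concurrent browser instance still consumes memory and CPU, so in practice it often helps to cap how many run at once. A minimal, SDK-independent sketch using asyncio.Semaphore follows; bounded_gather and fake_scrape are hypothetical names, and fake_scrape merely sleeps in place of real browser work.

```python
import asyncio


async def bounded_gather(coro_factories, limit: int):
    """Run the coroutines produced by `coro_factories`, at most `limit` at a time.

    Hypothetical helper: takes zero-argument callables so that each coroutine
    is only created once a slot is free. Results keep the input order.
    """
    semaphore = asyncio.Semaphore(limit)

    async def run_one(factory):
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(run_one(f) for f in coro_factories))


async def fake_scrape(url: str) -> str:
    """Stand-in for a browser task; sleeps instead of launching a browser."""
    await asyncio.sleep(0.01)
    return f"scraped {url}"


if __name__ == "__main__":
    urls = ["https://example.com", "https://example.org", "https://example.net"]
    jobs = [lambda u=u: fake_scrape(u) for u in urls]
    print(asyncio.run(bounded_gather(jobs, limit=2)))
```

A sensible limit depends on the machine and the pages involved, so it is best treated as a tunable setting rather than a fixed number.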
Code Quality:
- Modern async/await syntax
- Compatible with asyncio ecosystem
- Type hints and IDE support
- Better code organization with async functions in their respective modules
Compatibility:
- Works with FastAPI, aiohttp, asyncio
- No breaking changes to sync API
- Same API design as sync version
- Full backward compatibility maintained
Testing:
- Comprehensive test coverage (36+ async tests)
- All tests passing
- Production-ready
- 6 async examples in sdk-python/examples/
Migration from Sync API
Migrating from sync to async is straightforward:
# Sync API (before)
from predicate import PredicateBrowser, snapshot, find, click

with PredicateBrowser() as browser:
    browser.goto("https://example.com")
    snap = snapshot(browser)
    button = find(snap, "role=button")
    if button:
        click(browser, button.id)

# Async API (after)
from predicate.async_api import AsyncPredicateBrowser, snapshot_async, find, click_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        snap = await snapshot_async(browser)
        button = find(snap, "role=button")
        if button:
            await click_async(browser, button.id)
Key changes:
- Import from predicate.async_api instead of predicate
- Use AsyncPredicateBrowser instead of PredicateBrowser
- Add await before I/O operations (goto, snapshot_async, click_async, etc.)
- Add the async keyword to function definitions
- Pure functions (find, query) don't need async
Next Steps
- Snapshot API - Learn about snapshot capture options
- Action API - Explore all available actions
- Query API - Master semantic element queries
- Browser Setup - Configure browser settings