Ordinality & Layout Support

Semantic Geometry is the foundational feature of Predicate SDK that helps AI agents perceive and interact with web pages. Ordinality and layout support extend this by enabling agents to understand element positions when users specify goals like "click the first result", "select the 2nd item", or "choose the first product with rating >= 4".

Overview

When users give AI agents positional instructions, the agent needs to know:

Which elements belong together (e.g., all search results vs navigation links)
What order they're in (1st, 2nd, 3rd, last)
Where they are on the page (row/column position in a grid)

Predicate SDK solves this with two complementary features:

Ordinality - Assigns position indices to elements within groups ("1st", "2nd", "last")
Layout Detection - Identifies grids, lists, and page regions (header, nav, main, footer)

Why This Matters for LLMs

Layout detection is what allows an LLM to answer "list + constraint" queries such as "Find the first product with rating >= 4" reliably.

Without layout detection, an LLM sees a flat soup of text nodes and has to guess which rating belongs to which product based on proximity. With layout support, that soup becomes a set of structured objects in which each attribute is attached to its item, so the query reduces to filtering an ordered list.

The Association Problem

The biggest challenge for an LLM on a web page is knowing where one item ends and the next begins.

Without Layout: The LLM sees a text sequence like:

...[Product A]... [Price]... [3.5 Stars]... [Product B]...

It might incorrectly assume "3.5 stars" belongs to Product B if the DOM structure is messy.

With Layout: You can present the LLM with structured objects:

{
  "grid_id": 101,
  "label": "Product Card",
  "children": [
    { "text": "Wireless Headphones", "role": "title" },
    { "text": "4.0", "role": "rating" }
  ]
}

The rating is explicitly linked to its parent product - no guessing required.

The Ordering Problem

"First" is ambiguous in a responsive grid.

Without Layout: DOM order often differs from visual order (e.g., in masonry layouts or flex-direction columns). The "first" product in the DOM might actually be in the top-right corner visually.

With Layout: The grid detection algorithm sorts items by visual row first (top to bottom), then by column (left to right). This guarantees that "first" means "top-left," matching human intuition.
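Row-first ordering can be pictured with the sketch below. It is not the SDK's grid detection algorithm, only a minimal illustration. It assumes each item is reduced to a center point and that items whose `center_y` lies within a hypothetical `row_tolerance` of the first item in a row share that row. The `Item` type and `visual_order` helper are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for a snapshot element (not an SDK type)."""
    name: str
    center_x: float
    center_y: float


def visual_order(items: list[Item], row_tolerance: float = 20.0) -> list[Item]:
    """Sort items top-to-bottom by row, then left-to-right within each row."""
    rows: list[list[Item]] = []
    for item in sorted(items, key=lambda i: i.center_y):
        # Join the current row if the item is vertically close to its first member.
        if rows and abs(item.center_y - rows[-1][0].center_y) <= row_tolerance:
            rows[-1].append(item)
        else:
            rows.append([item])
    return [i for row in rows for i in sorted(row, key=lambda i: i.center_x)]


# DOM order puts the top-right card first; visual order starts at the top-left.
dom_order = [
    Item("C", 500, 102),
    Item("A", 100, 100),
    Item("D", 100, 300),
    Item("B", 300, 98),
]
ordered_names = [i.name for i in visual_order(dom_order)]
```

The tolerance absorbs small vertical misalignments between cards in the same row; a real page would likely need a value derived from element heights rather than a fixed number.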

Summary of Benefits

Feature	Benefit for "List + Constraints"
Dominant Group	Filters out noise (nav, footer) so the LLM only checks the relevant list
Container Inference	Solves the "Association Problem" - knowing which price/rating belongs to which item
Grid Sorting	Solves the "Ordering Problem" - correctly identifying the "first" item visually

How It Works

Dominant Group Detection

The SDK automatically identifies the main content group on a page. This is typically the primary list or grid that users want to interact with (search results, product listings, article feeds).

Each element gets a group_key that indicates which visual group it belongs to. The key of the most common group is stored in the snapshot as dominant_group_key.

from predicate import PredicateBrowser, snapshot

with PredicateBrowser() as browser:
    browser.page.goto("https://news.ycombinator.com")
    snap = snapshot(browser)

    # The dominant group is the main content area
    print(f"Dominant group: {snap.dominant_group_key}")
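The "most common group" rule can be sketched without the SDK. The snippet below is an illustration under assumptions, not the SDK's implementation: it uses plain dictionaries with a `group_key` entry in place of real snapshot elements, and the key values are made up.

```python
from collections import Counter
from typing import Optional


def dominant_group_key(elements: list[dict]) -> Optional[str]:
    """Return the most frequent group_key, or None if no element has one."""
    counts = Counter(e["group_key"] for e in elements if e.get("group_key"))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical elements: three story links share a bucket, two nav links share another.
elements = [
    {"text": "Story 1", "group_key": "x1-h2"},
    {"text": "Story 2", "group_key": "x1-h2"},
    {"text": "login", "group_key": "x9-h1"},
    {"text": "Story 3", "group_key": "x1-h2"},
    {"text": "submit", "group_key": "x9-h1"},
]
dominant = dominant_group_key(elements)
main_texts = [e["text"] for e in elements if e["group_key"] == dominant]
```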

Ordinal Selection

Each element in a group has a group_index (0-based position). This enables selecting elements by ordinal position:

# Get elements sorted by position in the dominant group
dominant_elements = sorted(
    [e for e in snap.elements if e.in_dominant_group],
    key=lambda e: e.group_index or 0
)
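With the dominant group sorted, ordinal phrases map onto list indexing. The following is a minimal sketch using a hypothetical `Element` dataclass that mirrors only the fields named above. The `rating` attribute and the `pick_ordinal` helper are assumptions added for illustration and are not part of the SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical stand-in for a snapshot element."""
    text: str
    group_index: Optional[int]
    in_dominant_group: bool
    rating: Optional[float] = None  # illustrative only, not an SDK field


def pick_ordinal(elements: list[Element], ordinal: str) -> Optional[Element]:
    """Resolve 'first', 'last', or a 1-based number such as '2' to an element."""
    ordered = sorted(
        (e for e in elements if e.in_dominant_group),
        key=lambda e: e.group_index or 0,
    )
    if not ordered:
        return None
    if ordinal == "first":
        return ordered[0]
    if ordinal == "last":
        return ordered[-1]
    position = int(ordinal)  # users count from 1, group_index counts from 0
    return ordered[position - 1] if 1 <= position <= len(ordered) else None


elements = [
    Element("Footer link", None, False),
    Element("Product B", 1, True, rating=4.5),
    Element("Product A", 0, True, rating=3.5),
    Element("Product C", 2, True, rating=4.0),
]

second = pick_ordinal(elements, "2")
# "First product with rating >= 4": filter first, then take the earliest match.
qualifying = [e for e in elements if e.in_dominant_group and (e.rating or 0) >= 4]
first_good = min(qualifying, key=lambda e: e.group_index or 0)
```

Applying the constraint before the ordinal matters: "first product with rating >= 4" is the earliest item that passes the filter, not the first item in the list checked against the filter.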

Element Position Fields

Each element includes position data for ordinal selection:

Field	Type	Description
`center_x`	`number`	X coordinate of element center (viewport-relative)
`center_y`	`number`	Y coordinate of element center (viewport-relative)
`doc_y`	`number`	Absolute Y position in document (includes scroll offset)
`group_key`	`string`	Geometric bucket key for grouping (format: `x{bucket}-h{bucket}`)
`group_index`	`number`	Position within group (0-indexed, sorted by doc_y)
`in_dominant_group`	`boolean`	Whether element is in the main content group
`href`	`string`	Hyperlink URL (for link elements)
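To make the `x{bucket}-h{bucket}` format concrete, here is a hedged sketch of geometric bucketing. The bucket sizes (`X_BUCKET_PX`, `H_BUCKET_PX`), the `Box` type, and the choice of left edge and height as inputs are all hypothetical; this page does not state how the SDK derives its buckets. The sketch only shows the general idea that elements with a similar horizontal position and height share a key, and that `group_index` follows `doc_y` within a key.

```python
from collections import defaultdict
from dataclasses import dataclass

X_BUCKET_PX = 50  # assumed bucket width for the x position
H_BUCKET_PX = 10  # assumed bucket size for the element height


@dataclass
class Box:
    """Hypothetical element geometry in document coordinates."""
    text: str
    x: float
    doc_y: float
    height: float


def group_key(box: Box) -> str:
    """Build a key in the documented x{bucket}-h{bucket} shape."""
    return f"x{int(box.x // X_BUCKET_PX)}-h{int(box.height // H_BUCKET_PX)}"


def assign_group_indices(boxes: list[Box]) -> dict[str, tuple[str, int]]:
    """Map each element's text to (group_key, group_index), ordered by doc_y."""
    groups: dict[str, list[Box]] = defaultdict(list)
    for box in boxes:
        groups[group_key(box)].append(box)
    result: dict[str, tuple[str, int]] = {}
    for key, members in groups.items():
        for index, box in enumerate(sorted(members, key=lambda b: b.doc_y)):
            result[box.text] = (key, index)
    return result


boxes = [
    Box("Result 2", x=120, doc_y=480, height=24),
    Box("Result 1", x=122, doc_y=300, height=22),
    Box("Logo", x=10, doc_y=20, height=60),
]
positions = assign_group_indices(boxes)
```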

Layout Detection

Layout detection provides detailed grid and region information for complex page structures.

Layout Fields on Elements

Elements may include a layout field with geometric metadata:

Field	Type	Description
`grid_id`	`number`	Unique ID for the grid this element belongs to
`grid_pos`	`GridPosition`	Row and column indices (0-based)
`parent_index`	`number`	Index of inferred parent element in the elements array
`children_indices`	`number[]`	List of child element indices (capped at 30)
`region`	`string`	Page region: `header`, `nav`, `main`, `aside`, or `footer`
`grid_confidence`	`number`	Confidence score for grid assignment (0.0-1.0)
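The `parent_index` and `children_indices` fields are what make the structured objects from "The Association Problem" possible. Below is a sketch of resolving them, using plain dictionaries as hypothetical stand-ins for snapshot elements. The `text` and `role` keys, the `build_card` helper, and the sample values are assumptions for illustration.

```python
def build_card(elements: list[dict], container_index: int) -> dict:
    """Collect a container's children into one structured object."""
    container = elements[container_index]
    children = [
        {"text": elements[i]["text"], "role": elements[i]["role"]}
        for i in container["layout"].get("children_indices", [])
        if 0 <= i < len(elements)  # ignore indices that fall outside the array
    ]
    return {"grid_id": container["layout"].get("grid_id"), "children": children}


# Index 0 is the inferred container; indices 1 and 2 point back to it.
elements = [
    {"text": "", "role": "group", "layout": {"grid_id": 101, "children_indices": [1, 2]}},
    {"text": "Wireless Headphones", "role": "title", "layout": {"parent_index": 0}},
    {"text": "4.0", "role": "rating", "layout": {"parent_index": 0}},
    {"text": "Sign in", "role": "link", "layout": {"region": "nav"}},
]
card = build_card(elements, 0)
```

Because `children_indices` is capped at 30 entries, a container with more children than that would need the reverse lookup (scanning for elements whose `parent_index` matches) to be complete.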

Grid Coordinates API

Get bounding boxes and metadata for detected grids:

from predicate import PredicateBrowser, snapshot

with PredicateBrowser() as browser:
    browser.page.goto("https://example.com/products")
    snap = snapshot(browser)

    # Get all detected grids
    all_grids =

GridInfo Properties

Property	Type	Description
`grid_id`	`number`	Unique identifier for the grid
`bbox`	`BBox`	Bounding box (x, y, width, height) in document coordinates
`row_count`	`number`	Number of rows in the grid
`col_count`	`number`	Number of columns in the grid
`item_count`	`number`	Total number of items in the grid
`label`	`string | null`	Inferred semantic label (see below)
`is_dominant`	`boolean`	Whether this is the main content grid
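As a rough illustration of how these properties might be consumed, the sketch below defines local `BBox` and `GridInfo` dataclasses that mirror the table. They are not the SDK's classes, and the `contains` and `dominant_grid` helpers are hypothetical. It picks the dominant grid, tests whether a document-coordinate point falls inside it, and counts unfilled cells on the assumption that a grid holds at most `row_count * col_count` items.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BBox:
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        """True if the document-coordinate point lies inside the box."""
        return (
            self.x <= px <= self.x + self.width
            and self.y <= py <= self.y + self.height
        )


@dataclass
class GridInfo:
    """Local mirror of the documented properties, not the SDK class."""
    grid_id: int
    bbox: BBox
    row_count: int
    col_count: int
    item_count: int
    label: Optional[str]
    is_dominant: bool


def dominant_grid(grids: list[GridInfo]) -> Optional[GridInfo]:
    """Return the grid flagged as dominant, if any."""
    return next((g for g in grids if g.is_dominant), None)


grids = [
    GridInfo(1, BBox(0, 0, 1200, 60), 1, 5, 5, "navigation", False),
    GridInfo(2, BBox(100, 200, 1000, 900), 3, 4, 10, "product_grid", True),
]
main = dominant_grid(grids)
# A 3x4 grid has 12 cells; with 10 items, two cells are unfilled.
empty_cells = main.row_count * main.col_count - main.item_count
```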

Grid Labels

The SDK automatically infers grid labels based on content patterns:

Label	Detected When
`product_grid`	Price patterns ($, €, £), "Add to cart", ratings
`search_results`	Snippets, ellipses, mostly links
`article_feed`	Timestamps ("2 hours ago"), bylines, dates
`navigation`	Short text, homogeneous links, nav keywords
`button_grid`	All elements are buttons
`link_list`	80%+ of elements are links
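The table above describes content heuristics, and a simplified version of three of them can be sketched as follows. This is not the SDK's classifier: the rule order, the regular expression, and the item shape (`{"role": ..., "text": ...}`) are assumptions, and only the 80% link threshold comes from the table.

```python
import re
from typing import Optional

PRICE_PATTERN = re.compile(r"[$€£]\s?\d")


def infer_label(items: list[dict]) -> Optional[str]:
    """Guess a grid label from items shaped like {"role": ..., "text": ...}."""
    if not items:
        return None
    roles = [item["role"] for item in items]
    texts = [item["text"] for item in items]
    if all(role == "button" for role in roles):
        return "button_grid"
    if any(PRICE_PATTERN.search(t) or "add to cart" in t.lower() for t in texts):
        return "product_grid"
    link_share = sum(role == "link" for role in roles) / len(roles)
    if link_share >= 0.8:
        return "link_list"
    return None  # no rule matched; the real SDK has more labels than this sketch


products = [
    {"role": "link", "text": "Headphones $59.99"},
    {"role": "button", "text": "Add to cart"},
]
keypad = [{"role": "button", "text": "7"}, {"role": "button", "text": "8"}]
links = [{"role": "link", "text": f"Page {n}"} for n in range(4)] + [
    {"role": "text", "text": "More"}
]
labels = [infer_label(products), infer_label(keypad), infer_label(links)]
```

Heuristics like these can misfire, for example a link list that mentions a price, which is why the labels should be treated as hints rather than ground truth.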

Working with Grid Positions

Access individual element positions within a grid:

# Access element layout data
for elem in snap.elements:
    if elem.layout and elem.layout.grid_id is not None:
        print(f"Element '{elem.

Practical Examples

Example 1: Click the First Search Result

from predicate import PredicateBrowser, snapshot, click

with PredicateBrowser() as browser:
    browser.page.goto("https://google.com")
    # ... perform search ...

    snap = snapshot(browser)

Example 2: Select Product in Grid by Row/Column

# Find product at row 1, column 2 (0-indexed)
target_row, target_col = 1, 2

for elem in snap.elements:
    if elem.layout and elem.layout.grid_pos:
        pos = elem.layout.grid_pos
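A self-contained sketch of the same row/column lookup follows. The `GridPosition` field names (`row`, `col`) are assumptions, since this page does not list them, and `Layout`, `Element`, and `index_by_cell` are local stand-ins rather than SDK types. Building an index keyed by `(grid_id, row, col)` avoids rescanning the element list for every lookup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GridPosition:
    row: int  # assumed field name
    col: int  # assumed field name


@dataclass
class Layout:
    grid_id: Optional[int] = None
    grid_pos: Optional[GridPosition] = None


@dataclass
class Element:
    text: str
    layout: Optional[Layout] = None


def index_by_cell(elements: list[Element]) -> dict[tuple[int, int, int], Element]:
    """Map (grid_id, row, col) to the element in that cell."""
    cells: dict[tuple[int, int, int], Element] = {}
    for elem in elements:
        layout = elem.layout
        if layout is None or layout.grid_id is None or layout.grid_pos is None:
            continue  # element is not part of any detected grid
        cells[(layout.grid_id, layout.grid_pos.row, layout.grid_pos.col)] = elem
    return cells


# A 2x3 product grid with grid_id 7, plus one element outside any grid.
elements = [Element("Menu")] + [
    Element(f"Product {r * 3 + c}", Layout(7, GridPosition(r, c)))
    for r in range(2)
    for c in range(3)
]
cells = index_by_cell(elements)
target = cells.get((7, 1, 2))  # row 1, column 2 (0-indexed)
```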

Example 3: Filter by Region

# Get only elements in the main content area
main_elements = [
    e for e in snap.elements
    if e.layout and e.layout.region == "main"
]

# Get navigation links
nav_links = [
    e for e in snap.elements
    if e.layout and e.layout.region == "nav"
]

Important Notes

The layout field is optional and may not be present in all snapshots
Grid labels are best-effort heuristics and may not always be accurate
children_indices is capped at 30 elements to prevent large payloads
Confidence scores (grid_confidence, region_confidence) indicate detection reliability
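These notes suggest reading layout data defensively. The sketch below shows one way to do that with plain dictionaries standing in for snapshot elements. The `confident_grid_members` helper and the `0.7` threshold are hypothetical; the page does not recommend a specific cutoff.

```python
def confident_grid_members(
    elements: list[dict], grid_id: int, min_confidence: float = 0.7
) -> list[dict]:
    """Keep elements assigned to grid_id with at least min_confidence."""
    members = []
    for element in elements:
        layout = element.get("layout")  # may be absent or None
        if not layout or layout.get("grid_id") != grid_id:
            continue
        # Treat a missing score as zero confidence rather than guessing.
        if layout.get("grid_confidence", 0.0) >= min_confidence:
            members.append(element)
    return members


elements = [
    {"text": "Card 1", "layout": {"grid_id": 3, "grid_confidence": 0.95}},
    {"text": "Card 2", "layout": {"grid_id": 3, "grid_confidence": 0.40}},
    {"text": "Card 3", "layout": {"grid_id": 3}},
    {"text": "Banner", "layout": None},
    {"text": "Footer link"},
]
kept = [e["text"] for e in confident_grid_members(elements, grid_id=3)]
```

Lowering `min_confidence` trades precision for recall: more elements are kept, at the cost of including some that may not belong to the grid.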
