Debugging AI Agent Runs with Failure Artifacts
Learn how to use visual failure artifacts to debug your AI agents when traces alone aren't enough.
Overview
Failure artifacts provide cinematic playback of exactly what went wrong when your AI agent encounters assertion failures. Unlike static screenshots, artifacts show the complete sequence of events leading to failure, making it easier to understand complex timing issues, visual bugs, and unexpected page behavior.
What Are Failure Artifacts?
Failure artifacts are visual evidence captured automatically when your AI agent hits an assertion failure. They record the moments leading up to the failure, so you can debug issues that a single static screenshot would miss.
Available Artifacts
- Video Clips (MP4) - Smooth playback of the failure sequence (when ffmpeg is available)
- Frame Gallery (JPEG/PNG) - Individual screenshots from the failure buffer
- Metadata Files - JSON files with step details, timestamps, and failure context
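The schema of the metadata files is not documented on this page. As a rough sketch only, assuming a JSON document with hypothetical fields such as `assertion_message`, `failed_at`, and `steps`, a small script could summarize a downloaded metadata file like this:

```python
import json

# Hypothetical metadata payload; the real field names may differ.
SAMPLE_METADATA = """
{
  "assertion_message": "Expected success banner to be visible",
  "failed_at": "2024-01-01T12:00:15Z",
  "buffer_seconds": 15,
  "frame_count": 15,
  "steps": [
    {"index": 1, "action": "type", "status": "success"},
    {"index": 2, "action": "click", "status": "failure"}
  ]
}
"""

def summarize_metadata(raw: str) -> dict:
    """Return a small summary of a failure-artifact metadata document."""
    data = json.loads(raw)
    steps = data.get("steps", [])
    # Collect the indexes of steps marked as failed.
    failed = [s["index"] for s in steps if s.get("status") == "failure"]
    return {
        "reason": data.get("assertion_message", "unknown"),
        "failed_at": data.get("failed_at"),
        "failed_steps": failed,
        "step_count": len(steps),
    }

summary = summarize_metadata(SAMPLE_METADATA)
print(summary)
```

Check the field names against a real metadata file before relying on a script like this.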
When Artifacts Are Created
- Only on assertion failures, not on every failed step (an action failure alone does not trigger capture)
- Captures the 15 seconds leading up to the failure (configurable)
- Automatically uploaded to cloud storage and linked to your trace
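To make the "seconds leading up to the failure" behavior concrete, here is a minimal sketch of a rolling frame buffer. It illustrates the general idea and is not the SDK's actual implementation; the `FrameBuffer` class and its methods are hypothetical names.

```python
from collections import deque

class FrameBuffer:
    """Keep only frames captured within the last `window_seconds`."""

    def __init__(self, window_seconds: float = 15.0):
        self.window_seconds = window_seconds
        self._frames = deque()  # (timestamp, frame) pairs, oldest first

    def add(self, timestamp: float, frame: bytes) -> None:
        self._frames.append((timestamp, frame))
        # Drop frames that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._frames and self._frames[0][0] < cutoff:
            self._frames.popleft()

    def snapshot(self) -> list:
        """Frames that would be persisted when an assertion fails."""
        return list(self._frames)

buf = FrameBuffer(window_seconds=15.0)
for t in range(0, 31):  # one frame per second for 31 seconds
    buf.add(float(t), b"frame-%d" % t)
kept = [ts for ts, _ in buf.snapshot()]
print(kept[0], kept[-1], len(kept))
```

Because old frames are discarded continuously, only the most recent window is available at the moment of failure. This is why artifacts show what happened before the failure rather than after it.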
Accessing Artifacts
Finding Artifacts in Predicate Studio
On the Trace List Page
Failed trace runs show an artifact indicator when visual evidence is available:
- Artifact Badge: 🎬 icon indicates artifacts are available
- Hover Details: Shows what was captured (for example, "Video + 15 Frames")
In the Trace Debugger
Once you open a failed trace, artifacts appear in multiple places:
- Header Badge: Shows artifact count in the run header
- Timeline Markers: Failed steps have camera icons
- Detail Panel Tab: "ARTIFACTS" tab appears when available
Opening the Artifacts Panel
Method 1: From the Header
- Look for the artifact badge in the run header
- Click it to open the artifacts viewer
Method 2: From Failed Steps
- On the timeline, find steps marked with red camera icons
- Click the camera icon to jump to artifacts for that failure
Method 3: From Detail Panel
- Open any failed step's detail panel
- Click the "ARTIFACTS" tab
Understanding the Artifacts Interface
When you open artifacts, you'll see a two-panel layout designed for efficient debugging:
Left Panel: Visual Evidence
The main area shows your visual debugging tools:
Video Player (when available):
- HTML5 video player with custom controls
- Speed control (0.25x to 4x) for detailed analysis
- Timeline scrubbing and frame-by-frame stepping
- Download button for offline viewing
Frame Gallery (fallback or alternative view):
- Thumbnail grid of all captured frames
- Click any frame for full-screen viewing
- Batch download option for all frames
Right Panel: Failure Context
The metadata panel provides essential context:
Failure Summary:
- Why the failure occurred (assertion message)
- When the failure happened (timestamp)
- Technical details (buffer duration, frame count)
Step Timeline:
- Chronological breakdown of actions before failure
- Color-coded success/failure indicators
- Links back to the main trace debugger
Step-by-Step Debugging with Artifacts
Step 1: Identify Failed Runs with Artifacts
Start by finding traces that have visual evidence available:
- Look for the artifact badge (🎬) on failed trace cards
- Filter by status to focus on failed runs
- Check recent runs for the most relevant failures
Step 2: Quick Video Assessment
Get the big picture with cinematic playback:
- Open the failed trace in the debugger
- Click the artifact badge in the header
- Play the video at normal speed (1x) first
- Note the exact moment when things go wrong
Pro Tip: Use 0.5x speed for complex sequences where timing matters.
Step 3: Detailed Frame Analysis
Dive deep with pixel-perfect inspection:
- Switch to frame gallery for precise analysis
- Enable diff overlays to see what changed
- Zoom in on error messages or unexpected elements
- Compare multiple frames to understand the sequence
When to use frames over video:
- You need pixel-perfect comparison
- Large video files are slow to load
- You want to focus on specific moments
- You are preparing documentation or a bug report
Step 4: Correlate with Trace Data
Connect visual evidence with agent behavior:
- Use timeline synchronization to link video to trace steps
- Check the LLM tab - what was the agent "thinking" at failure?
- Review verification signals - which assertions failed?
- Compare with action details - did execution match expectations?
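If you are working with downloaded artifacts outside Studio, the same correlation can be approximated by hand. The sketch below assumes you have each step's start offset in seconds relative to the start of the clip; the offsets and step names are made-up values for illustration.

```python
from bisect import bisect_right

# Hypothetical step start offsets (seconds from the start of the clip).
step_starts = [0.0, 2.5, 6.0, 11.2]
step_names = ["open page", "type email", "click submit", "assert_done"]

def step_at(offset_seconds: float) -> str:
    """Return the step that was running at a given video offset."""
    # Index of the last step whose start is <= the offset.
    i = bisect_right(step_starts, offset_seconds) - 1
    return step_names[max(i, 0)]

print(step_at(7.0))
```

Pausing the video at the moment things go wrong and looking up that offset tells you which trace step to inspect first.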
Step 5: Identify Root Causes
Use the combined evidence to find the real issue:
Common Failure Patterns
Pattern 1: Wrong Element Selection
Video Evidence: Agent clicks a different button than expected
Trace Correlation: LLM chose element #42, but the intended target was #38
Solution: Improve element descriptions or selectors
Pattern 2: Timing Issues
Video Evidence: Action fires before page finishes loading
Trace Correlation: Step duration shows 50 ms of execution, but the page needed 2 s to finish loading
Solution: Add proper wait conditions or delays
Pattern 3: Form Validation Problems
Video Evidence: Agent fills the form and clicks submit, then a validation error appears
Trace Correlation: Verification expected success message
Solution: Update assertion logic for validation states
Pattern 4: Dynamic Content Changes
Video Evidence: Page layout shifts during interaction
Trace Correlation: Element becomes hidden after scroll
Solution: Use stable selectors or wait for stability
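For timing issues like Pattern 2, a wait condition usually means polling for a state rather than sleeping for a fixed time. The helper below is a generic sketch, not part of any SDK; `wait_until` is a hypothetical name, and the "page loaded" check is simulated with a timer.

```python
import time
from typing import Callable

def wait_until(condition: Callable[[], bool], timeout: float = 5.0,
               interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Stand-in for "page finished loading": becomes true after about 0.3 s.
ready_at = time.monotonic() + 0.3
loaded = wait_until(lambda: time.monotonic() >= ready_at, timeout=2.0)
print("loaded:", loaded)
```

Polling returns as soon as the condition holds, so fast pages are not slowed down the way they would be by a fixed delay, while slow pages still get the full timeout.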
Advanced Artifact Analysis
Video Playback Techniques
Speed Control Strategies:
- 0.25x - 0.5x: For analyzing rapid sequences or animations
- 1x: Normal review of the failure flow
- 2x - 4x: Fast-forward through uneventful periods
Timeline Navigation:
- Scrub through time: Drag the progress bar to jump to key moments
- Frame stepping: Use arrow keys for precise frame-by-frame analysis
- Loop playback: Enable loop for studying repetitive behaviors
Frame Gallery Workflows
Comparative Analysis:
- Open two frames side-by-side in different browser tabs
- Use diff mode to highlight pixel-level changes
- Zoom to 200%+ for detailed element inspection
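How diff mode is implemented in Studio is not described here. For downloaded frames, a comparable offline check can be sketched as a per-pixel comparison. The example assumes both frames have already been decoded into rows of `(r, g, b)` tuples (decoding JPEG/PNG files is not shown), and `diff_pixels` is a hypothetical helper.

```python
def diff_pixels(frame_a, frame_b, threshold=0):
    """Return (x, y) coordinates where two equally sized frames differ.

    Frames are lists of rows; each pixel is an (r, g, b) tuple.
    """
    if len(frame_a) != len(frame_b) or any(
        len(ra) != len(rb) for ra, rb in zip(frame_a, frame_b)
    ):
        raise ValueError("frames must have the same dimensions")
    changed = []
    for y, (row_a, row_b) in enumerate(zip(frame_a, frame_b)):
        for x, (pa, pb) in enumerate(zip(row_a, row_b)):
            # Largest per-channel difference for this pixel.
            if max(abs(ca - cb) for ca, cb in zip(pa, pb)) > threshold:
                changed.append((x, y))
    return changed

WHITE, RED = (255, 255, 255), (255, 0, 0)
before = [[WHITE, WHITE], [WHITE, WHITE]]
after = [[WHITE, WHITE], [WHITE, RED]]
changed = diff_pixels(before, after)
print(changed)
```

With JPEG frames, raising `threshold` above zero helps ignore compression noise that would otherwise be reported as changes.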
Pattern Recognition:
- Scan through frames chronologically looking for anomalies
- Note timestamps of important state changes
- Create frame sequences for documentation
Metadata Correlation
Step Timeline Analysis:
- Color coding: Green = success, Yellow = warning, Red = failure
- Duration patterns: Identify slow or fast operations
- URL transitions: Track navigation and page changes
Failure Context:
- Assertion details: What exactly was expected vs. what happened
- Buffer settings: How much history was captured
- Technical metadata: Frame rates, compression, file sizes
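One way to spot duration patterns is to compare each step against the median duration of the run. The sketch below uses made-up step names and durations, and the factor of 3 is an arbitrary assumption to tune for your own runs.

```python
from statistics import median

# Hypothetical (step name, duration in ms) pairs from a step timeline.
durations = [("open page", 1200), ("type email", 300), ("click submit", 50),
             ("wait for banner", 4000), ("assert_done", 280)]

def flag_outliers(steps, factor=3.0):
    """Split steps into unusually fast and unusually slow vs. the median."""
    mid = median(d for _, d in steps)
    fast = [name for name, d in steps if d * factor < mid]
    slow = [name for name, d in steps if d > mid * factor]
    return fast, slow

fast, slow = flag_outliers(durations)
print("fast:", fast)
print("slow:", slow)
```

An unusually fast step can be as informative as a slow one: a click that completes in a few milliseconds may have fired before the page was ready.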
Real-World Debugging Scenarios
Scenario 1: Form Submission Failure
Problem: Agent fills out a form and clicks submit, but the form shows validation errors instead of proceeding.
Artifact Analysis:
- Watch the video: See the exact sequence of typing and clicking
- Check form fields: Verify all required fields were filled
- Look for validation timing: Did errors appear immediately or after submission?
- Compare with trace: Check if verification expected the right success indicators
Common Causes:
- Required fields missed by the agent
- Validation happens on blur, not submit
- JavaScript errors prevent submission
- Network issues during form processing
Scenario 2: Navigation Timing Issues
Problem: Agent clicks a link, but the page doesn't navigate as expected.
Artifact Analysis:
- Slow down playback: Use 0.5x to see the click and response
- Check for loading states: Look for spinners or "Loading..." text
- Watch URL changes: See if navigation starts but gets interrupted
- Compare with verification: Check if URL change detection worked
Common Causes:
- Single-page app navigation (URL doesn't change)
- Modal dialogs blocking navigation
- Network timeouts or slow responses
- JavaScript preventing default link behavior
Scenario 3: Element Interaction Problems
Problem: Agent tries to click a button, but the interaction doesn't work.
Artifact Analysis:
- Zoom in on the target element: Verify it's actually clickable
- Check element state: Is it disabled, hidden, or covered?
- Watch for page changes: Does the element move or change before click?
- Compare bounding boxes: Did the agent target the right area?
Common Causes:
- Element not visible or clickable
- Dynamic content loading changing the DOM
- Overlapping elements blocking interaction
- Animation or transition interfering with timing
Best Practices for Artifact Debugging
Workflow Optimization
- Start with video overview - Get the complete story before diving into details
- Use appropriate playback speed - Slow down for rapid sequences, speed up through uneventful stretches
- Take notes on key moments - Document important timestamps
- Correlate with trace data - Don't analyze artifacts in isolation
Team Collaboration
- Share specific video segments - Link to exact failure moments
- Export key frames - Include in bug reports and documentation
- Document findings - Combine artifact evidence with trace analysis
- Create reproducible test cases - Use artifacts to understand failure conditions
Performance Considerations
- Preload artifacts - Let them load in the background while you work
- Use frame gallery for quick checks - Faster than loading full videos
- Close unused tabs - Free up memory for large artifact sets
- Download for offline analysis - Work with local copies for intensive debugging
Troubleshooting Artifacts
Common Issues
"No artifacts available for this run"
- Artifacts only capture on assertion failures, not action failures
- Verify your agent uses verification functions (assert_done(), etc.)
- Check that failure artifacts are enabled in SDK configuration
"Video won't play or loads slowly"
- Large video files may take time on slow connections
- Try refreshing the page or clearing browser cache
- Fall back to frame gallery for faster analysis
- Download video locally for better performance
"Frames don't show the failure moment"
- Artifacts capture the 15 seconds (by default) before the failure, not after it
- The failure moment itself may not be visually apparent
- Check the metadata panel for exact failure timing
- Use trace correlation to understand what happened at failure
"Can't find the ARTIFACTS tab"
- Tab only appears when artifacts are available for that run
- Make sure you're looking at a failed run with the artifact badge
- Try refreshing the page if the tab doesn't appear
Performance Tips
- Close other browser tabs when working with large artifact sets
- Use frame gallery instead of video for quick checks
- Download artifacts locally for intensive analysis sessions
- Clear browser cache if videos consistently fail to load
Next Steps
- Learn the Basics: Return to Predicate Studio Overview →
- Trace Analysis: Check out Detail Panel →
- Enable Artifacts: Set up failure artifact capture →
- Need Help?: Visit Troubleshooting →