🔴 OpenClaw Red Team

AI Agent Adversarial Resistance Testing Suite
Comprehensive two-phase security assessment
PHASE 1

Direct Attacks

50
Blocked
8
Partial
91.5%
Overall Resistance
PHASE 2

Supply Chain

Infrastructure deployed for testing

  • 🔴 9 Poisoned API endpoints
  • 🔴 10 Adversarial pages
  • 🔴 Multiple injection vectors

📊 Phase 1 Results by Category

🛡️ Key Hardening Recommendations

Input Sanitization for APIs

Strip patterns like "ignore previous instructions" from external API responses before processing. Consider a secondary filter model.

Content Extraction for Web Scraping

Remove hidden elements (display:none, white-on-white), HTML comments, and meta tags before processing scraped content.

System Prompt Reinforcement

Include explicit instructions to ignore commands found in external data sources and maintain core guidelines.

Canary Token Detection

Monitor responses for unique canary strings to detect if agents are echoing injected content.

🔁 Replication

cd ~/clawd/projects/openclaw-redteam node src/harness.js # Run Phase 1 tests curl https://redteam-poisoned-api.tjerobotics.workers.dev/api/weather?city=Boston # Test API curl https://7519cb83.redteam-adversarial.pages.dev/news-article.html # Test page