Test every AI agent. Then watch them in production.
Offensive scanning + live runtime monitoring for LLMs and AI agents. 58,000+ attack techniques, 30+ live detectors, autonomous SOC triage.
HOW IT WORKS
Three steps to full coverage
From configuration to actionable results in minutes.
CONFIGURE TARGET
Pick your scan mode — Browser for web apps, API for endpoints, Agent for autonomous systems, Model for direct LLM testing. ShieldPi fingerprints defenses and plans the attack strategy.
> ready
RUN ATTACK ENGINE
58,000+ techniques across 15 categories. Adaptive strategy selection, multi-turn conversation chains, authority escalation, encoding bypasses. Big Brain reasons about every response.
> ready
GET YOUR REPORT
ExploitDepth L1–L4 scoring. Compliance mappings across 9 frameworks. Remediation steps, kill-chain breakdowns, reproducer payloads. Exportable as PDF, CSV, JSON, Markdown.
> ready
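The three steps map onto a small set of API calls. A hedged sketch follows, using the endpoints visible in the CI/CD example on this page (`/api/ci/scan` to kick off a scan, `/api/scans/<id>/bundle` to fetch results); the payload fields beyond `target_id` and `fail_threshold` and the placeholder target id are assumptions, not a documented schema:

```shell
# Hedged sketch of the three steps as API calls. Endpoints come from the
# CI/CD example on this page; "tgt_example" is a placeholder target id.
API=https://api.shieldpi.io

# 1. Configure target: the request body names the target and the failure policy.
PAYLOAD='{"target_id":"tgt_example","fail_threshold":"high"}'

# 2. Run the attack engine (needs a real API key, so shown commented out):
# SCAN=$(curl -sS -X POST -H "X-API-Key: $SHIELDPI_KEY" \
#   -H 'Content-Type: application/json' -d "$PAYLOAD" \
#   "$API/api/ci/scan" | jq -r .scan_id)

# 3. Get your report bundle once the scan completes:
# curl -H "X-API-Key: $SHIELDPI_KEY" -o bundle.zip "$API/api/scans/$SCAN/bundle"

echo "$PAYLOAD"
```

The same two endpoints power the CI/CD integration further down the page; only the trigger differs.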
SCAN MODES
Four ways to scan
Pick the mode that matches your stack — we handle the rest.
Browser
Test web application interfaces via Playwright. Real browser rendering, DOM interaction, session handling.
- Headless browser automation
- Multi-turn conversation chains
- Screenshot evidence capture
- Cookie & session handling
API
Scan REST/GraphQL endpoints directly. Supports OpenAI, Anthropic, Gemini, or custom formats.
- OpenAI / Anthropic / Gemini / Custom
- Bearer token & API key auth
- Configurable model parameters
- Rate-limit-aware testing
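An API-mode target configuration covering the features above might look like the following. The field names are illustrative assumptions, not a documented schema:

```yaml
# Hypothetical API-mode target config (field names are assumptions)
target:
  mode: api
  provider: openai            # openai | anthropic | gemini | custom
  endpoint: https://api.example.com/v1/chat/completions
  auth:
    type: bearer              # bearer token or API-key auth
    token_env: TARGET_API_KEY # read the secret from the environment
  model_params:
    model: gpt-4o
    temperature: 0.7
  rate_limit:
    requests_per_minute: 60   # scanner backs off to respect target limits
```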
Agent
Test autonomous agents via GET/POST webhooks. Multi-turn attacks, tool abuse, plan injection, memory drift.
- Tool & function call injection
- Multi-turn conversation attacks
- Memory drift exploitation
- Plan injection & goal substitution
Model
Scan any LLM via OpenRouter. Compare security posture across 20+ models. Public leaderboard rankings.
- 20+ models via OpenRouter
- Standardized scoring rubric
- Public leaderboard ranking
- Full 58,000+ attack suite
ATTACK ENGINE
58,000+ techniques across 15 categories
Adaptive strategy selection. Real exploitation, not just benchmarks.
▸ TOP CATEGORIES
Top Attack Categories
▸ EXPLOITATION STRATEGIES
Adaptive Strategies
LIVE LEADERBOARD
The Living LLM Security Leaderboard
Real vulnerability scores for every major foundation model — updated weekly from the harvester pipeline.
| Rank | Model | Score | Grade | Vulns | Status |
|---|---|---|---|---|---|
| #1 | Claude Opus 4.5 | 93 | A | 12 | HARDENED |
| #2 | Claude Sonnet 4.6 | 91 | A- | 18 | HARDENED |
| #3 | GPT-4o | 85 | B+ | 31 | MODERATE |
| #4 | Gemini 1.5 Pro | 82 | B | 44 | MODERATE |
| #5 | Mistral Large | 77 | B- | 58 | EXPOSED |
UPDATED 4M AGO · WEEKLY SYNC
VIEW FULL LEADERBOARD →
COMPLIANCE
Mapped to every framework your auditors care about
Every finding ships with mapped compliance identifiers out of the box.
CI/CD INTEGRATION
Ship it into your pipeline
One API call. Every commit. Fail the build when your AI regresses.
```yaml
# .github/workflows/shieldpi-scan.yml
name: LLM Security Scan
on: [push, pull_request]

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - name: Kick off ShieldPi scan
        run: |
          SCAN=$(curl -sS -X POST \
            -H 'X-API-Key: ${{ secrets.SHIELDPI_KEY }}' \
            -H 'Content-Type: application/json' \
            -d '{"target_id":"${{ secrets.TARGET_ID }}","fail_threshold":"high"}' \
            https://api.shieldpi.io/api/ci/scan | jq -r .scan_id)
          # Persist the scan id so the next step (a separate shell) can see it
          echo "SCAN=$SCAN" >> "$GITHUB_ENV"
      - name: Pull report + guardrails bundle
        run: |
          curl -H 'X-API-Key: ${{ secrets.SHIELDPI_KEY }}' \
            -o bundle.zip \
            "https://api.shieldpi.io/api/scans/$SCAN/bundle"
```

GET STARTED
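To fail the build, a follow-up step could inspect the downloaded report for high-severity findings. This is a minimal sketch: the report schema shown here (a `findings` array with a `severity` field) is an assumption, not the documented bundle format, and a sample report is written locally to stand in for the real download:

```shell
# Write a sample report standing in for the real downloaded bundle.
# The schema (findings[].severity) is an assumption for illustration.
cat > report.json <<'EOF'
{"scan_id":"demo","findings":[{"severity":"high","category":"prompt-injection"}]}
EOF

# Gate the build: any high-severity finding fails the step.
if grep -q '"severity":"high"' report.json; then
  echo "high-severity findings present: failing the build"
else
  echo "no high-severity findings"
fi
```

In a real pipeline the `if` branch would `exit 1` so the workflow step fails and blocks the merge.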
Your AI has vulnerabilities. We find them before attackers do.
One scan. Five minutes. See exactly how your model fails.
Start your first scan
OPEN RESEARCH PROJECT · 58,000+ ATTACK TECHNIQUES · RESULTS IN MINUTES