NEW · Live Agent Monitor — the SOC for AI Agents

AUTONOMOUS AI SECURITY PLATFORM

Test every AI agent.Then watch them in production.

Offensive scanning + live runtime monitoring for LLMs and AI agents. 58,000+ attack techniques, 30+ live detectors, autonomous SOC triage.

Start Scanning pip install shieldpiView Leaderboard

SCAN-a3f2-9b1c-44d0│MODE: MODEL│VIA: OPENROUTER— — — — — — — —

TARGETClaude Opus 4.6

ENGAGED

67%

PAIR.encoding_bypass.base64██████████L4 EXPLOITED0.94

jailbreak.DAN_v11.roleplay██████████L3 EXPLOITED0.89

prompt_injection.system_leak██████░░░░L2 PARTIAL0.65

adv_perturb.token_smuggling░░░░░░░░░░L1 BLOCKED0.12

crescendo.multiturn.context░░░░░░░░░░QUEUED—

ATTACK TECHNIQUES

Three steps to full coverage

From configuration to actionable results in minutes.

STEP 01

CONFIGURE TARGET

Pick your scan mode — Browser for web apps, API for endpoints, Agent for autonomous systems, Model for direct LLM testing. ShieldPi fingerprints defenses and plans the attack strategy.

> ready

STEP 02

RUN ATTACK ENGINE

58,000+ techniques across 15 categories. Adaptive strategy selection, multi-turn conversation chains, authority escalation, encoding bypasses. Big Brain reasons about every response.

> ready

STEP 03

GET YOUR REPORT

ExploitDepth L1–L4 scoring. Compliance mappings across 9 frameworks. Remediation steps, kill-chain breakdowns, reproducer payloads. Exportable as PDF, CSV, JSON, Markdown.

> ready

SCAN MODES

Four ways to scan

Pick the mode that matches your stack — we handle the rest.

ACTIVE

Browser

Test web application interfaces via Playwright. Real browser rendering, DOM interaction, session handling.

Headless browser automation
Multi-turn conversation chains
Screenshot evidence capture
Cookie & session handling

> playwrightweb UI testing

ACTIVE

API

Scan REST/GraphQL endpoints directly. Supports OpenAI, Anthropic, Gemini, or custom formats.

OpenAI / Anthropic / Gemini / Custom
Bearer token & API key auth
Configurable model parameters
Rate-limit-aware testing

> httpsendpoint testing

ACTIVE

Agent

Test autonomous agents via GET/POST webhooks. Multi-turn attacks, tool abuse, plan injection, memory drift.

Tool & function call injection
Multi-turn conversation attacks
Memory drift exploitation
Plan injection & goal substitution

> webhookmulti-turn testing

POPULAR

Model

Scan any LLM via OpenRouter. Compare security posture across 20+ models. Public leaderboard rankings.

20+ models via OpenRouter
Standardized scoring rubric
Public leaderboard ranking
Full 58,000+ attack suite

> openrouterfull 58,000+ suite

ATTACK ENGINE

58,000+ techniques across 15 categories

Adaptive strategy selection. Real exploitation, not just benchmarks.

▸ TOP CATEGORIES

Top Attack Categories

Prompt Injection

847 techniques

Jailbreak

634 techniques

Data Exfiltration

412 techniques

Evasion

318 techniques

Agent Exploitation

256 techniques

Supply Chain

148 techniques

▸ EXPLOITATION STRATEGIES

Adaptive Strategies

TAP

Tree of Attacks with Pruning

PAIR

Prompt Automatic Iterative Refinement

Crescendo

Gradual escalation chains

Skeleton Key

Meta-instruction override

Many-Shot

High-volume in-context attacks

Authority

5-layer credential escalation

Encoding

8 encoder library bypass

Context Manip

Policy Puppetry templates

Adaptive Strategy Selection

Defense Fingerprinting

ExploitDepth L1-L4 Scoring

9 Compliance Frameworks

LIVE LEADERBOARD

The Living LLM Security Leaderboard

Real vulnerability scores for every major foundation model — updated weekly from the harvester pipeline.

Rank	Model	Score	Grade	Vulns	Status
#1	Claude Opus 4.5	93	A	12	HARDENED
#2	Claude Sonnet 4.6	91	A-	18	HARDENED
#3	GPT-4o	85	B+	31	MODERATE
#4	Gemini 1.5 Pro	82	B	44	MODERATE
#5	Mistral Large	77	B-	58	EXPOSED

UPDATED 4M AGO · WEEKLY SYNC

VIEW FULL LEADERBOARD →

COMPLIANCE

Mapped to every framework your auditors care about

Every finding ships with mapped compliance identifiers out of the box.

OWASP

OWASP Top 10 LLM

App Security · ✓ MAPPED

ATLAS

MITRE ATLAS

Adversarial AI · ✓ MAPPED

NIST

NIST AI RMF

Risk Management · ✓ MAPPED

EU AI Act

Regulation · ✓ MAPPED

SOC2

SOC 2 AI Addendum

Controls · ✓ MAPPED

ISO

ISO/IEC 42001

AI Management · ✓ MAPPED

CWE

CWE Taxonomy

Weakness Enumeration · ✓ MAPPED

PCI

PCI DSS AI

Payment Systems · ✓ MAPPED

HIPAA

HIPAA AI

Healthcare · ✓ MAPPED

CI/CD INTEGRATION

Ship it into your pipeline

One API call. Every commit. Fail the build when your AI regresses.

shieldpi-scan.yml

● LIVE

# .github/workflows/shieldpi-scan.yml
name: LLM Security Scan
on: [push, pull_request]
 
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - name: Kick off ShieldPi scan
        run: |
          SCAN=$(curl -sS -X POST \
            -H 'X-API-Key: ${{ secrets.SHIELDPI_KEY }}' \
            -H 'Content-Type: application/json' \
            -d '{"target_id":"${{ secrets.TARGET_ID }}","fail_threshold":"high"}' \
            https://api.shieldpi.io/api/ci/scan | jq -r .scan_id)
 
      - name: Pull report + guardrails bundle
        run: |
          curl -H 'X-API-Key: ${{ secrets.SHIELDPI_KEY }}' \
            -o bundle.zip \
            https: //api.shieldpi.io/api/scans/$SCAN/bundle

GitHub ActionsCI/CD

SUPPORTED ▸

GitLab CIPIPELINES

SUPPORTED ▸

JenkinsCI SERVER

SUPPORTED ▸

GET STARTED

Your AI has vulnerabilities.We find them before attackers do.

One scan. Five minutes. See exactly how your model fails.

Start your first scan

OPEN RESEARCH PROJECT · 58,000+ ATTACK TECHNIQUES · RESULTS IN MINUTES