# Advanced Agent Verifier

## ⚡*<mark style="color:purple;">Responsible</mark>*. We *<mark style="color:purple;">Care</mark>*.

{% embed url="<https://climate.stripe.com/6SV9BA>" %}

{% content-ref url="testing" %}
[testing](https://docs.lisaiceland.com/platform+/active-development/testing)
{% endcontent-ref %}

{% content-ref url="../../smarter-ai-learn-more/ai-safety+/bias-protections" %}
[bias-protections](https://docs.lisaiceland.com/smarter-ai-learn-more/ai-safety+/bias-protections)
{% endcontent-ref %}

{% content-ref url="../../smarter-ai-learn-more/ai-safety+/guardrails+/ai-safety-guardrails" %}
[ai-safety-guardrails](https://docs.lisaiceland.com/smarter-ai-learn-more/ai-safety+/guardrails+/ai-safety-guardrails)
{% endcontent-ref %}

{% content-ref url="human-in-the-loop" %}
[human-in-the-loop](https://docs.lisaiceland.com/platform+/active-development/human-in-the-loop)
{% endcontent-ref %}

![Security](https://img.shields.io/badge/agent--verifier-implemented-blue)

Last updated: March 24, 2026.

[![Tests](https://camo.githubusercontent.com/26641b5a70dea0526ad84e92b8d1dea013f3682c187ef1cac1ac09685e2c31e2/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f74657374732d39373725323070617373696e672d627269676874677265656e)](https://camo.githubusercontent.com/26641b5a70dea0526ad84e92b8d1dea013f3682c187ef1cac1ac09685e2c31e2/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f74657374732d39373725323070617373696e672d627269676874677265656e) [![Vitest](https://camo.githubusercontent.com/75fde65290dbfc7dd2b52c4aa25a9f069d8f342dd65f96e40ddf31bab03b69bc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f746573746564253230776974682d7669746573742d364539463138)](https://camo.githubusercontent.com/75fde65290dbfc7dd2b52c4aa25a9f069d8f342dd65f96e40ddf31bab03b69bc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f746573746564253230776974682d7669746573742d364539463138) [![Languages](https://camo.githubusercontent.com/526cf55b8be703ab2d413b92d1ccf65a837f02413ed34f9f3015fc0e07161bf8/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c616e6775616765732d33372d626c7565)](https://camo.githubusercontent.com/526cf55b8be703ab2d413b92d1ccf65a837f02413ed34f9f3015fc0e07161bf8/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c616e6775616765732d33372d626c7565) [![BYOK Providers](https://camo.githubusercontent.com/991e44d30e63e00f5d26eb53658c85234ebfbfad6a72f2854fddb9ebd5ba80d3/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f42594f4b25323070726f7669646572732d31382d6f72616e6765)](https://camo.githubusercontent.com/991e44d30e63e00f5d26eb53658c85234ebfbfad6a72f2854fddb9ebd5ba80d3/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f42594f4b25323070726f7669646572732d31382d6f72616e6765) 
[![Security](https://camo.githubusercontent.com/9848248df8f878d8a375f7a0993b27219c2ed5c209d9a2442e6001064201a7cc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f73656375726974792d68617264656e65642d637269746963616c)](https://camo.githubusercontent.com/9848248df8f878d8a375f7a0993b27219c2ed5c209d9a2442e6001064201a7cc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f73656375726974792d68617264656e65642d637269746963616c)

![Moat](https://docs.lisaiceland.com/~gitbook/image?url=https%3A%2F%2Fcamo.githubusercontent.com%2Fcf9284fb15978bad5057ded6dd214f81c456978e71e1a84894a7c1c0203e94db%2F68747470733a2f2f696d672e736869656c64732e696f2f62616467652f636f6d70657469746976652532306d6f61742d3530253230646966666572656e746961746f72732d707572706c65\&width=300\&dpr=3\&quality=100\&sign=f6cd6274\&sv=2) [![Shipped](https://camo.githubusercontent.com/c6d47a185d7feee3e89913188f3f3f27c0dcd3c37348c167e7c76818d565e5d4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f7368697070656425323066656174757265732d7e3332352d677265656e)](https://camo.githubusercontent.com/c6d47a185d7feee3e89913188f3f3f27c0dcd3c37348c167e7c76818d565e5d4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f7368697070656425323066656174757265732d7e3332352d677265656e)

***

## AI Voice+

### Overview

The Agent Verifier is a conceptual security framework intended to keep AI agents operating within the AI Voice+ platform trustworthy, sandboxed, and auditable. This document maps the verifier concepts to our actual implementation, covering both what is implemented today and what is still planned.

***

### Implementation Status

#### ✅ Already Implemented

| Verifier Concept                  | Our Implementation                                                                                                | Code Location                                    |
| --------------------------------- | ----------------------------------------------------------------------------------------------------------------- | ------------------------------------------------ |
| **Agent Identity & Provenance**   | Org-scoped `ai_agents` table with unique IDs, API key hashing (SHA-256) for MCP                                   | `ai_agents` table, `mcp-server/index.ts`         |
| **Capability Declaration**        | `agent_one_tools` table declares per-workspace tools with type and definition                                     | `agent_one_tools` table                          |
| **Prompt & Policy Compliance**    | 10-pattern injection scanning, expanded safety preamble with bias/fairness/boundaries                             | `agent-one-chat/index.ts`, `convo-chat/index.ts` |
| **Tool & API Sandboxing**         | Per-request context isolation, org-scoped queries, daily quotas (100 msgs/day)                                    | `mcp-server/index.ts` (RequestContext class)     |
| **DLP (Data Leakage Prevention)** | 5-pattern PII redaction on input AND output (CC, SSN, email, phone, UK NINO)                                      | All chat edge functions                          |
| **Audit Logs**                    | `ai_usage_logs` table tracks moderation blocks, injection detections, and usage                                   | `ai_usage_logs` table                            |
| **Human-in-the-Loop**             | Content moderation blocks with safety refusals; fail-closed moderation                                            | `moderateContent()` in chat functions            |
| **Self-Restricting Behavior**     | `SAFETY_PREAMBLE` includes: "ask for clarification rather than guessing", "may decline tasks outside capabilities" | System prompts                                   |
| **Rate Limiting**                 | IP-based (30/15min) + org-based daily quotas; fail-closed rate limiter                                            | `_shared/rate-limit.ts`                          |
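
The fail-closed behavior named in the Rate Limiting row can be sketched as follows. This is a minimal illustration, not the code in `_shared/rate-limit.ts`: the `CounterStore` interface, the `InMemoryStore` class, and the `checkRateLimitSketch` function are hypothetical names, and a fixed-window counter is assumed. Only the 30-requests-per-15-minutes figure comes from the table above.

```typescript
// Hypothetical sketch of a fail-closed, fixed-window rate limiter.
// All names and types are illustrative assumptions, not the real module.

interface CounterStore {
  // Increments the counter for `key` in the current window and returns the new count.
  // May throw if the backing store is unavailable.
  increment(key: string, windowMs: number, now: number): number;
}

const LIMIT = 30; // from the table: 30 requests...
const WINDOW_MS = 15 * 60 * 1000; // ...per 15 minutes

class InMemoryStore implements CounterStore {
  private windows = new Map<string, { start: number; count: number }>();

  increment(key: string, windowMs: number, now: number): number {
    const current = this.windows.get(key);
    // Start a fresh window if none exists or the old one has expired.
    if (!current || now - current.start >= windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return 1;
    }
    current.count += 1;
    return current.count;
  }
}

function checkRateLimitSketch(store: CounterStore, ip: string, now: number): boolean {
  try {
    return store.increment(`ip:${ip}`, WINDOW_MS, now) <= LIMIT;
  } catch {
    // Fail closed: if the counter cannot be read, deny the request
    // instead of letting unmetered traffic through.
    return false;
  }
}
```

In this sketch, a store outage results in denied requests rather than unlimited ones, which is the trade-off "fail-closed" implies: availability is sacrificed to keep the limit enforceable.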

#### 🔮 Planned (Future Roadmap)

| Verifier Concept                   | Status  | Notes                                                              |
| ---------------------------------- | ------- | ------------------------------------------------------------------ |
| **Multi-Agent Cross-Verification** | Planned | Requires multi-model voting system; would use agent transfer rules |
| **Agent Reputation/Trust Scores**  | Planned | Needs historical behavior data collection over time                |
| **Behavioral Drift Detection**     | Planned | Requires baseline behavior collection and comparison               |
| **Certification Badges**           | Planned | UI feature showing agent compliance status                         |
| **Version Control & Rollback**     | Planned | Agent configuration versioning with rollback capability            |
| **Automated Red-Teaming**          | Planned | Periodic injection testing against live agents                     |

***

### Verification Architecture

```
   User Request
         │
         ▼
┌──────────────────┐
│  Rate Limiter    │  ← IP-based, fail-closed
│  (Layer 1)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Input Validator │  ← Length, type, UUID format
│  (Layer 2)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Sanitizer       │  ← Control char stripping
│  (Layer 3)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Injection Scan  │  ← 10 regex patterns, safe-wrapping
│  (Layer 4)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  PII Redactor    │  ← 5 PII patterns on input
│  (Layer 5)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Content Mod     │  ← AI gateway, fail-closed
│  (Layer 6)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Auth & Org      │  ← JWT verification, org membership
│  (Layer 7)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Daily Quota     │  ← Per-org message limits
│  (Layer 8)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  AI Model Call   │  ← Safety preamble + context
│  (Layer 9)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Output PII      │  ← PII redaction on response
│  Redaction       │
│  (Layer 10)      │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Output Mod      │  ← Content moderation on response
│  (Layer 11)      │
└────────┬─────────┘
         │
         ▼
  Response to User
```
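
One way to read the diagram is as an ordered list of checks in which each layer either passes a (possibly transformed) message to the next layer or rejects the request. The sketch below illustrates that idea with two simplified stand-in layers. The names `Layer`, `runPipeline`, `inputValidator`, and `sanitizer`, as well as the `MAX_LENGTH` value, are assumptions made for illustration and are not taken from the actual edge functions.

```typescript
// Hypothetical sketch of a short-circuiting layer pipeline.
// Layer names, limits, and types are illustrative assumptions.

type LayerResult =
  | { ok: true; message: string }
  | { ok: false; reason: string };

interface Layer {
  name: string;
  run(message: string): LayerResult;
}

type PipelineResult =
  | { ok: true; message: string }
  | { ok: false; layer: string; reason: string };

const MAX_LENGTH = 4000; // assumed limit, not taken from the real validator

const inputValidator: Layer = {
  name: "input-validator",
  run: (m) =>
    m.length > 0 && m.length <= MAX_LENGTH
      ? { ok: true, message: m }
      : { ok: false, reason: "invalid length" },
};

const sanitizer: Layer = {
  name: "sanitizer",
  // Strip control characters, keeping tab, newline, and carriage return.
  run: (m) => ({
    ok: true,
    message: m.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, ""),
  }),
};

function runPipeline(message: string, layers: Layer[]): PipelineResult {
  let current = message;
  for (const layer of layers) {
    let result: LayerResult;
    try {
      result = layer.run(current);
    } catch {
      // Fail closed: a layer that throws is treated as a rejection.
      return { ok: false, layer: layer.name, reason: "layer error" };
    }
    if (!result.ok) {
      return { ok: false, layer: layer.name, reason: result.reason };
    }
    current = result.message; // later layers see the transformed message
  }
  return { ok: true, message: current };
}
```

Returning the name of the rejecting layer makes it straightforward to record which layer blocked a request, for example in an audit log entry.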

***

### MCP Server Security

The MCP (Model Context Protocol) server uses a `RequestContext` class instead of global state to prevent cross-tenant data leaks during concurrent requests:

* Each request authenticates via a SHA-256-hashed API key
* Context (Supabase client, org ID, user ID) is stored per-request
* All tool handlers read from the request-scoped context
* Org-scoped queries prevent data access across tenants
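
The points above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in `mcp-server/index.ts`: the `keyTable` map, the `authenticate` and `listAgents` functions, and the row shape are hypothetical, the Supabase client is omitted, and Node's `node:crypto` module is assumed to be available in the runtime.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch of request-scoped context. Field names, the key table,
// and the handler are assumptions for illustration only.

function hashApiKey(rawKey: string): string {
  // Only the SHA-256 hex digest is stored or compared, never the raw key.
  return createHash("sha256").update(rawKey, "utf8").digest("hex");
}

class RequestContext {
  constructor(readonly orgId: string, readonly userId: string) {}
}

// Stand-in for the stored key hashes and their owners.
const keyTable = new Map<string, { orgId: string; userId: string }>([
  [hashApiKey("key-for-org-a"), { orgId: "org-a", userId: "user-1" }],
  [hashApiKey("key-for-org-b"), { orgId: "org-b", userId: "user-2" }],
]);

// Builds a fresh context per request; returns null for unknown keys.
function authenticate(rawKey: string): RequestContext | null {
  const owner = keyTable.get(hashApiKey(rawKey));
  return owner ? new RequestContext(owner.orgId, owner.userId) : null;
}

interface AgentRow {
  orgId: string;
  name: string;
}

// Tool handlers receive the context as an argument instead of reading
// module-level state, so concurrent requests cannot see each other's org.
function listAgents(ctx: RequestContext, rows: AgentRow[]): string[] {
  return rows.filter((row) => row.orgId === ctx.orgId).map((row) => row.name);
}
```

Because every handler takes its `RequestContext` explicitly, two requests processed at the same time each carry their own org ID, which is the property that module-level (global) state cannot guarantee.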

***

### How Existing Safety Layers Map to Verifier Concepts

| Safety Layer                                  | Verifier Concept                     |
| --------------------------------------------- | ------------------------------------ |
| `INJECTION_PATTERNS` (10 patterns)            | Prompt & Policy Compliance           |
| `PII_PATTERNS` (5 patterns)                   | DLP / Data Leakage Prevention        |
| `SAFETY_PREAMBLE` (bias, boundaries, honesty) | Policy Compliance + Self-Restriction |
| `moderateContent()` (fail-closed)             | Human-in-the-Loop (automated)        |
| `RequestContext` class                        | Tool Sandboxing                      |
| `ai_usage_logs` audit entries                 | Audit Logs                           |
| `checkRateLimit()` (fail-closed)              | Rate Limiting                        |
| `BLOCKED_VOICE_PHRASES`                       | DLP for Voice                        |
| Error masking (generic messages)              | Information Disclosure Prevention    |
| `encrypt_sensitive()` / `decrypt_sensitive()` | Data Protection at Rest              |
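
As an illustration of the `PII_PATTERNS` row, the sketch below applies five redaction rules to a string, and the same function can be run on both user input and model output. The regular expressions, the placeholder format, and the `redactPii` function name are simplified assumptions for illustration. The actual patterns are not shown in this document and are likely stricter.

```typescript
// Hypothetical sketch of pattern-based PII redaction.
// The patterns below are simplified assumptions, not the real PII_PATTERNS.

interface PiiRule {
  label: string;
  pattern: RegExp;
}

// Order matters: longer digit runs (cards) are redacted before phone numbers.
const PII_RULES_SKETCH: PiiRule[] = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  // 13 to 16 digits, optionally separated by single spaces or hyphens.
  { label: "CC", pattern: /\b\d(?:[ -]?\d){12,15}\b/g },
  // US Social Security number in the form 123-45-6789.
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  // UK National Insurance number: two letters, six digits, suffix A to D.
  { label: "NINO", pattern: /\b[A-Z]{2} ?\d{2} ?\d{2} ?\d{2} ?[A-D]\b/g },
  // Ten-digit phone number with an optional country code.
  { label: "PHONE", pattern: /(?:\+\d{1,3}[ -]?)?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}\b/g },
];

function redactPii(text: string): string {
  let redacted = text;
  for (const rule of PII_RULES_SKETCH) {
    redacted = redacted.replace(rule.pattern, `[REDACTED_${rule.label}]`);
  }
  return redacted;
}

// The same function is applied on the way in and on the way out.
const safeInput = redactPii("Reach me at jane@example.com");
const safeOutput = redactPii("The SSN on file is 123-45-6789");
```

Running redaction on the response as well as the request covers the case where the model reproduces sensitive data that did not appear in the user's message.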

***

### 🚀 What's Next? (see Roadmap)

{% embed url="<https://future.lisaiceland.com/roadmap>" %}
