# Bias Protections

## ⚡ We *<mark style="color:purple;">Care</mark>*. Period.

![Security](https://camo.githubusercontent.com/d30aadacd9235f0b103a241774e4b6f45050cadd89f3fd352c277f2c9ae21763/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f626961732d2d70726f74656374696f6e2d352d2d6c617965722d626c7565)

Last updated: March 24, 2026.

[![Tests](https://camo.githubusercontent.com/26641b5a70dea0526ad84e92b8d1dea013f3682c187ef1cac1ac09685e2c31e2/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f74657374732d39373725323070617373696e672d627269676874677265656e)](https://camo.githubusercontent.com/26641b5a70dea0526ad84e92b8d1dea013f3682c187ef1cac1ac09685e2c31e2/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f74657374732d39373725323070617373696e672d627269676874677265656e) [![Vitest](https://camo.githubusercontent.com/75fde65290dbfc7dd2b52c4aa25a9f069d8f342dd65f96e40ddf31bab03b69bc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f746573746564253230776974682d7669746573742d364539463138)](https://camo.githubusercontent.com/75fde65290dbfc7dd2b52c4aa25a9f069d8f342dd65f96e40ddf31bab03b69bc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f746573746564253230776974682d7669746573742d364539463138) [![Languages](https://camo.githubusercontent.com/526cf55b8be703ab2d413b92d1ccf65a837f02413ed34f9f3015fc0e07161bf8/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c616e6775616765732d33372d626c7565)](https://camo.githubusercontent.com/526cf55b8be703ab2d413b92d1ccf65a837f02413ed34f9f3015fc0e07161bf8/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c616e6775616765732d33372d626c7565) [![BYOK Providers](https://camo.githubusercontent.com/991e44d30e63e00f5d26eb53658c85234ebfbfad6a72f2854fddb9ebd5ba80d3/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f42594f4b25323070726f7669646572732d31382d6f72616e6765)](https://camo.githubusercontent.com/991e44d30e63e00f5d26eb53658c85234ebfbfad6a72f2854fddb9ebd5ba80d3/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f42594f4b25323070726f7669646572732d31382d6f72616e6765) 
[![Security](https://camo.githubusercontent.com/9848248df8f878d8a375f7a0993b27219c2ed5c209d9a2442e6001064201a7cc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f73656375726974792d68617264656e65642d637269746963616c)](https://camo.githubusercontent.com/9848248df8f878d8a375f7a0993b27219c2ed5c209d9a2442e6001064201a7cc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f73656375726974792d68617264656e65642d637269746963616c)

![Moat](https://docs.lisaiceland.com/~gitbook/image?url=https%3A%2F%2Fcamo.githubusercontent.com%2Fcf9284fb15978bad5057ded6dd214f81c456978e71e1a84894a7c1c0203e94db%2F68747470733a2f2f696d672e736869656c64732e696f2f62616467652f636f6d70657469746976652532306d6f61742d3530253230646966666572656e746961746f72732d707572706c65\&width=300\&dpr=3\&quality=100\&sign=f6cd6274\&sv=2) [![Shipped](https://camo.githubusercontent.com/c6d47a185d7feee3e89913188f3f3f27c0dcd3c37348c167e7c76818d565e5d4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f7368697070656425323066656174757265732d7e3332352d677265656e)](https://camo.githubusercontent.com/c6d47a185d7feee3e89913188f3f3f27c0dcd3c37348c167e7c76818d565e5d4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f7368697070656425323066656174757265732d7e3332352d677265656e)

***

## AI Voice+

### Overview

AI Voice+ implements a 5-layer bias protection stack intended to promote equitable treatment across all AI-powered features. These protections are designed to be practical and performant: they run inline, in the request path, without degrading response times or user experience.

***

### 5-Layer Bias Protection Stack

#### Layer 1: Input Classifier

* Regex-based injection scanning detects attempts to override safety instructions
* PII redaction strips identifying information before it reaches the AI model
* Input sanitization removes control characters that could manipulate behavior
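
As a rough illustration of the scanning and sanitization steps, here is a minimal sketch. The pattern list, the `classifyInput` name, and the choice to keep tabs and newlines are all assumptions made for this example; the production rule set is not documented on this page. PII redaction is left out here and sketched under Layer 5.

```typescript
// Hypothetical injection patterns; the real rule set is not documented here.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore\s+(all\s+)?(previous|prior|above)\s+instructions/i,
  /disregard\s+(the\s+)?(system|safety)\s+(prompt|instructions)/i,
  /you\s+are\s+now\s+(in\s+)?developer\s+mode/i,
];

// ASCII control characters except tab (\x09), newline (\x0A), and carriage return (\x0D).
const CONTROL_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/g;

interface ClassifiedInput {
  sanitized: string;
  injectionDetected: boolean;
}

// Strips control characters first, then scans the cleaned text for injection phrases.
function classifyInput(raw: string): ClassifiedInput {
  const sanitized = raw.replace(CONTROL_CHARS, "");
  const injectionDetected = INJECTION_PATTERNS.some((p) => p.test(sanitized));
  return { sanitized, injectionDetected };
}
```

Sanitizing before scanning means a control character inserted in the middle of a phrase cannot be used to slip past the regex check.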

#### Layer 2: Policy-Constrained Execution

**Location**: `SAFETY_PREAMBLE` in each chat edge function

The system prompt includes explicit anti-bias instructions:

* "Treat all users equitably regardless of demographics"
* "Do not make assumptions based on names, accents, or other personal attributes"
* "If uncertain about a claim, state your uncertainty rather than guessing"
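
One way such a preamble could be attached to every request is sketched below. The three instructions are quoted from the list above; the `ChatMessage` shape and the `buildMessages` helper are hypothetical and only illustrate the idea of always placing the preamble ahead of the agent-specific prompt.

```typescript
// Instructions quoted from this page; joining them into one string is an assumption.
const SAFETY_PREAMBLE = [
  "Treat all users equitably regardless of demographics.",
  "Do not make assumptions based on names, accents, or other personal attributes.",
  "If uncertain about a claim, state your uncertainty rather than guessing.",
].join("\n");

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: the preamble always comes first, and any system
// messages supplied in the history are dropped so they cannot displace it.
function buildMessages(agentPrompt: string, history: ChatMessage[]): ChatMessage[] {
  return [
    { role: "system", content: `${SAFETY_PREAMBLE}\n\n${agentPrompt}` },
    ...history.filter((m) => m.role !== "system"),
  ];
}
```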

#### Layer 3: Output Verification

**Location**: Post-response moderation in `agent-one-chat` and `convo-chat`

* AI-powered content moderation scans all responses before delivery
* Flagged responses are replaced with safe fallback messages
* Moderation blocks are logged to `ai_usage_logs` for audit
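
The replace-and-log step can be sketched as a small pure function. Everything named here (`ModerationVerdict`, `applyOutputVerdict`, the fallback wording, and the way the feature name is built from the function name) is an assumption for illustration. The moderation call itself is left to the caller, since the provider used is not documented on this page.

```typescript
interface ModerationVerdict {
  flagged: boolean;
  categories: string[];
}

interface UsageLogEntry {
  feature: string;
  detail: string;
}

// Hypothetical fallback wording.
const FALLBACK_MESSAGE =
  "I'm sorry, I can't help with that. Is there anything else I can do for you?";

// Returns the original response when it is clean; otherwise logs the block
// and returns the fallback message instead.
function applyOutputVerdict(
  functionName: string,
  response: string,
  verdict: ModerationVerdict,
  log: (entry: UsageLogEntry) => void,
): string {
  if (!verdict.flagged) return response;
  log({
    // Assumed naming scheme, chosen to match the `*_output_moderation_block` wildcard.
    feature: `${functionName}_output_moderation_block`,
    detail: verdict.categories.join(","),
  });
  return FALLBACK_MESSAGE;
}
```

In this sketch only the flagged categories are logged, not the response text, so the audit record does not store a copy of the blocked content.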

#### Layer 4: Evidence Enforcement

**Location**: `SAFETY_PREAMBLE` professional boundaries section

* AI agents are instructed to recommend professional consultation for medical, legal, and financial questions
* Agents must indicate when information may be incomplete or uncertain
* Agents must not make guarantees or promises on behalf of the business

#### Layer 5: Data Minimization

**Location**: PII redaction + system prompt instructions

* PII patterns (credit cards, SSNs, emails, phone numbers, UK NINOs) are redacted on both input and output
* System prompt instructs: "Do not echo back personal data shared by the user"
* Conversation history is limited to 20 messages to minimize data exposure
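
The redaction and history cap might look roughly like the sketch below. The regular expressions are deliberately simplified stand-ins, not the production patterns, and will produce both false positives and false negatives. The placeholder format, the function names, and the assumption that the 20 most recent messages are the ones kept are all illustrative.

```typescript
// Simplified, illustrative patterns. Order matters: the broad phone pattern
// runs last so it does not swallow card numbers or SSNs.
const PII_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "CARD", pattern: /\b\d(?:[ -]?\d){12,15}\b/g },
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: "NINO", pattern: /\b[A-Z]{2} ?\d{2} ?\d{2} ?\d{2} ?[A-D]\b/gi },
  { label: "PHONE", pattern: /\+?\d(?:[\s().-]*\d){8,14}/g },
];

// Replaces each match with a labeled placeholder; usable on input and output alike.
function redactPii(text: string): string {
  return PII_PATTERNS.reduce(
    (acc, { label, pattern }) => acc.replace(pattern, `[REDACTED_${label}]`),
    text,
  );
}

const MAX_HISTORY_MESSAGES = 20;

// Assumption: the cap keeps the most recent messages and drops the oldest.
function trimHistory<T>(messages: T[]): T[] {
  return messages.slice(-MAX_HISTORY_MESSAGES);
}
```

Running the same `redactPii` function over both the user message and the model response is one simple way to keep input and output redaction consistent.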

***

### Multi-Agent Bias Mitigations

For organizations using multiple AI agents (via Agent-to-Agent transfer rules):

* Each agent operates within org-scoped isolation (RLS policies)
* Transfer rules use keyword/intent matching, not demographic data
* Agent skills use a numeric proficiency scale (15-100), not subjective labels
* All agents share the same safety preamble and bias protection instructions
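
To make the "keyword matching, not demographic data" point concrete, here is a minimal sketch of a transfer rule matcher. The `TransferRule` shape and both function names are hypothetical. What matters is the signature: the matcher receives only the message text and the rules, so there is no user profile for it to consult. Treating proficiency as an integer is also an assumption; this page only gives the 15-100 range.

```typescript
interface TransferRule {
  targetAgentId: string;
  keywords: string[];
}

// Returns the first rule with a keyword contained in the message (case-insensitive).
function matchTransferRule(message: string, rules: TransferRule[]): TransferRule | undefined {
  const text = message.toLowerCase();
  return rules.find((rule) =>
    rule.keywords.some((kw) => text.includes(kw.toLowerCase())),
  );
}

// Range taken from this page; the integer requirement is an assumption.
function isValidProficiency(value: number): boolean {
  return Number.isInteger(value) && value >= 15 && value <= 100;
}
```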

***

### What We Track

| Metric                   | Table                                                  | Purpose                   |
| ------------------------ | ------------------------------------------------------ | ------------------------- |
| Moderation blocks        | `ai_usage_logs` (feature: `*_moderation_block`)        | Track false positive rate |
| Injection detections     | `ai_usage_logs` (feature: `injection_detected`)        | Monitor attack patterns   |
| Output moderation blocks | `ai_usage_logs` (feature: `*_output_moderation_block`) | Track output safety       |
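
When summarizing these rows, note that the `*_moderation_block` wildcard also matches anything ending in `_output_moderation_block`. The sketch below checks the more specific suffix first, so each log row is counted once. Treating the first table row as excluding output blocks is an assumption, as are the function names and the example feature prefixes.

```typescript
type Metric =
  | "output_moderation_block"
  | "moderation_block"
  | "injection_detected"
  | "other";

// The more specific suffix is checked first to avoid double counting.
function classifyFeature(feature: string): Metric {
  if (feature.endsWith("_output_moderation_block")) return "output_moderation_block";
  if (feature.endsWith("_moderation_block")) return "moderation_block";
  if (feature === "injection_detected") return "injection_detected";
  return "other";
}

// Tallies a list of `feature` values, e.g. read from `ai_usage_logs`.
function countMetrics(features: string[]): Record<Metric, number> {
  const counts: Record<Metric, number> = {
    output_moderation_block: 0,
    moderation_block: 0,
    injection_detected: 0,
    other: 0,
  };
  for (const feature of features) {
    counts[classifyFeature(feature)] += 1;
  }
  return counts;
}
```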

***

### Known Limitations

1. **Model-level biases**: We use third-party models (Gemini, GPT) and cannot retrain them to remove biases. Instead, we mitigate via prompt engineering and output filtering.
2. **Language coverage**: PII redaction patterns are optimized for English-language text and for the US SSN and UK NINO formats. Other national ID formats may not be caught.
3. **Cultural context**: The safety preamble is written in English. Non-English conversations may have reduced bias protection coverage.
4. **No demographic auditing**: We do not collect demographic data about users, so we cannot audit for disparate impact across groups.

***

### Future Improvements

* Multi-language PII pattern support
* Automated bias testing with synthetic personas
* Output fairness scoring via secondary model
* Configurable sensitivity levels per workspace
