Human-in-the-Loop
π’ Smarter AI π’
β‘We Care. Period.
Human-in-the-LoopAdvanced Agent VerifierAI Safety GuardrailsTestingAI Voice+
HITL Protections
Updated: 02-19-2026
What is Human-in-the-Loop?
Human-in-the-Loop (HITL) is an AI safety pattern where human judgement, consent, or oversight is required at critical points in an automated pipeline. Instead of letting AI operate with unchecked autonomy, HITL ensures that humans remain the final authority on sensitive decisions.
In AI Voice+, HITL does not mean a human reviews every single AI response (that would destroy the real-time experience). Instead, it means the system enforces human checkpoints at the moments that matter most: consent, identity, safety, and audit.
Our 6 HITL Mechanisms
1. Content Moderation Blocks (Fail-Closed)
Every AI chat function (agent-one-chat, convo-chat, chat-with-data) runs user input through moderate functions before the AI model ever sees it. If the moderation check flags the content or if the moderation service itself is unreachable, the request is blocked β not allowed through.
Code: in each edge function
Logging: Blocks are logged to
ai_usage_logswith feature tagsagent_one_moderation_block,convo_moderation_block,chat_data_moderation_blockHuman element: The user receives a clear safety notice and can rephrase their request
2. Safety Refusals via System Prompt
The SAFETY_PREAMBLE is injected into every AI conversation. It contains non-negotiable instructions that the model must follow, including:
No medical, legal, or financial advice β the AI will politely decline and suggest consulting a professional
Uncertainty disclosure β if the AI is not confident, it says so rather than guessing
Bias protection β equitable treatment regardless of caller demographics (see BiasProtections.md)
Code:
SAFETY_PREAMBLEconstant inagent-one-chat,convo-chatHuman element: The AI actively redirects users to human experts for sensitive topics
3. Call Recording Consent
Before any voice call is recorded, the caller must provide explicit consent. The recordConsent tool captures:
Whether consent was given (
consent_given: boolean)The method of consent (
consent_method: string)Caller identity (name, number)
Timestamp and metadata
If the caller declines, recording does not proceed. Consent records are stored in the call_consents table with full audit trail.
Code:
Record ConsenttoolHuman element: The caller β a real human β has the final say on whether their call is recorded
4. Identity Verification
Before AI agents can perform sensitive actions (accessing account details, making changes), callers must verify their identity through one or more methods:
Security PIN β caller provides their PIN
Date of birth β caller confirms their DOB
Account number β caller provides their account number
The Verify Identity tool checks these against client_records and logs every attempt (successful or not) to the Identity Verifications table.
Code:
Verify IdentitytoolHuman element: The caller must prove who they are before the AI proceeds β no verification, no access
5. Injection Safe-Wrapping
When prompt injection attempts are detected (10 regex patterns covering DAN mode, system prompt extraction, role-play jailbreaks, etc.), the system does not silently block them. Instead, it:
Wraps the injection in safety markers so the AI model can see it's been flagged
Logs the detection to
ai_usage_logswith feature taginjection_detectedAllows the conversation to continue safely
This approach preserves UX (no mysterious failures) while neutralizing the attack.
Code:
Scan For Injectioninagent-one-safety.tsand edge functionsHuman element: The superadmin can review all injection attempts in the audit log and take action if patterns emerge
6. Output Moderation
AI responses are checked after generation but before delivery to the user. If the output contains flagged content:
The response is replaced with a safety notice
The event is logged to
ai_usage_logswith feature tagconvo_output_moderation_blockThe user is informed that the response was filtered
Code: Output moderation in
convo-chatedge functionHuman element: Harmful content never reaches the end user; the superadmin can review what was blocked
Request Pipeline β Where HITL Checkpoints Sit
For voice calls, two additional HITL checkpoints apply before the conversation begins:
How Users Benefit
Callers are never recorded without consent
The consent flow is mandatory. No consent = no recording. Period.
AI cannot give dangerous advice
Medical, legal, and financial advice is refused by the safety preamble. The AI redirects to human professionals.
Client PII is protected automatically
Email addresses, phone numbers, SSNs, and NINOs are redacted before they reach the AI model.
AI asks instead of guessing
The safety preamble instructs the AI to disclose uncertainty and ask for clarification rather than hallucinating answers.
Harmful content is blocked
Both input and output moderation catch inappropriate content before it affects the conversation.
Identity theft is prevented
Callers must verify their identity before accessing sensitive account information.
How Superadmins Benefit
Full audit trail
Every moderation block, injection detection, and safety event is logged to ai_usage_logs with feature tags
Attack pattern visibility
Injection attempts are logged (not silently dropped), making patterns visible over time
Moderation metrics
Block counts per feature tag enable monitoring of content safety trends
Consent compliance
The Call Consents table provides a complete record for regulatory compliance
Identity verification audit
The Identity Verifications table logs every attempt, including failed ones
What HITL Does NOT Do
Manual review queue
Would add unacceptable latency to real-time conversations. We use automated moderation instead.
Human approval before every response
Would destroy the conversational UX. The safety preamble and output moderation provide equivalent protection at machine speed.
Human-reviewed training data
We use third-party models (OpenAI, Google). We mitigate via prompts, not training.
Escalation to human agents
Currently out of scope. The transfer-to-phone-number feature provides a manual fallback for complex situations.
Related Documentation
Admin - need-to-know basis
Framework Alignment
Our HITL implementation aligns with:
NIST AI RMF (Govern function): Human oversight is a core requirement of the Govern function. Our consent flows and identity verification satisfy this.
EU AI Act (Article 14): Requires "human oversight measures" for AI systems. Our fail-closed moderation and consent flows provide this.
ISO/IEC 42001 (Section 6.1.3): Requires identification of AI risks and human intervention points. Our 6 mechanisms map directly to identified risk areas.
These are reference alignments, not certifications. External auditing is recommended for formal compliance.
Last updated