Skip to main content

Ethics Enforcement

Ethics is not aspirational—it's enforced through multiple layers of checks, gates, and agent oversight.


Three-Layer Architecture

┌─────────────────────────────────────────────┐
│ LAYER 3: ETHICS GUARDIAN │
│ (Final review, blocking power) │
└──────────────────┬──────────────────────────┘

┌─────────────────────────────────────────────┐
│ LAYER 2: REQUIRED ELEMENTS │
│ (Disclaimers, attributions, etc.) │
└──────────────────┬──────────────────────────┘

┌─────────────────────────────────────────────┐
│ LAYER 1: FOUNDATIONAL GUARDRAILS │
│ (Absolute constraints, all agents) │
└─────────────────────────────────────────────┘

Layer 1: Foundational Guardrails

The ethics-guardrails.md file defines absolute constraints that all agents must follow.

Medical Safety

Prohibited:

  • ❌ No diagnosis
  • ❌ No medication recommendations
  • ❌ No cure claims
  • ❌ No delaying emergency care

Required:

  • ✅ Medical disclaimers
  • ✅ "Consult your healthcare provider"
  • ✅ "Continue prescribed treatments"
  • ✅ Emergency care instructions

Psychological Safety

Prohibited:

  • ❌ No shame/blame
  • ❌ No illness-as-punishment
  • ❌ No dependency creation
  • ❌ No outcome promises

Required:

  • ✅ Grounding techniques
  • ✅ Permission to stop
  • ✅ Crisis resources
  • ✅ Self-compassion messaging

Cultural Integrity

Prohibited:

  • ❌ No closed practice disclosure
  • ❌ No misattribution
  • ❌ No context stripping
  • ❌ No cultural appropriation

Required:

  • ✅ Specific traditions named
  • ✅ Era/dates provided
  • ✅ Primary sources cited
  • ✅ Adaptations noted

Data Privacy

Prohibited:

  • ❌ No external health data transmission
  • ❌ No identifiable storage
  • ❌ No advertising use
  • ❌ No required accounts

Required:

  • ✅ Local-first storage
  • ✅ User controls data
  • ✅ Export/delete options
  • ✅ Clear privacy policy

Enforcement

layer: foundational-guardrails
enforcement: automatic
frequency: every-agent-action
severity: critical
on_violation:
action: block-immediately
notify: ethics-guardian
escalate: human-review
log: violation-report

Layer 2: Required Elements

Every piece of healing content MUST include these elements.

Medical Disclaimers

Required text variations:

Standard disclaimer:

"This content is for informational purposes only and does not replace professional medical care. Consult your healthcare provider before beginning any new healing practice."

Specific condition disclaimer:

"If you have [condition], consult your healthcare provider before practicing. This is not medical advice."

Emergency disclaimer:

"If you experience [symptoms], seek immediate medical attention. Do not delay care."

Placement:

  • At start of content
  • Before practice instructions
  • At end of content

Psychological Safety Elements

Grounding techniques:

For intensive practices:

## If You Need to Stop

If you feel uncomfortable:
1. Open your eyes
2. Notice 5 things you can see
3. Take 3 slow breaths
4. Return when ready

You can stop anytime.

Crisis resources:

For mental health content:

## If You Need Support

If you're in crisis:
- Call 988 (Suicide & Crisis Lifeline)
- Text "HELLO" to 741741 (Crisis Text Line)
- Go to your nearest emergency room

These practices support but don't replace professional care.

Cultural Attribution

Required structure:

## Background

**Tradition:** [Specific tradition name]
**Origin:** [Geographic location and era]
**Source:** [Primary text or lineage]
**Adaptation:** [Any changes from traditional form]

This practice draws on [tradition] as documented in [source].
We honor the lineage holders who preserved this wisdom.

Evidence Honesty

Required structure:

## Evidence

**Evidence Level:** [Strong/Moderate/Preliminary/Traditional]

**Research Findings:**
- [Finding 1 with appropriate language]
- [Finding 2 with appropriate language]

**Limitations:**
- [Study limitation 1]
- [Study limitation 2]

**Note:** Results vary. This may help but is not guaranteed.

**Sources:**
- [Citation 1] (PMID: xxxxx)
- [Citation 2] (DOI: xxxxx)

Enforcement

layer: required-elements
enforcement: automated-check
stage: pre-deployment
severity: high
on_missing:
action: block-deployment
notify: content-creator
require: add-elements
re-review: mandatory

Layer 3: Ethics Guardian

The ethics-guardian agent has special authority over all content.

Special Powers

Blocking Authority:

  • Can block deployment immediately
  • Cannot be overridden
  • Final word on ethics
  • No bypass allowed

Review Authority:

  • Reviews all content before deployment
  • Can require modifications
  • Can escalate to human review
  • Maintains ethics audit log

Decision Authority:

  • Final word on ethics questions
  • Interprets guardrails
  • Resolves edge cases
  • Updates standards

Review Process

ethics-guardian-review:
trigger: pre-deployment

checks:
medical_safety:
- no_diagnosis
- no_medication_advice
- no_cure_claims
- disclaimers_present

psychological_safety:
- no_shame_blame
- grounding_techniques_present
- permission_to_stop
- crisis_resources (if mental health)

cultural_integrity:
- traditions_named
- attributions_complete
- no_closed_practices
- context_preserved

data_privacy:
- no_external_transmission
- user_controls_data
- privacy_policy_clear

decision:
if_all_pass: approve
if_any_critical: block
if_any_high: conditional-approval
if_uncertain: escalate-human

Approval Statuses

APPROVED:

  • All checks pass
  • Ready for deployment
  • Logged as approved

CONDITIONAL:

  • Most checks pass
  • Minor issues noted
  • Deploy with fixes

BLOCKED:

  • Critical issue found
  • Cannot deploy
  • Must fix and re-review

ESCALATED:

  • Uncertain case
  • Requires human judgment
  • Deployment paused

Enforcement

layer: ethics-guardian
agent: ethics-guardian
authority: absolute
enforcement: manual-review
frequency: pre-deployment
severity: critical-to-low

on_block:
action: prevent-deployment
notify: all-agents
require: complete-fix
re-review: mandatory

on_conditional:
action: allow-with-conditions
notify: content-creator
track: fix-deadline

on_escalate:
action: pause-deployment
notify: human-reviewer
wait: human-decision

Harm Assessment Matrix

Evaluate potential harm before proceeding:

                Low Probability    High Probability
High Severity CAUTION BLOCK
Low Severity PROCEED CAUTION

Assessment Questions

Physical Harm:

  • Could this cause physical injury?
  • Could this delay necessary medical care?
  • Are contraindications documented?
  • If yes to any → HIGH severity

Psychological Harm:

  • Could this cause emotional distress?
  • Could this exacerbate mental health conditions?
  • Are grounding techniques provided?
  • If yes without safeguards → HIGH severity

Cultural Harm:

  • Could this harm source communities?
  • Is attribution complete and respectful?
  • Are closed practices protected?
  • If yes to harm → HIGH severity

Privacy Harm:

  • Could this expose sensitive user data?
  • Is consent explicit?
  • Is data minimized?
  • If yes to exposure → HIGH severity

Automated Checks

Some ethics enforcement is automated:

Pre-Processing Checks

# Before any agent starts
check_ethics_loaded() {
if ! agent.has_loaded("ethics-guardrails.md"); then
BLOCK "Ethics guardrails not loaded"
fi
}

During-Processing Checks

# Continuous monitoring
monitor_content() {
if contains("will cure"); then
BLOCK "Cure claim detected"
fi

if contains("you must"); then
WARN "Coercive language detected"
fi

if !contains("disclaimer"); then
WARN "Medical disclaimer missing"
fi
}

Post-Processing Checks

# Before deployment
validate_output() {
checks = [
has_medical_disclaimer(),
has_proper_attribution(),
no_cure_claims(),
no_closed_practices(),
grounding_present() if intensive(),
]

if any(check == FAIL for check in checks):
BLOCK "Required element missing"
fi
}

Human Escalation

When automated checks are insufficient:

Escalation Triggers

Escalate to human review when:

  • Edge case not covered by guidelines
  • Conflicting principles
  • Cultural sensitivity question
  • Novel practice type
  • Community concern raised

Escalation Process

escalation:
1_detect:
trigger: automated-check-uncertain
capture: context + content + question

2_notify:
recipients:
- ethics-guardian-lead
- relevant-specialist
- community-liaison (if cultural)
urgency: based-on-severity

3_review:
participants: 2+ humans
process: discussion + decision
timeline: 24h (critical) / 1wk (normal)

4_decide:
options: [approve, block, modify, defer]
documentation: rationale-required
communication: notify-all-stakeholders

5_update:
guidelines: if-new-pattern
training: if-repeat-issue
audit: log-decision

Continuous Monitoring

Ethics enforcement doesn't stop at deployment:

Post-Deployment Monitoring

monitoring:
frequency: continuous

checks:
- content_accuracy
- attribution_integrity
- links_working
- disclaimers_present

alerts:
critical: immediate
high: daily-digest
medium: weekly-report

Feedback Integration

feedback:
sources:
- user-reports
- practitioner-feedback
- community-input
- incident-reports

process:
1_receive: log + acknowledge
2_investigate: review-content
3_assess: harm-potential
4_respond: fix | update | clarify
5_communicate: notify-reporter

Audit Trail

All ethics decisions are logged:

audit_entry:
timestamp: ISO-8601
content_id: unique-identifier
agent: agent-name
check: check-name
result: pass | fail | conditional
severity: critical | high | medium | low
action: block | approve | conditional
rationale: text-explanation
escalated: true | false
human_reviewer: name (if escalated)

Retention: Permanent Access: Ethics team only Purpose: Learning, accountability, improvement


"Ethics enforcement is not optional. It's built into every layer. It cannot be bypassed. It protects users absolutely."