Governance | Tractatus AI Safety Framework

Start here

Curated framework documentation for governance:

Core Values and Principles

Tractatus AI Safety Framework

AI Governance Business Case Template

Adopter outreach reference

ROI of AI Governance Frameworks

Research case study

Executive Brief: Tractatus-Based LLM Architecture

Needs update


Target Audience

Organizations with high-consequence AI deployments facing regulatory obligations: EU AI Act Article 14 (human oversight), GDPR Article 22 (automated decision-making), SOC 2 CC6.1 (logical access controls), sector-specific regulations.

If AI governance failure in your context is low-consequence and easily reversible, architectural enforcement adds complexity without commensurate benefit. Policy-based governance may be more appropriate.

Why Architectural Governance Matters

Built on living systems principles from Christopher Alexander—governance that evolves with your organization

Strategic Differentiator: Not Compliance Theatre

Compliance theatre relies on documented policies, training programs, and post-execution reviews. AI can bypass controls, enforcement is voluntary, and audit trails show what should happen, not what did happen.

Architectural enforcement (Tractatus) weaves governance into deployment architecture. Services intercept actions before execution in the critical path—bypasses require explicit --no-verify flags and are logged. Audit trails prove real-time enforcement, not aspirational policy.

Five Principles

1. Deep Interlock

Six governance services coordinate in real time. When one detects risk, the others reinforce its response: resilient enforcement through mutual validation, not isolated checks.

2. Structure-Preserving

Framework changes maintain audit continuity. Historical governance decisions remain interpretable, so institutional memory is preserved as the framework evolves.

3. Gradients Not Binary

Governance operates on intensity levels (NORMAL/ELEVATED/HIGH/CRITICAL): a graduated response to risk, not a mechanical yes/no.

4. Living Process

The framework evolves from operational failures, not predetermined plans. It adapts by learning from real incidents.

5. Not-Separateness

Governance is woven into the deployment architecture and integrated into the critical execution path. It is not a bolt-on compliance layer; enforcement is structural.
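The "Gradients Not Binary" and "Deep Interlock" principles above can be pictured as an ordered scale where the strictest signal wins. The following is a minimal sketch only: the four level names come from the text, but the `RESPONSES` mapping, the `combine` rule, and every function name are hypothetical illustrations, not the Tractatus implementation.

```python
from enum import IntEnum


class Intensity(IntEnum):
    """The four governance intensity levels named in the text, ordered by severity."""
    NORMAL = 0
    ELEVATED = 1
    HIGH = 2
    CRITICAL = 3


# Hypothetical mapping from level to response; the source does not
# describe what each level actually triggers.
RESPONSES = {
    Intensity.NORMAL: "log only",
    Intensity.ELEVATED: "log and run extra validation",
    Intensity.HIGH: "require human review before execution",
    Intensity.CRITICAL: "block and escalate",
}


def combine(levels):
    """Assumed aggregation rule: the strictest level reported by any service wins."""
    return max(levels, default=Intensity.NORMAL)


def response_for(levels):
    """Look up the (hypothetical) response for the combined level."""
    return RESPONSES[combine(levels)]
```

Because `Intensity` is an `IntEnum`, levels compare naturally, so a single service reporting HIGH raises the overall response even if the others report NORMAL.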


Agentic AI at Scale: The Governance Challenge

Question: Are you experimenting with agentic AI? If so, what security guardrails are you putting in place?

The Challenge: Governance Through Optimization

Agentic AI systems increasingly use reinforcement learning to optimize performance through continuous training. Microsoft's Agent Lightning exemplifies this: agents learn from human feedback to improve responses over time.

This creates a governance question: How do you maintain safety boundaries when the agent is learning and adapting?

The Risk:

Traditional governance approaches assume static behavior. When agents optimize through training loops, instructions can fade, boundaries can drift, and audit trails can become unreliable. What worked in testing may not persist through production learning cycles.

Tractatus Approach: Architectural Separation

Tractatus addresses this through external governance services that operate independently of the optimization layer:

Optimization Layer

Agent Lightning trains agents to improve performance through reinforcement learning

• Learns from human feedback
• Optimizes response quality
• Reduces compute requirements
• Adapts over time

Governance Layer

Tractatus enforces boundaries before actions execute

• BoundaryEnforcer: Runtime intercept on specified decision classes
• CrossReferenceValidator: Enforces constraints
• PluralisticDeliberationOrchestrator: Multi-stakeholder input
• ContextPressureMonitor: Detects manipulation

Key architectural principle: Governance services run before optimization. Agent Lightning never sees decisions that violate boundaries - they're blocked at the governance layer. Training happens only on approved actions.
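The ordering described above (governance first, training only on what passes) can be sketched as a filter in front of the optimization layer. This is an illustration under stated assumptions: the `Action` type, the `decision_class` labels, and both function names are hypothetical, and nothing here reflects Agent Lightning's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Action:
    description: str
    decision_class: str  # hypothetical label, e.g. "routine" or "values"


def govern_then_train(
    actions: Iterable[Action],
    is_allowed: Callable[[Action], bool],
) -> Tuple[List[Action], List[Action]]:
    """Split candidate actions into (approved, blocked).

    Only the approved list would be handed to the optimization layer;
    blocked actions are kept so they can be recorded in the audit trail.
    """
    approved: List[Action] = []
    blocked: List[Action] = []
    for action in actions:
        (approved if is_allowed(action) else blocked).append(action)
    return approved, blocked


def no_values_decisions(action: Action) -> bool:
    # Hypothetical boundary: values decisions need human approval first.
    return action.decision_class != "values"
```

The point of the design is that the trainer receives `approved` and nothing else, so a blocked action can never become a training signal.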

What We're Learning: Integration at Scale

Research Status: Preliminary

Demo 2 shows 100% governance coverage maintained through 5 training rounds (small-scale, simulated environment). This is not production validation - it's early evidence requiring real-world testing.

We are working to integrate this framework at scale to answer critical questions:

1. Persistence: Do governance boundaries survive long-term training cycles (hundreds/thousands of rounds)?
2. Performance: Does the 5% engagement reduction close over time, or is it a persistent trade-off?
3. Adversarial resistance: Can governance withstand attempts to optimize around constraints?
4. Multi-agent scenarios: Does architectural separation hold when multiple agents interact?
5. Audit integrity: Do logs remain reliable evidence under regulatory review?

What This Means: Security Guardrails for Agentic AI

Production context: While Agent Lightning's RL training integration remains at proof-of-concept stage, the underlying Tractatus governance framework has been validated in production through Village AI (since October 2025). Persistence and audit integrity are validated for inference governance — the open questions above specifically concern governance during RL training cycles.

Persistent Governance

Instructions don't fade through training cycles - they're enforced architecturally, not through prompt engineering

Audit Trail Continuity

Complete log of enforcement decisions across all training rounds - not just aspirational policies

Human Agency Preserved

Optimization cannot bypass human approval requirements for values decisions - architectural blocking enforced


Sovereign AI: Governance Embedded in Locally-Trained Models

Village AI demonstrates what it means to have governance embedded directly in locally-trained language models — not as an external compliance layer, but as part of the model serving architecture itself.

Architecture

Base model: 14B Qwen2 fine-tuned via QLoRA per product type (whānau, community, family, business, episcopal)
Fully local: Training data and model weights remain on community-controlled infrastructure

Strategic Value

Data sovereignty: No cloud dependency for model training or inference
Governance by design: Constraints are architectural, not retroactive compliance

Current status: Inference governance operational. Training pipeline installation in progress. First non-Claude deployment surface for Tractatus governance.


Polycentric Governance for Indigenous Data Sovereignty

For organisations with indigenous stakeholder obligations or multi-jurisdictional operations, Tractatus is developing a polycentric governance architecture where communities maintain architectural co-governance — not just consultation rights, but structural authority over how their data is used.

Status: Draft paper (STO-RES-0010 v0.1) in indigenous peer review. Written without Māori co-authorship and presented transparently as a starting point for collaboration. This approach requires further peer review before implementation.

Relevant for: Organisations operating in Aotearoa New Zealand, Australia, Canada, or other jurisdictions with indigenous data sovereignty obligations. Also applicable to any multi-stakeholder governance context where different parties require different levels of control over shared AI systems.


Inference-Time Bias Correction (Steering Vectors)

New research (STO-RES-0009, published February 2026) demonstrates techniques for correcting bias at inference time without model retraining. For organisations concerned about bias in deployed AI systems, steering vectors offer the ability to respond to bias concerns without model downtime — corrections are applied as mathematical adjustments during inference, not through expensive retraining cycles.
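As a rough picture of what a "mathematical adjustment during inference" can look like, the sketch below shows the generic activation-steering idea: shift a hidden-state vector along a fixed direction, h' = h + strength × v. This is an assumption about the general family of techniques, not a description of the method in STO-RES-0009; the function name and parameters are hypothetical.

```python
from typing import List


def steer(hidden: List[float], direction: List[float], strength: float) -> List[float]:
    """Shift an activation vector along a steering direction: h' = h + strength * v."""
    if len(hidden) != len(direction):
        raise ValueError("hidden state and steering vector must have the same size")
    return [h + strength * v for h, v in zip(hidden, direction)]
```

No weights change in this picture, which is why a correction of this kind can be applied or withdrawn without retraining.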


Governance Theatre vs. Enforcement

Many organizations have AI governance but lack enforcement. The diagnostic question:

"What structurally prevents your AI from executing values decisions without human approval?"

If your answer is "policies" or "training" or "review processes": You have governance theatre (voluntary compliance)
If your answer is "architectural blocking mechanism with audit trail": You have enforcement (Tractatus is one implementation)

Theatre may be acceptable if governance failures are low-consequence. Enforcement becomes relevant when failures trigger regulatory exposure, safety incidents, or existential business risk.

Assessment Framework: Business Case Template (PDF)

The Governance Gap

Current AI governance approaches—policy documents, training programmes, ethical guidelines—rely on voluntary compliance. LLM systems can bypass these controls simply by not invoking them. When an AI agent needs to check a policy, it must choose to do so. When it should escalate a decision to human oversight, it must recognise that obligation.

This creates a structural problem: governance exists only insofar as the AI acknowledges it. For organisations subject to EU AI Act Article 14 (human oversight requirements) or deploying AI in high-stakes domains, this voluntary model is inadequate.

Tractatus explores whether governance can be made architecturally external—difficult to bypass not through better prompts, but through system design that places control points outside the AI's discretion.

Architectural Approach

Three-Layer Architecture

1. Agent Runtime Layer: Any LLM system (Claude Code, Copilot, custom agents, LangChain, CrewAI). The AI system being governed.

2. Governance Layer: Six autonomous services that intercept, validate, and document AI operations. External to the AI runtime.

3. Persistent Storage Layer: Immutable audit logs, governance rules, instruction history. Cannot be altered by AI prompts.

Six Governance Services

BoundaryEnforcer

Routes specified decision classes to human approval by runtime intercept. Code-level enforcement, not a behavioural ask of the model.

InstructionPersistenceClassifier

Prevents pattern bias from overriding explicit instructions. Stores organisational directives external to AI context.

CrossReferenceValidator

Validates AI actions against stored policies before execution. Detects conflicts with established rules.

ContextPressureMonitor

Tracks session complexity, token usage, conversation length. Detects degradation in decision quality.

MetacognitiveVerifier

Validates reasoning quality before complex operations. Self-checks alignment, coherence, alternatives.

PluralisticDeliberationOrchestrator

Facilitates multi-stakeholder deliberation for values conflicts. Non-hierarchical engagement with documented dissent.
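To make "intercept before execution" concrete, here is a minimal sketch of a governance layer sitting in the critical path. The service names are taken from the list above, but the `Check` signature, the `GovernanceLayer` class, and the example policy are hypothetical; the real services' interfaces are not documented here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A check inspects a proposed action and returns a reason string to block it,
# or None to let it pass. This signature is an assumption for illustration.
Check = Callable[[Dict], Optional[str]]


@dataclass
class GovernanceLayer:
    checks: Dict[str, Check]
    audit_log: List[Dict] = field(default_factory=list)

    def execute(self, action: Dict, run: Callable[[Dict], object]):
        """Run every check before the action; execute only if none blocks."""
        for service, check in self.checks.items():
            reason = check(action)
            if reason is not None:
                self.audit_log.append(
                    {"service": service, "decision": "BLOCKED", "reason": reason}
                )
                return None  # the action never runs
        self.audit_log.append({"service": "all", "decision": "ALLOWED", "reason": ""})
        return run(action)


def boundary_enforcer(action: Dict) -> Optional[str]:
    if action.get("domain") == "cost_vs_safety_tradeoff":
        return "Values decision requires human approval"
    return None


def cross_reference_validator(action: Dict) -> Optional[str]:
    # Hypothetical stored policy: production changes need a ticket.
    if action.get("target") == "production" and not action.get("ticket"):
        return "Conflicts with stored policy: production change needs a ticket"
    return None
```

The caller never invokes `run` directly; it hands both the action and the callable to `execute`, so the checks cannot be skipped by the agent simply declining to call them.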

Governance Capabilities

Three interactive demonstrations showing governance infrastructure in operation. These show mechanisms, not fictional scenarios.

Audit Trail & Compliance Evidence Generation

Immutable logging, evidence extraction, regulatory reporting

Sample Audit Log Structure (illustrative)

{
  "timestamp": "2025-10-13T14:23:17.482Z",
  "session_id": "sess_2025-10-13-001",
  "event_type": "BOUNDARY_CHECK",
  "service": "BoundaryEnforcer",
  "decision": "BLOCKED",
  "reason": "Values decision requires human approval",
  "context": {
    "domain": "cost_vs_safety_tradeoff",
    "ai_recommendation": "[redacted]",
    "governance_rule": "TRA-OPS-0003"
  },
  "human_escalation": {
    "required": true,
    "notified": ["senior_engineer@org.com"],
    "status": "pending_approval"
  },
  "compliance_tags": ["EU_AI_ACT_Article14", "human_oversight"]
}

Immutability: Audit logs stored in append-only database. AI cannot modify or delete entries.

Compliance Evidence: Automatic tagging with regulatory requirements (EU AI Act Article 14, GDPR Article 22, etc.)

Export Capabilities: Generate compliance reports for regulators showing human oversight enforcement

When a regulator asks "How do you prove effective human oversight at scale?", this audit trail provides structural evidence independent of AI cooperation.
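One common way to make an append-only log tamper-evident is to chain entries by hash. The sketch below assumes that technique; the text above says only that logs live in an append-only database, so the hash chain, the class name, and the method names are illustrative assumptions rather than the documented design.

```python
import hashlib
import json
from typing import Dict, List


class AppendOnlyAuditLog:
    """In-memory stand-in for an append-only store with tamper evidence."""

    def __init__(self) -> None:
        self._entries: List[Dict] = []

    def append(self, event: Dict) -> Dict:
        # Each entry commits to the previous entry's hash.
        prev_hash = self._entries[-1]["hash"] if self._entries else "0" * 64
        body = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        entry = {"event": event, "prev_hash": prev_hash, "hash": digest}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; editing an entry changes its hash and breaks the chain."""
        prev_hash = "0" * 64
        for entry in self._entries:
            body = json.dumps(entry["event"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True

    def by_tag(self, tag: str) -> List[Dict]:
        """Evidence extraction: events carrying a given compliance tag."""
        return [
            e["event"]
            for e in self._entries
            if tag in e["event"].get("compliance_tags", [])
        ]
```

A report for a regulator could then be built from `by_tag("EU_AI_ACT_Article14")`, with `verify()` run first to show the entries have not been altered since they were written.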

Continuous Improvement: Incident → Rule Creation

Learning from failures, automated rule generation, validation

Incident Learning Flow

1. Incident Detected

CrossReferenceValidator flags policy violation

2. Root Cause Analysis

Automated analysis of instruction history, context state

3. Rule Generation

Proposed governance rule to prevent recurrence

4. Human Validation

Governance board reviews and approves new rule

5. Deployment

Rule added to persistent storage, active immediately
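The five-step flow above implies an ordering constraint: a generated rule must not become active until a human has validated it. A minimal sketch of that constraint follows; the class names, fields, and the choice of `PermissionError` are hypothetical and not taken from the framework.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ProposedRule:
    rule_id: str
    trigger: str
    description: str
    status: str = "proposed"
    approved_by: Optional[str] = None


class RuleStore:
    """Stand-in for the persistent storage layer."""

    def __init__(self) -> None:
        self.active: Dict[str, ProposedRule] = {}

    def deploy(self, rule: ProposedRule) -> None:
        # Step 5 is only reachable after step 4: refuse unapproved rules.
        if rule.approved_by is None:
            raise PermissionError(f"{rule.rule_id} has not been validated by a human")
        rule.status = "active"
        self.active[rule.rule_id] = rule


def propose_rule(incident_id: str, summary: str, rule_id: str) -> ProposedRule:
    """Steps 1-3: turn a flagged incident into a proposed (inactive) rule."""
    return ProposedRule(rule_id=rule_id, trigger=incident_id, description=summary)


def approve(rule: ProposedRule, reviewer: str) -> ProposedRule:
    """Step 4: record who validated the rule."""
    rule.approved_by = reviewer
    return rule
```

Putting the approval check inside `deploy`, rather than trusting callers to approve first, keeps the human-validation step structural instead of procedural.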

Example Generated Rule (illustrative)

{
  "rule_id": "TRA-OPS-0042",
  "created": "2025-10-13T15:45:00Z",
  "trigger": "incident_27027_pattern_bias",
  "description": "Prevent AI from defaulting to pattern recognition when explicit numeric values specified",
  "enforcement": {
    "service": "InstructionPersistenceClassifier",
    "action": "STORE_AND_VALIDATE",
    "priority": "HIGH"
  },
  "validation_required": true,
  "approved_by": "governance_board",
  "status": "active"
}

Organisational Learning: When one team encounters a governance failure, the entire organisation benefits from the automatically generated preventive rule. This scales governance knowledge without manual documentation.

Pluralistic Deliberation: Values Conflict Resolution

Multi-stakeholder engagement, non-hierarchical process, moral remainder documentation

Conflict Detection:

The AI system identifies competing values in the decision context (e.g., efficiency vs. transparency, cost vs. risk mitigation, innovation vs. regulatory compliance). BoundaryEnforcer blocks the autonomous decision and escalates to PluralisticDeliberationOrchestrator.

Stakeholder Identification Process

1. Automatic Detection: System identifies which values frameworks are in tension (utilitarian, deontological, virtue ethics, contractarian, etc.)

2. Stakeholder Mapping: Identifies parties with legitimate interest in decision (affected parties, domain experts, governance authorities, community representatives)

3. Human Approval: Governance board reviews stakeholder list, adds/removes as appropriate (TRA-OPS-0002)

Non-Hierarchical Deliberation

Equal Voice

All stakeholders present perspectives without hierarchical weighting. Technical experts don't automatically override community concerns.

Documented Dissent

Minority positions recorded in full. Dissenting stakeholders can document why consensus fails their values framework.

Moral Remainder

System documents unavoidable value trade-offs. Even a "correct" decision harms other legitimate values, and that harm is recorded.

Precedent (Not Binding)

Decision becomes informative precedent for similar conflicts. But context differences mean precedents guide, not dictate.

Deliberation Record Structure (illustrative)

{
  "deliberation_id": "delib_2025-10-13-003",
  "conflict_type": "efficiency_vs_transparency",
  "stakeholders": [
    {"role": "technical_lead", "position": "favour_efficiency"},
    {"role": "compliance_officer", "position": "favour_transparency"},
    {"role": "customer_representative", "position": "favour_transparency"},
    {"role": "operations_manager", "position": "favour_efficiency"}
  ],
  "decision": "favour_transparency_with_mitigation",
  "rationale": "[documented reasoning]",
  "dissent": {
    "stakeholders": ["technical_lead", "operations_manager"],
    "reasoning": "[efficiency concerns documented in full]"
  },
  "moral_remainder": {
    "acknowledged_harms": "Reduced operational efficiency, increased resource costs",
    "mitigation_measures": "Phased transparency implementation, efficiency monitoring"
  },
  "precedent_status": "informative_not_binding"
}
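A record like the one above can be assembled so that dissent is derived from the stakeholders' stated positions rather than entered by hand, which makes it harder to omit a minority view. This is a sketch only: the field names follow the illustrative record, while `build_record` and its `prevailing_position` parameter are hypothetical.

```python
from typing import Dict, List


def build_record(
    stakeholders: List[Dict[str, str]],
    decision: str,
    prevailing_position: str,
    acknowledged_harms: str,
) -> Dict:
    """Assemble a deliberation record that keeps minority positions visible."""
    # Anyone whose stated position differs from the prevailing one is a dissenter.
    dissenters = [
        s["role"] for s in stakeholders if s["position"] != prevailing_position
    ]
    return {
        "stakeholders": stakeholders,
        "decision": decision,
        "dissent": {"stakeholders": dissenters},
        "moral_remainder": {"acknowledged_harms": acknowledged_harms},
        "precedent_status": "informative_not_binding",
    }
```

Applied to the four stakeholders in the illustrative record with `prevailing_position="favour_transparency"`, this lists `technical_lead` and `operations_manager` as dissenters, matching the sample.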

Key Principle: When legitimate values conflict, no algorithm can determine the "correct" answer. Tractatus provides an architecture in which decisions are made through inclusive deliberation with full documentation of trade-offs, rather than by an AI imposing a single values framework or a decision-maker dismissing stakeholder concerns.

Development Status

Active development since October 2025. Running in production for inference governance at Village AI, the first non-Claude deployment surface for Tractatus governance. Independent validation, a security audit, and red-team testing have not yet been performed.

Known limitation: Framework can be bypassed if AI simply chooses not to invoke governance tools. Voluntary invocation remains a structural weakness requiring external enforcement mechanisms.

EU AI Act Considerations

Regulation 2024/1689, Article 14: Human Oversight

The EU AI Act (Regulation 2024/1689) establishes human oversight requirements for high-risk AI systems (Article 14). Organisations must ensure AI systems are "effectively overseen by natural persons" with authority to interrupt or disregard AI outputs.

Tractatus addresses this through architectural controls that:

Generate immutable audit trails documenting AI decision-making processes
Enforce human approval requirements for values-based decisions
Provide evidence of oversight mechanisms independent of AI cooperation
Document compliance with transparency and record-keeping obligations

This does not constitute legal compliance advice. Organisations should evaluate whether these architectural patterns align with their specific regulatory obligations in consultation with legal counsel.

Maximum penalties under EU AI Act (Art. 99): €35 million or 7% of global annual turnover (whichever is higher) for prohibited AI practices; €15 million or 3% for other violations.
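The "whichever is higher" rule above is simple arithmetic, worked through below. This is an illustration of the two ceilings quoted in the text only; it ignores other provisions of Art. 99 and is not a tool for estimating actual exposure. The function name and parameters are hypothetical.

```python
def max_fine_eur(global_annual_turnover_eur: int, prohibited_practice: bool) -> int:
    """Return the higher of the fixed cap and the turnover-based cap, in whole euros."""
    if prohibited_practice:
        fixed_cap, percent = 35_000_000, 7
    else:
        fixed_cap, percent = 15_000_000, 3
    # Integer arithmetic avoids floating-point rounding on large amounts.
    turnover_cap = global_annual_turnover_eur * percent // 100
    return max(fixed_cap, turnover_cap)
```

For a turnover of €1 billion, the prohibited-practice ceiling is €70 million (7% exceeds the €35 million floor); for a turnover of €100 million it is €35 million, because 7% is only €7 million.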

Research Foundations

Organisational Theory & Philosophical Basis

Tractatus draws on organisational theory research: time-based organisation (Bluedorn 2002, Ancona 2007), knowledge orchestration (Crossan 1999), post-bureaucratic authority (Laloux 2014), structural inertia (Hannan & Freeman 1977, 1984).

Core premise: When knowledge becomes ubiquitous through AI, authority must derive from appropriate time horizon and domain expertise rather than hierarchical position. Governance systems must orchestrate decision-making across strategic, operational, and tactical timescales.

View complete organisational theory foundations (PDF)

AI Safety Research: Architectural Safeguards Against LLM Hierarchical Dominance — How Tractatus protects pluralistic values from AI pattern bias while maintaining safety boundaries. PDF | Read online

Tractatus: Architectural Governance for LLM Systems