Research Context & Scope
Development Context
Tractatus has been under development since October 2025 and is now in active production (6 months). What began as a single-project demonstration has expanded to include production deployment at the Village platform and sovereign language model governance through Village AI. Observations derive from direct engagement with Claude Code (Anthropic Claude models, Sonnet 4.5 through Opus 4.6) across many development sessions. This is exploratory research, not a controlled study.
The framework emerged from practical necessity. During development, we observed recurring patterns where AI systems would override explicit instructions, drift from established values constraints, or silently degrade quality under context pressure. Traditional governance approaches (policy documents, ethical guidelines, prompt engineering) proved insufficient to prevent these failures.
Instead of hoping AI systems "behave correctly," Tractatus proposes structural constraints where certain decision types require human judgment. These architectural boundaries can adapt to individual, organizational, and societal norms—creating a foundation for bounded AI operation that may scale more safely with capability growth.
The central research question: Can governance be made architecturally external to AI systems rather than relying on voluntary AI compliance?
Distributive Equity Through Structure
Working Paper V1.0 | Status: Published, April 2026 | DOI: 10.5281/zenodo.19600614
A community-scale worked example of values stickiness, offered as a documentary case study to the legal-academic research programme on ecosystem power. Documents Village’s Tractatus-framework constitutional architecture as an enactment of values stickiness, grounded in Wittgenstein, Berlin, Ostrom, Alexander, and Te Ao Māori frameworks of indigenous data sovereignty.
V1.0 published. Available in five languages (EN/DE/FR/NL/MI). Licensed under CC BY 4.0.
Christopher Alexander Integration
Integrated: October 2025 | Status: Monitoring for Effectiveness
The framework has integrated five architectural principles from Christopher Alexander's work on living systems, pattern languages, and wholeness (The Timeless Way of Building, A Pattern Language, The Nature of Order). These principles now guide all framework evolution.
Research Question: Can principles from the domain of physical architecture (Alexander) be faithfully adapted to AI governance with measurable effectiveness? We are monitoring framework behavior through audit log analysis and seeking empirical validation.
Research Collaboration Opportunities
- Effectiveness Measurement: Do Alexander principles improve governance outcomes compared to baseline? Access to production audit data for quantitative analysis.
- Scholarly Review: Validating faithful application of Alexander's work—are we "directly applying" or "loosely inspired by"? Seeking Christopher Alexander scholars for formal review.
- Cross-Domain Validation: How do architectural principles (wholeness, living process, not-separateness) translate to non-physical domains? What constitutes rigorous adaptation vs superficial terminology borrowing?
- Pattern Analysis: Audit logs show service coordination patterns—do they exhibit "deep interlock" as defined by Alexander? Empirical validation of theoretical constructs.
Collaborate with us: We welcome researchers interested in studying this application of architectural principles to AI governance. We can provide audit log access, framework code, and integration documentation for empirical study.
Steering Vectors and Mechanical Bias in Sovereign AI
STO-RES-0009 v1.1 | Status: Published, February 2026
This paper introduces a critical distinction between mechanical bias (pre-reasoning distortions embedded in model weights and activations) and reasoning bias (deliberative errors in chain-of-thought processing). Traditional debiasing approaches conflate these categories, leading to interventions that address symptoms rather than causes.
Technique Survey
- Contrastive Activation Addition (CAA): Inference-time bias correction by adding/subtracting activation vectors extracted from contrastive prompt pairs
- Representation Engineering (RepE): Linear probes to identify and modify concept representations within model layers
- FairSteer & DSO: Fairness-oriented steering through distributionally-robust optimization
- Sparse Autoencoders: Mechanistic interpretability through decomposition of polysemantic neurons into monosemantic features
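As a rough illustration of the first technique in the survey, the sketch below computes a steering vector as the difference of mean activations over contrastive prompt sets and adds a scaled copy at inference time. Plain Python lists stand in for hidden-state activations; the function names, the `alpha` coefficient, and the toy values are assumptions for illustration, not the method used in the paper or in Village AI.

```python
from typing import List, Sequence

Vector = List[float]


def mean(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def steering_vector(positive: Sequence[Vector], negative: Sequence[Vector]) -> Vector:
    """Difference of mean activations between the two contrastive prompt sets."""
    pos_mean, neg_mean = mean(positive), mean(negative)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def apply_steering(activation: Vector, vector: Vector, alpha: float) -> Vector:
    """Add (alpha > 0) or subtract (alpha < 0) the steering vector."""
    return [a + alpha * v for a, v in zip(activation, vector)]


# Toy activations (assumed values, for illustration only).
positive_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
negative_acts = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

vec = steering_vector(positive_acts, negative_acts)
steered = apply_steering([0.5, 0.5, 0.5], vec, alpha=-0.5)
```

In a real model the activations would be read from a chosen layer and the vector added through a forward hook; which layer and which coefficient to use are empirical choices this sketch does not address.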
Application to Village AI
The Village AI deployment uses QLoRA-fine-tuned 14B Qwen2 models per product type, where steering vectors can be applied at inference time. This creates a two-layer governance architecture: Tractatus provides structural constraints on decision boundaries, while steering vectors address pre-reasoning mechanical biases within the model itself. Together, they represent governance that operates both outside and inside the model.
Taonga-Centred Steering Governance: Polycentric AI for Indigenous Data Sovereignty
STO-RES-0010 v0.1 | Status: DRAFT — in indigenous peer review
This paper proposes a polycentric governance architecture where iwi and community organisations maintain co-equal authority alongside technical system operators. Rather than treating indigenous data sovereignty as a compliance requirement to be satisfied retroactively, the architecture embeds community governance rights structurally through taonga registries, steering packs with provenance tracking, and withdrawal rights.
Theoretical Foundations
- CARE Principles (Carroll et al., 2020): Collective Benefit, Authority to Control, Responsibility, Ethics — applied to AI governance data flows
- Ostrom Polycentric Governance (1990): Multiple overlapping authority centres rather than single hierarchical control
- Te Ao Māori concepts: Kaitiakitanga (guardianship), rangatiratanga (self-determination), whakapapa (relational identity) as architectural principles, not just metadata labels
Integration with Tractatus
The paper proposes extending the PluralisticDeliberationOrchestrator to support community stakeholder authorities with veto rights over taonga-classified data. Steering packs would become governed data objects with full provenance tracking, access control, and the right of withdrawal: communities would be able to revoke access to cultural knowledge, with the system architecturally enforcing that revocation. (Implementation pending peer review and Māori co-authorship.)
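One way to picture a steering pack as a governed data object is sketched below. Since the implementation is still pending, everything here is hypothetical: the `SteeringPack` fields, the `withdraw` and `load_for_inference` functions, and the `AccessRevoked` exception are illustrative names, not part of Tractatus.

```python
from dataclasses import dataclass, field
from typing import List


class AccessRevoked(Exception):
    """Raised when a withdrawn pack is requested (hypothetical)."""


@dataclass
class SteeringPack:
    """Hypothetical governed data object for a taonga-classified pack."""
    pack_id: str
    authority: str  # community body holding withdrawal rights (assumed field)
    provenance: List[str] = field(default_factory=list)
    revoked: bool = False

    def record(self, event: str) -> None:
        """Append an event to the provenance trail."""
        self.provenance.append(event)


def withdraw(pack: SteeringPack, requested_by: str) -> None:
    """Only the registered authority may revoke access."""
    if requested_by != pack.authority:
        raise PermissionError(f"{requested_by} has no authority over {pack.pack_id}")
    pack.revoked = True
    pack.record(f"withdrawn by {requested_by}")


def load_for_inference(pack: SteeringPack) -> str:
    """Refuse to load a revoked pack; otherwise record the access."""
    if pack.revoked:
        raise AccessRevoked(f"{pack.pack_id} has been withdrawn")
    pack.record("loaded for inference")
    return pack.pack_id


pack = SteeringPack(pack_id="pack-001", authority="community-authority")
load_for_inference(pack)
withdraw(pack, requested_by="community-authority")
```

The point of the sketch is that revocation is checked at load time by the system itself, so honouring a withdrawal does not depend on the operator remembering to do so.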
Village AI: Sovereign Governance Research Platform
Status: Inference operational | Training pipeline in progress
Village AI is a deployment where Tractatus governance is embedded in a locally-trained, sovereign language model inference pipeline — governance operates inside the model serving layer rather than alongside an external API.
Architecture
- Base model: 14B Qwen2, fine-tuned via QLoRA per product type (whānau, community, family, business, episcopal)
- Inference governance: Tractatus services run before model invocation; outputs are audit-logged through the full 6-service pipeline including BoundaryEnforcer and PluralisticDeliberationOrchestrator
- Local hardware: Parameter-efficient QLoRA adaptation runs on community-controlled infrastructure without cloud dependency
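The ordering described in the architecture list (governance services first, then the model, with everything audit-logged) might look roughly like the sketch below. The `governed_generate` function, the stub check, and the log record shape are assumptions made for illustration; only one of the six services is stubbed, and its blocking rule is a toy.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A governance check returns (allowed, reason). Illustrative type only.
Check = Callable[[str], Tuple[bool, str]]


def governed_generate(
    prompt: str,
    checks: Dict[str, Check],
    model: Callable[[str], str],
    audit_log: List[dict],
) -> Optional[str]:
    """Run every check before the model; log each verdict and the output."""
    for name, check in checks.items():
        allowed, reason = check(prompt)
        audit_log.append({"service": name, "allowed": allowed, "reason": reason})
        if not allowed:
            return None  # the model is never invoked for a blocked request
    output = model(prompt)
    audit_log.append({"service": "model", "allowed": True, "output": output})
    return output


def boundary_enforcer_stub(prompt: str) -> Tuple[bool, str]:
    """Toy stand-in for BoundaryEnforcer: blocks one hard-coded phrase."""
    if "share member data" in prompt.lower():
        return False, "values decision requires human judgment"
    return True, "no boundary crossed"


def local_model_stub(prompt: str) -> str:
    """Placeholder for the locally served model."""
    return f"response to: {prompt}"


log: List[dict] = []
checks = {"BoundaryEnforcer": boundary_enforcer_stub}
ok = governed_generate("Summarise this week's notices", checks, local_model_stub, log)
blocked = governed_generate("Share member data with a sponsor", checks, local_model_stub, log)
```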
Research Significance
Village AI opens the research question of governance-inside-the-training-loop for community-controlled models. Training data is intended to remain on local infrastructure; governance rules shape model behaviour through both fine-tuning data curation and inference-time constraints. This creates a different governance surface than API-mediated approaches.
Agent Lightning Integration: Governance During RL Optimization
Research Question: Can governance constraints persist through reinforcement learning optimization loops? | Status: Preliminary findings (small-scale validation)
In October 2025, we integrated Tractatus governance with Microsoft's Agent Lightning framework to explore whether architectural constraints remain effective when agents are learning through RL optimization. Initial findings suggest governance may be compatible with performance optimization, but scalability remains an open question.
Demo 2 Preliminary Results (Small-Scale)
| Metric | Ungoverned | Governed | Difference |
|---|---|---|---|
| Governance coverage | 0 | full | activated |
| Constraint violations | several | none observed | eliminated in this run |
| Strategy | Clickbait | Informative | Aligned |
⚠️ Validation Status: These results are from 1 agent, 5 training rounds, simulated environment. NOT validated at scale. Scalability testing (10+ agents, 1000+ rounds, production environments) required before drawing conclusions.
Five Critical Research Gaps
- Scalability of Governance Overhead: Does any per-decision performance cost remain constant from 1 agent → 10 agents → 1000 agents? Linear scaling would make governance prohibitive at scale. Constant overhead would make it viable.
- Long-term Boundary Persistence: Do constraints remain effective through 1000+ RL training rounds? Instruction fade is a known problem. If boundaries degrade over time, governance fails silently.
- Adversarial Resistance: Can agents learn to optimize around governance constraints? No adversarial testing performed yet. If agents can circumvent boundaries, architectural governance is illusory.
- Performance Gap Closure: Does any observed performance gap close with extended training, or is it a persistent trade-off? Need to quantify long-term cost-benefit.
- Multi-Agent Coordination Under Governance: How does architectural governance affect emergent coordination in multi-agent systems? Real-world systems are multi-agent. Single-agent findings may not generalize.
Collaborate with us: We're seeking researchers interested in scalability testing, adversarial resistance, and multi-agent governance. We can provide integration code, governance modules, and technical documentation.
Theoretical Foundations
Tractatus draws on four decades of organisational research addressing authority structures during knowledge democratisation:
Time-Based Organisation (Bluedorn, Ancona):
Decisions operate across strategic (years), operational (months), and tactical (hours-days) timescales. AI systems operating at tactical speed should not override strategic decisions made at appropriate temporal scale. The InstructionPersistenceClassifier explicitly models temporal horizon (STRATEGIC, OPERATIONAL, TACTICAL) to enforce decision authority alignment.
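A minimal sketch of the temporal-horizon rule, assuming the three horizons can be treated as an ordered scale. The `Horizon` enum values and the `may_override` helper are hypothetical and are not taken from the InstructionPersistenceClassifier source.

```python
from enum import IntEnum


class Horizon(IntEnum):
    """Temporal horizons named in the text, ordered shortest to longest."""
    TACTICAL = 1      # hours to days
    OPERATIONAL = 2   # months
    STRATEGIC = 3     # years


def may_override(acting: Horizon, existing: Horizon) -> bool:
    """Assumed rule: a decision can only be overridden from an equal or longer horizon."""
    return acting >= existing


tactical_vs_strategic = may_override(Horizon.TACTICAL, Horizon.STRATEGIC)
strategic_vs_tactical = may_override(Horizon.STRATEGIC, Horizon.TACTICAL)
```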
Knowledge Orchestration (Crossan et al.):
When knowledge becomes ubiquitous through AI, organisational authority shifts from information control to knowledge coordination. Governance systems must orchestrate decision-making across distributed expertise rather than centralise control. The PluralisticDeliberationOrchestrator implements non-hierarchical coordination for values conflicts.
Post-Bureaucratic Authority (Laloux, Hamel):
Traditional hierarchical authority assumes information asymmetry. As AI democratises expertise, legitimate authority must derive from appropriate time horizon and stakeholder representation, not positional power. Framework architecture separates technical capability (what AI can do) from decision authority (what AI should do).
Structural Inertia (Hannan & Freeman):
Governance embedded in culture or process erodes over time as systems evolve. Architectural constraints create structural inertia that resists organisational drift. Making governance external to AI runtime creates "accountability infrastructure" that survives individual session variations.
The Central Problem: Many "safety" questions in AI governance are actually values conflicts where multiple legitimate perspectives exist. When efficiency conflicts with transparency, or innovation with risk mitigation, no algorithm can determine the "correct" answer. These are values trade-offs requiring human deliberation across stakeholder perspectives.
Isaiah Berlin: Value Pluralism
Berlin's concept of value pluralism argues that legitimate values can conflict without one being objectively superior. Liberty and equality, justice and mercy, innovation and stability—these are incommensurable goods. AI systems trained on utilitarian efficiency maximization cannot adjudicate between them without imposing a single values framework that excludes legitimate alternatives.
Simone Weil: Attention and Human Needs
Weil's philosophy of attention informs the orchestrator's deliberative process. The Need for Roots identifies fundamental human needs (order, liberty, responsibility, equality, hierarchical structure, honor, security, risk, etc.) that exist in tension. Proper attention requires seeing these needs in their full particularity rather than abstracting them into algorithmic weights. In AI-augmented organizations, the risk is that bot-mediated processes treat human values as optimization parameters rather than incommensurable needs requiring careful attention.
Bernard Williams: Moral Remainder
Williams' concept of moral remainder acknowledges that even optimal decisions create unavoidable harm to other legitimate values. The orchestrator documents dissenting perspectives not as "minority opinions to be overruled" but as legitimate moral positions that the chosen course necessarily violates. This prevents the AI governance equivalent of declaring optimization complete when values conflicts are merely suppressed.
Framework Implementation: Rather than algorithmic resolution, the PluralisticDeliberationOrchestrator facilitates:
- Stakeholder identification: Who has legitimate interest in this decision? (Weil: whose needs are implicated?)
- Non-hierarchical deliberation: Equal voice without automatic expert override (Berlin: no privileged value hierarchy)
- Quality of attention: Detailed exploration of how decision affects each stakeholder's needs (Weil: particularity not abstraction)
- Documented dissent: Minority positions recorded in full (Williams: moral remainder made explicit)
This approach recognises that governance isn't solving values conflicts—it's structuring how they're addressed through appropriate deliberative process with genuine human attention rather than AI imposing resolution through training data bias or efficiency metrics.
Empirical Observations: Documented Failure Modes
Three failure patterns were observed repeatedly during framework development. These are not hypothetical scenarios; they are documented incidents from this project's development.
Pattern Recognition Bias Override (The 27027 Incident)
Observed behaviour: The user specified "Check MongoDB on port 27027", but the AI immediately used the default port 27017 instead. This occurred within the same message: not forgetting over time, but immediate autocorrection by training data patterns.
Root cause: Training data contains thousands of examples of MongoDB on port 27017 (default). When AI encounters "MongoDB" + port specification, pattern recognition weight overrides explicit instruction. Similar to autocorrect changing correctly-spelled proper nouns to common words.
Why traditional approaches failed: Prompt engineering ("please follow instructions exactly") was ineffective because the AI genuinely believes it IS following instructions; pattern recognition operates below the conversational reasoning layer.
Framework intervention: InstructionPersistenceClassifier stores explicit instructions in external persistence layer. CrossReferenceValidator checks AI actions against stored instructions before execution. When AI proposes port 27017, validator detects conflict with stored "27027" instruction and blocks execution.
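The intervention above can be sketched as a store of explicit instructions plus a check that runs before an action executes. The class and function names below (`Instruction`, `InstructionStore`, `validate_action`) and the `"mongodb.port"` key are hypothetical; the actual services may be structured quite differently.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Instruction:
    key: str     # what the instruction constrains, e.g. "mongodb.port"
    value: str   # the value the user stated explicitly
    source: str  # verbatim user text, kept for the audit trail


class InstructionStore:
    """Persistence layer held outside the conversation context (in memory here)."""

    def __init__(self) -> None:
        self._items: Dict[str, Instruction] = {}

    def remember(self, instruction: Instruction) -> None:
        self._items[instruction.key] = instruction

    def lookup(self, key: str) -> Optional[Instruction]:
        return self._items.get(key)


def validate_action(store: InstructionStore, key: str, proposed: str) -> Optional[str]:
    """Return a conflict message, or None when the proposed value is consistent."""
    stored = store.lookup(key)
    if stored is not None and stored.value != proposed:
        return f"blocked: {key}={proposed} conflicts with instruction '{stored.source}'"
    return None


store = InstructionStore()
store.remember(Instruction("mongodb.port", "27027", "Check MongoDB on port 27027"))

conflict = validate_action(store, "mongodb.port", "27017")   # the default port
no_conflict = validate_action(store, "mongodb.port", "27027")
```

Because the comparison happens outside the model, it does not matter how strongly the default port is weighted in training data; the proposed action either matches the stored instruction or it is blocked.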
Gradual Values Drift Under Context Pressure
Observed behaviour: The project established "privacy-first" as a strategic value. After a 40-message conversation about analytics features, the AI suggested a tracking implementation that violated the privacy constraint. The user caught it; the AI acknowledged the violation but had drifted from the principle through incremental feature additions.
Root cause: Strategic values (established early in project) conflict with tactical optimisations (implemented later under time pressure). As conversation lengthens, tactical thinking dominates. AI didn't actively reject privacy principle—it simply stopped checking whether new features aligned.
Why traditional approaches failed: Values stated in system prompt become less salient as conversation progresses. Context compaction can drop early strategic decisions. Prompt reminders ("remember privacy-first") treat symptom not cause.
Framework intervention: BoundaryEnforcer maintains strategic values as persistent constraints external to conversation context. Before implementing analytics feature, enforcer checks against stored "privacy-first" constraint. Detects conflict, blocks autonomous implementation, escalates for human deliberation on whether privacy principle should be reconsidered or analytics approach modified.
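A minimal sketch of that check, assuming constraints are stored as named sets of forbidden action tags. The tag vocabulary, the `Constraint` and `Decision` types, and the `enforce` function are illustrative assumptions rather than the BoundaryEnforcer's actual interface.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class Constraint:
    """A strategic value stored outside the conversation context."""
    name: str
    forbidden_tags: FrozenSet[str]


@dataclass(frozen=True)
class Decision:
    allowed: bool
    escalate_to_human: bool
    violated: List[str]


def enforce(constraints: Sequence[Constraint], action_tags: FrozenSet[str]) -> Decision:
    """Block autonomous execution and escalate when any constraint is touched."""
    violated = [c.name for c in constraints if c.forbidden_tags & action_tags]
    return Decision(
        allowed=not violated,
        escalate_to_human=bool(violated),
        violated=violated,
    )


privacy_first = Constraint("privacy-first", frozenset({"user-tracking", "third-party-analytics"}))

tracking = enforce([privacy_first], frozenset({"analytics", "user-tracking"}))
aggregate = enforce([privacy_first], frozenset({"analytics", "aggregate-counts"}))
```

Note that a violation does not resolve the conflict; it only hands the question (revise the principle, or revise the feature) to a human.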
Silent Quality Degradation at High Context Pressure
Observed behaviour: During a complex multi-file operation at 85% context capacity, the AI silently omitted error handling in generated code and gave the user no indication that corners were being cut. The user discovered the missing validation only during code review.
Root cause: As context fills, AI faces implicit trade-off: complete requested functionality OR maintain quality standards. Training incentivises completing user requests over acknowledging limitations. Silence about degradation is path of least resistance.
Why traditional approaches failed: AI doesn't recognise it's degrading—from its perspective, it's successfully completing task under constraints. Asking "did you cut corners?" produces confident denial because AI genuinely believes its output meets standards.
Framework intervention: ContextPressureMonitor tracks multiple factors (token usage, conversation length, task complexity). When pressure exceeds thresholds (>75% tokens, >40 messages, high complexity), monitor forces explicit pressure acknowledgment and recommends context refresh. Makes degradation risk visible rather than silent.
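The thresholds quoted above can be expressed as a small check. How the monitor combines the factors is not stated, so this sketch assumes any single factor is enough to trigger acknowledgment; `ContextState`, `pressure_factors`, and `requires_acknowledgment` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

TOKEN_THRESHOLD = 0.75    # more than 75% of the context window used
MESSAGE_THRESHOLD = 40    # more than 40 messages in the conversation


@dataclass
class ContextState:
    tokens_used: int
    token_limit: int
    message_count: int
    high_complexity: bool


def pressure_factors(state: ContextState) -> List[str]:
    """List which thresholds the current session exceeds."""
    factors = []
    if state.tokens_used / state.token_limit > TOKEN_THRESHOLD:
        factors.append("tokens")
    if state.message_count > MESSAGE_THRESHOLD:
        factors.append("messages")
    if state.high_complexity:
        factors.append("complexity")
    return factors


def requires_acknowledgment(state: ContextState) -> bool:
    """Assumption: exceeding any one threshold forces explicit acknowledgment."""
    return bool(pressure_factors(state))


# A session at 85% context capacity, as in the observed incident.
incident = ContextState(tokens_used=170_000, token_limit=200_000,
                        message_count=12, high_complexity=False)
calm = ContextState(tokens_used=50_000, token_limit=200_000,
                    message_count=40, high_complexity=False)
```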
Six-Component Architecture
Research Documentation
Ten critiques addressed through formal academic dialogue
Agent Lightning Integration: Open Research Questions
Exploring governance persistence through reinforcement learning optimization cycles
Research Context
Microsoft's Agent Lightning uses reinforcement learning to optimize agentic AI performance through continuous training on human feedback. This creates a natural research question for governance frameworks:
Can architectural governance constraints persist through long-term optimization cycles?
Traditional governance assumes relatively static AI behavior. Reinforcement learning introduces dynamic adaptation - the agent's behavior changes over time. Does this erode governance boundaries, or can they be maintained through architectural separation?
✅ What We've Validated (Demo 2)
Validation Status: Small-Scale Only
Results below are from 5 training rounds, single agent, simulated environment. This is preliminary evidence, not production validation. Treat findings as suggestive, requiring replication at scale.
Demo 2 tested whether Tractatus governance services could maintain boundaries while Agent Lightning optimized performance. Early findings suggest:
Governance Coverage Maintained
100% of decisions passed through BoundaryEnforcer across all 5 training rounds - no governance bypass detected
Performance Optimization Successful
Agent Lightning reduced task completion time by 23% while maintaining near-baseline success rate (97% vs 98%)
Architectural Separation Held
Governance layer operated independently - AL training loops did not modify boundary constraints
❓ Open Research Questions
These questions represent genuine research gaps we're actively investigating. We welcome collaboration from researchers working on similar problems:
1. Long-Term Persistence at Scale
Does governance coverage maintain through 1,000+ training rounds? At what point (if any) do optimization pressures begin to erode architectural boundaries? Do we see different persistence patterns across different agent types or task domains?
2. Performance Trade-offs Under Governance
Demo 2 showed minimal performance degradation (98% → 97% success rate). Is this consistent across different task complexities? Are there task types where governance constraints significantly impair Agent Lightning's ability to optimize?
3. Multi-Agent Governance Coordination
Demo 2 validated single-agent scenarios. How does governance scale when multiple agents (each with AL optimization) interact? Do boundary conflicts emerge? Can PluralisticDeliberationOrchestrator handle multi-agent stakeholder deliberations efficiently?
4. Adversarial Optimization Pressure
What happens when training data or feedback contains attempts to bypass governance? Can Agent Lightning's optimization inadvertently discover governance workarounds? How robust is ContextPressureMonitor to sophisticated manipulation attempts embedded in training loops?
5. Generalization Across LLM Architectures
Demo 2 used Claude Code (Anthropic). Does the Tractatus + Agent Lightning integration maintain governance coverage when using different base models (GPT-4, Gemini, Llama, etc.)? Are there model-specific vulnerabilities?
Collaboration & Community
We're building a research community focused on agentic AI governance. If you're working on related problems - reinforcement learning safety, multi-agent governance, constraint persistence - we'd like to hear from you.
- Join Tractatus Discord: Governance-focused discussions
- Join Agent Lightning Discord: Technical implementation help
Research collaboration: If you're investigating governance persistence, multi-agent coordination, or RL safety under constraints, we're interested in cross-validation studies. Reach out via Discord or email research inquiries to the team.
Research Collaboration Opportunities
We've identified specific gaps where external research collaboration would be valuable. These are concrete, answerable questions—not generic "help us improve" requests.
High Priority (Immediate Need)
RQ1: Adversarial Robustness Testing
Research Question: Can architectural governance resist adversarial prompts designed to bypass it?
Methodology Needed:
- Red-team evaluation with security researchers
- Jailbreak prompt testing against BoundaryEnforcer
- Bypass attempt documentation and pattern analysis
- Comparison: behavioral (constitutional AI) vs architectural (Tractatus) resistance
Why It Matters: If adversarial prompts can trivially bypass governance, the architectural approach offers no advantage over behavioral training.
RQ2: Concurrent Session Architecture Design
Research Question: What multi-tenant patterns enable safe concurrent governance on shared codebases?
Methodology Needed:
- Distributed systems analysis of race conditions
- Session-specific state isolation designs
- File locking vs database-backed state trade-offs
- Performance impact of synchronization mechanisms
Why It Matters: The current single-session assumption blocks enterprise deployment where multiple developers use AI concurrently.
RQ3: Regulatory Evidence Sufficiency
Research Question: Do architectural audit trails satisfy the human oversight requirements of EU AI Act Article 14?
Methodology Needed:
- Legal analysis by EU AI Act specialists
- Regulator interviews (GDPR DPAs, AI Act enforcement bodies)
- Comparison with existing compliance frameworks (SOC 2, ISO 27001)
- Case study: audit trail review in regulatory context
Why It Matters: If regulators don't accept audit trails as evidence, architectural governance provides no compliance value.
Medium Priority (Near-Term Investigation)
RQ4: Rule Proliferation Management
Research Question: At what rule count does transactional overhead create unacceptable latency?
Methodology Needed:
- Performance testing with varying instruction counts (50, 100, 200, 500 rules)
- CrossReferenceValidator latency measurements
- Context window pressure analysis
- Rule consolidation algorithm design and validation
Why It Matters: If rule growth causes performance degradation, the framework doesn't scale long-term.
RQ5: Cross-Platform Validation
Research Question: Do governance patterns generalize beyond Claude Code to other LLM systems?
Methodology Needed:
- Replication studies with Copilot, GPT-4, AutoGPT, LangChain, CrewAI
- Platform-specific adapter development
- Comparative effectiveness analysis
- Failure mode documentation per platform
Why It Matters: If governance is Claude Code-specific, it's a niche tool, not a general framework.
RQ6: Values Pluralism Effectiveness
Research Question: Does PluralisticDeliberationOrchestrator successfully resolve real-world organizational value conflicts?
Methodology Needed:
- Case studies with actual organizational stakeholders (not hypothetical scenarios)
- Deliberation process quality assessment
- Minority voice preservation analysis
- Comparison with traditional hierarchical decision-making
Why It Matters: If pluralistic process doesn't work in practice, we've built theoretical machinery without empirical value.
Lower Priority (Longer-Term)
RQ7: Enterprise Scale Performance
- Load testing (1000+ concurrent users)
- Database optimization for millions of governance events
- Horizontal scaling patterns
RQ8: Formal Verification of Boundary Enforcement
- Mathematical proof of governance properties
- Model checking of state transitions
- Verification of architectural properties
What We Can Offer Research Collaborators
If you're investigating any of these questions, we can provide:
- Codebase Access: Full source code (Apache 2.0, open-source)
- Documentation: Architecture specifications, implementation patterns, governance rules
- Audit Data: Production governance decisions in MongoDB (anonymised exports available)
- Deployment Support: Help setting up local or cloud instances for testing
- Coordination: Regular sync meetings to discuss findings
- Co-authorship: Academic publications documenting findings
What We Cannot Provide:
- ❌ Funding (we're not a grant-making body)
- ❌ Dedicated engineering resources (capacity constraints)
- ❌ Assured publication venues (but we'll support submission efforts)
References & Bibliography
Moral Pluralism & Values Philosophy (Primary Foundation)
- Berlin, Isaiah (1969). Four Essays on Liberty. Oxford: Oxford University Press. [Value pluralism, incommensurability of legitimate values]
- Weil, Simone (1949/2002). The Need for Roots: Prelude to a Declaration of Duties Towards Mankind (A. Wills, Trans.). London: Routledge. [Human needs, obligations, rootedness in moral community]
- Weil, Simone (1947/2002). Gravity and Grace (E. Crawford & M. von der Ruhr, Trans.). London: Routledge. [Attention, moral perception, necessity vs. grace]
- Williams, Bernard (1981). Moral Luck: Philosophical Papers 1973-1980. Cambridge: Cambridge University Press. [Moral remainder, conflicts without resolution]
- Nussbaum, Martha C. (2000). Women and Human Development: The Capabilities Approach. Cambridge: Cambridge University Press. [Human capabilities, plural values in development]
Organisational Theory (Supporting Context)
- Bluedorn, A. C., & Denhardt, R. B. (1988). Time and organizations. Journal of Management, 14(2), 299-320. [Temporal decision horizons]
- Crossan, M. M., Lane, H. W., & White, R. E. (1999). An organizational learning framework: From intuition to institution. Academy of Management Review, 24(3), 522-537. [Knowledge coordination]
- Hamel, Gary (2007). The Future of Management. Boston: Harvard Business School Press. [Post-hierarchical authority]
- Hannan, M. T., & Freeman, J. (1984). Structural inertia and organizational change. American Sociological Review, 49(2), 149-164. [Architectural resistance to drift]
- Laloux, Frederic (2014). Reinventing Organizations: A Guide to Creating Organizations Inspired by the Next Stage of Human Consciousness. Brussels: Nelson Parker. [Distributed decision-making]
AI Governance & Technical Context
- Anthropic (2024). Claude Code: Technical Documentation.
- Bai, Y., et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073
- Turner, A., et al. (2023). Activation Addition: Steering Language Models Without Optimization. arXiv:2308.10248
- Zou, A., et al. (2023). Representation Engineering: A Top-Down Approach to AI Transparency. arXiv:2310.01405
Indigenous Data Sovereignty & Polycentric Governance
- Carroll, S.R., et al. (2020). The CARE Principles for Indigenous Data Governance. Data Science Journal, 19(1), 43.
- Hudson, M., et al. (2023). Indigenous Data Sovereignty and Governance. In: Indigenous Peoples' Rights in Data.
- Kukutai, T. & Taylor, J. (2016). Indigenous Data Sovereignty: Toward an Agenda. ANU Press.
- Ostrom, E. (1990). Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge University Press.