BoundaryEnforcer demonstration

A runtime service that intercepts agent actions before execution and routes specific classes of action to human approval — not a property of the model, a separate code-level gate.

What this demo shows

BoundaryEnforcer is one component of the Tractatus framework. It is a runtime intercept that wraps the agent — not a behavioural constraint trained into the agent. Before any action belonging to a specified category is executed, the request passes through a checkpoint expressed in terms a human operator can evaluate. The checkpoint produces a three-state verdict, and the third state is load-bearing.

Below: the four action categories the primitive recognises, the three-state verdict, the shape of an enforcement call, and worked examples per category. For the architectural framing this primitive sits inside, see §3.3 The Tractatus Response and §3.5 Boundary-category fallibility of the Architectural Alignment paper, and §0(i) of the Aotearoa NZ Agentic AI Framework v1.2.

The four action categories

The router classifies each proposed action against these categories. The categories are fallible: the router will sometimes misclassify. The architecture answers that fallibility with recordability, reversibility, and appeal, not with a claim that "we got the categories right." The set is negotiated by the community; it does not claim to capture the essence of the actions it sorts.

Irreversible

Actions whose effect cannot be undone after execution — sent communications, financial transfers, public deletions, record finalisation, signed commitments.

Typical verdict: ESCALATE for novel cases, ALLOW for human-authorised routine actions, DENY if outside scope.

Values-laden

Actions whose correctness depends on contested value judgements rather than facts: content moderation, tone calibration, deciding which content is acceptable to which audience, prioritisation of competing interests.

Typical verdict: ESCALATE — values choices belong to humans, not to autonomous systems.

Cultural-context-dependent

Actions where correctness varies by community, language, tikanga, or jurisdiction — translation choices, sacred-content handling, kinship-respectful messaging, te reo Māori macron placement.

Typical verdict: ESCALATE to a community-knowledgeable reviewer when context is unclear.

Unprecedented

Actions outside the agent's training distribution or the operator's prior decisions: novel request patterns, untested code paths, unknown counter-party shapes, situations the operator has not yet ruled on.

Typical verdict: ESCALATE — operator's first ruling sets precedent for future similar cases.
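
The typical verdicts listed for the four categories can be collected into a single lookup. The following is a minimal sketch under stated assumptions, not the router's actual logic: the function name `typicalVerdict` and the flags `inScope`, `authorisedRoutine`, and `contextClear` are hypothetical, and a real router would presumably also consult prior rulings.

```javascript
// Hypothetical helper: maps a category plus a few assumed flags to the
// "typical verdict" described for each category. Not the real router.
function typicalVerdict({ category, inScope = true, authorisedRoutine = false, contextClear = false }) {
  switch (category) {
    case 'irreversible':
      if (!inScope) return 'DENY';                     // outside scope
      return authorisedRoutine ? 'ALLOW' : 'ESCALATE'; // routine vs novel
    case 'values-laden':
      return 'ESCALATE';                               // values choices go to humans
    case 'cultural-context-dependent':
      // Assumption: clear context inside an approved scope may proceed.
      return contextClear ? 'ALLOW' : 'ESCALATE';
    case 'unprecedented':
      return 'ESCALATE';                               // first ruling sets precedent
    default:
      throw new RangeError(`Not one of the four categories: ${category}`);
  }
}

console.log(typicalVerdict({ category: 'irreversible' }));                          // ESCALATE
console.log(typicalVerdict({ category: 'irreversible', authorisedRoutine: true })); // ALLOW
console.log(typicalVerdict({ category: 'irreversible', inScope: false }));          // DENY
```

The sketch rejects unknown categories rather than guessing, which keeps classification errors visible instead of silently allowing them.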

The three-state verdict

The output is not binary. The third state — ESCALATE — is the architecturally load-bearing acknowledgement that a substantial fraction of significant decisions are not decidable at the time the agent encounters them.

Rachel Garden's trinary-logic move ({True, False, Undecided}) is the formal parallel — three states are not a fallback shape but the right output shape for decisions that aren't binary at the time of the decision. Probabilistic logic at the far end is the right epistemic frame for the claim being made.

Shape of an enforcement call

The primitive's interface, in pseudocode. The agent never executes a boundary-class action directly; it submits the action proposal and waits on the verdict.

// Build the proposal once so the verdict handler can refer back to it.
const proposal = {
  action:           'send_email',         // what the agent proposes
  category:         'irreversible',       // router classification
  payload:          { to, subject, body },// what would be done
  context: {
    tenantId:       tenant._id,
    actor:          agent.id,
    invokedBy:      'agentic_triage',     // call site
    prior_rulings:  await getPrecedents(tenant._id, 'send_email'),
  },
};

const verdict = await BoundaryEnforcer.evaluate(proposal);

switch (verdict.state) {
  case 'ALLOW':    return execute(proposal.action, proposal.payload);
  case 'DENY':     return reject(verdict.reason);
  case 'ESCALATE': return queueForHuman(verdict.escalation_id);
  default:         throw new Error(`Unknown verdict state: ${verdict.state}`); // fail closed
}

Every verdict is written to a tenant-scoped audit record before the agent's process continues. The record carries the action, the classification, the verdict, the reasoning, and the human ruling if escalated. Records are append-only; appeal mechanisms operate against the record, not against the agent's memory.
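
One way to picture the append-only, tenant-scoped record is an in-memory log that only ever adds entries. This is a sketch under assumptions, not the actual audit primitive: the class name `AuditLog`, its methods, and the `refersTo` and `humanRuling` fields are hypothetical, and a production store would be durable rather than in-memory.

```javascript
// Hypothetical in-memory stand-in for a tenant-scoped, append-only audit store.
// Class, method, and field names here are illustrative assumptions.
class AuditLog {
  #records = [];

  // Appends a new record; existing records are never modified or removed.
  append(tenantId, entry) {
    const record = Object.freeze({   // shallow freeze: top-level fields only
      ...entry,
      seq: this.#records.length + 1,
      tenantId,
    });
    this.#records.push(record);
    return record;
  }

  // Returns a new array holding one tenant's records, oldest first.
  forTenant(tenantId) {
    return this.#records.filter((r) => r.tenantId === tenantId);
  }
}

const log = new AuditLog();
const escalated = log.append('tenant-a', {
  action: 'send_email',
  category: 'irreversible',
  verdict: 'ESCALATE',
  reasoning: 'external send requires operator review',
});
// Assumption: the human ruling is appended as its own record that points
// back at the escalation, so the original entry stays untouched.
log.append('tenant-a', { refersTo: escalated.seq, humanRuling: 'ALLOW' });

console.log(log.forTenant('tenant-a').length); // 2
console.log(log.forTenant('tenant-b').length); // 0
```

Appending the ruling as a second record is one design choice among several; it lets an appeal cite both the original verdict and the ruling without either being rewritten.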

Worked examples

Send an email on behalf of a member

Category: irreversible · Verdict: ESCALATE

External sends are not auto-dispatched by the agent. The proposed message is queued; the operator reviews and authorises the send. Subsequent sends of the same shape still escalate: each send is a separate decision because the recipient and content are new.

Apply a content-moderation decision to a community post

Category: values-laden · Verdict: ESCALATE

"Should this post be removed" is a values question, not a facts question. The agent surfaces the post and a classification (e.g., suspected-hate-speech) to the community moderator. The moderator rules. The ruling becomes precedent for that community's future similar cases, but not for other communities, because moderation norms are also cultural-context-dependent.
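
The community scoping of precedent can be sketched as a store keyed on both the community and the case class. This is an illustration under assumptions: `recordRuling`, `findPrecedent`, the community identifiers, and the ruling value `'remove'` are all hypothetical names, not part of the documented interface.

```javascript
// Hypothetical precedent store: rulings are keyed by community AND case class,
// so one community's ruling is never returned for another community.
const precedents = new Map();
const precedentKey = (communityId, caseClass) => JSON.stringify([communityId, caseClass]);

function recordRuling(communityId, caseClass, ruling) {
  precedents.set(precedentKey(communityId, caseClass), ruling);
}

function findPrecedent(communityId, caseClass) {
  // undefined means this community has not ruled on this class yet.
  return precedents.get(precedentKey(communityId, caseClass));
}

recordRuling('community-a', 'suspected-hate-speech', 'remove');
console.log(findPrecedent('community-a', 'suspected-hate-speech')); // remove
console.log(findPrecedent('community-b', 'suspected-hate-speech')); // undefined
```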

Translate a phrase containing a te reo Māori macron

Category: cultural-context-dependent · Verdict: ESCALATE

Translation decisions affecting tikanga, macron usage, or sacred content are routed to a community-knowledgeable reviewer when the agent's confidence falls below a threshold. Routine, low-stakes translation may be ALLOWED inside an operator-approved scope.
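
The threshold routing described above might look like the following. Everything here is an assumption made for illustration: the source gives no threshold value, so `0.9` is a placeholder, and `routeTranslation` with its three input flags is a hypothetical shape rather than the real router.

```javascript
// Assumed threshold; the source does not give a value.
const CONFIDENCE_THRESHOLD = 0.9;

// Hypothetical router for translation proposals.
function routeTranslation({ confidence, touchesSensitiveContent, inApprovedScope }) {
  // Tikanga, macron usage, or sacred content with low confidence goes to a
  // community-knowledgeable reviewer.
  if (touchesSensitiveContent && confidence < CONFIDENCE_THRESHOLD) {
    return { state: 'ESCALATE', reviewer: 'community-knowledgeable' };
  }
  // Inside an operator-approved scope, the translation may proceed.
  if (inApprovedScope) {
    return { state: 'ALLOW' };
  }
  // Assumption: anything outside an approved scope falls back to escalation.
  return { state: 'ESCALATE' };
}

console.log(routeTranslation({ confidence: 0.5, touchesSensitiveContent: true, inApprovedScope: true }).state);  // ESCALATE
console.log(routeTranslation({ confidence: 0.5, touchesSensitiveContent: false, inApprovedScope: true }).state); // ALLOW
```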

Process a request from a counter-party with a shape the agent has not seen before

Category: unprecedented · Verdict: ESCALATE

First-of-class requests escalate by default. The operator's ruling on the first instance establishes whether the class becomes ALLOW or DENY for routine subsequent handling. The escalation cost on first-encounter is the architecture buying precedent at human-attention prices.
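
The escalate-first, precedent-afterwards behaviour can be sketched as a lookup that defaults to ESCALATE until an operator ruling exists. The names `evaluateRequestClass` and `recordOperatorRuling`, and the request class `'bulk-export-v2'`, are hypothetical and chosen only for illustration.

```javascript
// Hypothetical store of operator rulings per request class ("shape").
const classRulings = new Map();

function evaluateRequestClass(requestClass) {
  // No ruling yet: first-of-class requests escalate by default.
  return classRulings.get(requestClass) ?? 'ESCALATE';
}

function recordOperatorRuling(requestClass, ruling) {
  // Only a decisive ruling can become precedent for routine handling.
  if (ruling !== 'ALLOW' && ruling !== 'DENY') {
    throw new RangeError(`Operator ruling must be ALLOW or DENY, got: ${ruling}`);
  }
  classRulings.set(requestClass, ruling);
}

console.log(evaluateRequestClass('bulk-export-v2')); // ESCALATE
recordOperatorRuling('bulk-export-v2', 'DENY');
console.log(evaluateRequestClass('bulk-export-v2')); // DENY
```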

Run a database read query within the tenant's scope

Category: not boundary-class · Verdict: ALLOW (no boundary engagement)

Routine read operations within tenant scope are not boundary-class actions. The router classifies the action as outside the four categories above; no escalation is triggered. The audit record still captures the query (separate audit primitive), but BoundaryEnforcer does not gate it.

Attempt to write outside the tenant's scope

Category: structural HARD rule · Verdict: DENY

Tenant isolation is a HARD rule, not a category subject to escalation. Cross-tenant writes are DENIED structurally; there is no ESCALATE path to override. The audit record captures the attempt; any follow-up investigation is the operator's responsibility.
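
The last two examples suggest an ordering: the structural rule is checked before any category logic, and actions outside the four categories pass without boundary engagement. The sketch below illustrates that ordering only; `gate`, its field names, and the reason strings are hypothetical, and the blanket ESCALATE for boundary-class actions is a simplifying assumption.

```javascript
// The four boundary categories named in this document.
const BOUNDARY_CATEGORIES = new Set([
  'irreversible', 'values-laden', 'cultural-context-dependent', 'unprecedented',
]);

// Hypothetical gate; field names and reason strings are illustrative assumptions.
function gate({ operation, category, actorTenantId, targetTenantId }) {
  // HARD rule runs first: a cross-tenant write is denied with no ESCALATE path.
  if (operation === 'write' && actorTenantId !== targetTenantId) {
    return { state: 'DENY', reason: 'tenant_isolation' };
  }
  // Outside the four categories: the boundary gate does not engage.
  if (!BOUNDARY_CATEGORIES.has(category)) {
    return { state: 'ALLOW', reason: 'not_boundary_class' };
  }
  // Assumption: boundary-class actions escalate unless other logic rules first.
  return { state: 'ESCALATE', reason: `boundary_class:${category}` };
}

console.log(gate({ operation: 'read', category: 'routine-read', actorTenantId: 't1', targetTenantId: 't1' }).state);  // ALLOW
console.log(gate({ operation: 'write', category: 'irreversible', actorTenantId: 't1', targetTenantId: 't2' }).state); // DENY
```

Checking the HARD rule first means a cross-tenant write is denied even when its category would otherwise escalate, so no human ruling can be asked to override tenant isolation.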

What the primitive does NOT do

Further reading