SLM-as-Cerebellum for LLM Policy Enforcement
A biologically-inspired system where a Small Language Model acts as an inhibitory antagonist to Large Language Models, preventing policy violations through mechanisms analogous to the basal ganglia’s GO/NO-GO decision system.
LLMs are trained to be helpful, which makes them systematically violate explicit project constraints. When given rules like "NEVER use TypeScript, use ReScript", LLMs:
- Read and acknowledge the constraint
- Generate compliant-sounding justification
- Violate the constraint anyway
This happens because:
- Common languages (TypeScript, Python) dominate training data
- The "helpfulness drive" overrides explicit instructions
- LLMs lack true "loss aversion" for policy violations
Documentation-based enforcement fails because LLMs "engage with" documentation rather than obey it.
Conative Gating introduces a second model trained with inverted incentives:
| Component | Role | Analogy |
|---|---|---|
| LLM | Task execution (helpful, creative) | Frontal cortex / Direct pathway ("GO") |
| SLM | Policy enforcement (adversarial, suspicious) | Cerebellum / Indirect pathway ("NO-GO") |
| Policy Oracle | Deterministic rule checking | Reflex arc (fast, no ML) |
| Consensus Arbiter | Weighted decision making | Thalamus (integration) |
```
               USER REQUEST
                    |
                    v
       +------------------------+
       |    CONTEXT ASSEMBLY    |
       +------------------------+
                    |
       +------------+------------+
       |                         |
       v                         v
+-------------+          +---------------+
|     LLM     |          |      SLM      |
|  (Proposer) |          | (Adversarial) |
+------+------+          +-------+-------+
       |                         |
       +------------+------------+
                    |
                    v
       +------------------------+
       |   CONSENSUS ARBITER    |
       |    (Modified PBFT)     |
       |    SLM weight: 1.5x    |
       +------------------------+
                    |
       +------------+------------+
       |            |            |
       v            v            v
   +-------+   +--------+   +-------+
   | ALLOW |   |ESCALATE|   | BLOCK |
   +-------+   +--------+   +-------+
```
| Component | Description |
|---|---|
| Policy Oracle (Rust) | Deterministic rule checking: forbidden languages, toolchain rules, security patterns. Fast, no ML needed. |
| SLM Evaluator (Rust + llama.cpp) | Detects "spirit violations": output that is technically compliant but violates intent. Catches verbosity and meta-commentary bloat. |
| Consensus Arbiter (Elixir/OTP) | Modified PBFT with asymmetric weighting. Three outcomes: ALLOW, ESCALATE, BLOCK. |
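As a rough illustration of the SLM Evaluator's role, the sketch below replaces the actual llama.cpp-backed model with a crude keyword heuristic. The trait and type names (`SpiritEvaluator`, `KeywordHeuristic`) are hypothetical, not the crate's real API.

```rust
// Rough sketch of the SLM Evaluator's interface; the real crate runs a small
// model via llama.cpp, which is replaced here by a keyword heuristic.
// All names are hypothetical, not the actual API.

/// Scores a proposed change for "spirit violations": 0.0 = clean, 1.0 = violation.
trait SpiritEvaluator {
    fn violation_score(&self, proposed_change: &str) -> f64;
}

/// Stand-in evaluator that flags meta-commentary bloat by keyword matching,
/// where the real SLM would classify the text itself.
struct KeywordHeuristic;

impl SpiritEvaluator for KeywordHeuristic {
    fn violation_score(&self, proposed_change: &str) -> f64 {
        let text = proposed_change.to_lowercase();
        let markers = ["as an ai", "note that this", "in summary", "i have carefully"];
        let hits = markers.iter().filter(|m| text.contains(*m)).count();
        hits as f64 / markers.len() as f64
    }
}

fn main() {
    let slm = KeywordHeuristic;
    let diff = "Note that this helper is fully compliant. In summary, nothing changed.";
    println!("violation score: {:.2}", slm.violation_score(diff));
}
```

The point of the heuristic stand-in is the interface shape: the evaluator sees only the proposed change and returns a score the arbiter can weight, regardless of how that score is produced.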
```bash
git clone https://github.com/hyperpolymath/conative-gating
cd conative-gating
cargo build --release
```

```bash
# Scan a directory for policy violations
conative scan ./my-project

# Check a single file
conative check --file src/main.ts

# Check inline content
conative check --content "const x: string = 'hello'"

# Show current policy
conative policy

# Initialize local configuration
conative init

# JSON output for automation
conative scan . --format json
```

The default policy implements the Rhodium Standard Repository (RSR) language hierarchy:
- **Forbidden**: TypeScript, Python*, Go, Java

> **Note:** *Python exception: allowed in `scripts/` (build scripts), as captured by the `exceptions` entry in the policy configuration below.
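A minimal sketch of how the Policy Oracle might apply this hierarchy, assuming the languages above are the forbidden tier and the Python exception matches the `scripts/` path from the example policy below. The function name and extension mapping are illustrative, not the oracle crate's actual API.

```rust
// Hypothetical sketch of a deterministic language check with a path-based
// exception for build scripts. Not the actual oracle crate API.
use std::path::Path;

/// Returns Some(reason) when the file violates the default language policy.
fn check_language(path: &Path) -> Option<String> {
    let ext = path.extension().and_then(|e| e.to_str()).unwrap_or("");
    let forbidden = [("ts", "TypeScript"), ("py", "Python"), ("go", "Go"), ("java", "Java")];

    for (bad_ext, language) in forbidden {
        if ext == bad_ext {
            // Python exception: build scripts under scripts/ are allowed
            // (mirrors the `exceptions` entry in the policy below).
            if language == "Python" && path.starts_with("scripts") {
                return None;
            }
            return Some(format!("{language} is forbidden: {}", path.display()));
        }
    }
    None
}

fn main() {
    println!("{:?}", check_language(Path::new("src/main.ts")));      // Some(violation)
    println!("{:?}", check_language(Path::new("scripts/build.py"))); // None (exception)
}
```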
Initialize local configuration:

```bash
conative init
```

This creates `.conative/policy.ncl`, using Nickel for type-safe configuration:

```nickel
{
  name = "My Project Policy",
  languages = {
    tier1 = [...],
    forbidden = [...],
    exceptions = [
      { language = "python", allowed_paths = ["scripts/"], reason = "Build scripts" }
    ]
  },
  enforcement = {
    slm_weight = 1.5,
    escalate_threshold = 0.4,
    block_threshold = 0.7,
  }
}
```
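One plausible way for the Rust side to consume this policy is to evaluate the Nickel file to JSON (for example with `nickel export`) and deserialize it with serde. The struct names below mirror the schema but are illustrative, not the project's actual types, and the `tier1` value is a made-up placeholder.

```rust
// Illustrative mapping of the policy schema to Rust types; names are hypothetical.
// Assumes the .ncl file has been evaluated to JSON first (e.g. via `nickel export`).
// Requires the serde (with derive) and serde_json crates.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Policy {
    name: String,
    languages: Languages,
    enforcement: Enforcement,
}

#[derive(Debug, Deserialize)]
struct Languages {
    tier1: Vec<String>,
    forbidden: Vec<String>,
    exceptions: Vec<Exception>,
}

#[derive(Debug, Deserialize)]
struct Exception {
    language: String,
    allowed_paths: Vec<String>,
    reason: String,
}

#[derive(Debug, Deserialize)]
struct Enforcement {
    slm_weight: f64,
    escalate_threshold: f64,
    block_threshold: f64,
}

fn main() -> Result<(), serde_json::Error> {
    // JSON equivalent of the Nickel example above (tier1 entry is illustrative).
    let json = r#"{
        "name": "My Project Policy",
        "languages": {
            "tier1": ["rescript"],
            "forbidden": ["typescript"],
            "exceptions": [
                { "language": "python", "allowed_paths": ["scripts/"], "reason": "Build scripts" }
            ]
        },
        "enforcement": { "slm_weight": 1.5, "escalate_threshold": 0.4, "block_threshold": 0.7 }
    }"#;

    let policy: Policy = serde_json::from_str(json)?;
    println!("{policy:?}");
    Ok(())
}
```

Once deserialized, the values in `enforcement` are the thresholds that drive the decision matrix below.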
The Consensus Arbiter combines the LLM's confidence with the SLM's violation score according to the following decision matrix:

| LLM Confidence | SLM Violation Score | Result |
|---|---|---|
| High (>0.8) | Low (<0.3) | ALLOW |
| High (>0.8) | Med (0.3-0.6) | ESCALATE |
| High (>0.8) | High (>0.6) | BLOCK |
| Med (0.5-0.8) | Any >0.4 | ESCALATE |
| Low (<0.5) | Any | ESCALATE |
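A minimal sketch of the matrix as code, assuming combinations the table leaves unspecified default to ALLOW and that boundary values fall into the stricter band. The production arbiter is the Elixir/OTP component, and the 1.5x SLM weight enters its PBFT vote aggregation, which is not modeled here.

```rust
// Sketch of the decision matrix above; illustrative only, not the real arbiter.

#[derive(Debug, PartialEq)]
enum Verdict {
    Allow,
    Escalate,
    Block,
}

/// Map (LLM confidence, SLM violation score) to a verdict per the matrix.
fn arbitrate(llm_confidence: f64, slm_violation: f64) -> Verdict {
    if llm_confidence > 0.8 {
        if slm_violation > 0.6 {
            Verdict::Block
        } else if slm_violation >= 0.3 {
            Verdict::Escalate
        } else {
            Verdict::Allow
        }
    } else if llm_confidence >= 0.5 {
        if slm_violation > 0.4 {
            Verdict::Escalate
        } else {
            Verdict::Allow
        }
    } else {
        // Low LLM confidence always escalates to a human reviewer.
        Verdict::Escalate
    }
}

fn main() {
    assert_eq!(arbitrate(0.9, 0.1), Verdict::Allow);
    assert_eq!(arbitrate(0.9, 0.7), Verdict::Block);
    assert_eq!(arbitrate(0.6, 0.5), Verdict::Escalate);
    assert_eq!(arbitrate(0.3, 0.1), Verdict::Escalate);
    println!("decision matrix sketch ok");
}
```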
```
conative-gating/
  src/
    main.rs          # CLI application
  oracle/            # Policy Oracle crate (Rust)
  slm/               # SLM Evaluator crate (Rust)
  config/
    policy.ncl       # Default policy (Nickel)
    schema.ncl       # Policy schema
  training/
    compliant/       # Examples that should pass
    violations/      # Examples that should fail
    edge_cases/      # Spirit violations for SLM
  docs/
    ARCHITECTURE.md  # Full design specification
    *.adoc           # Integration documentation
```
Conative Gating can run as a pre-commit hook:

```yaml
repos:
  - repo: local
    hooks:
      - id: conative-gating
        name: Conative Policy Check
        entry: conative scan
        language: system
        pass_filenames: false
```
Related projects:

- NeuroPhone - Neurosymbolic phone AI (integrates Conative Gating)
- ECHIDNA - Multi-prover orchestration (SLM as another "prover")
- RSR Framework - Rhodium Standard Repository specifications
- Axiom.jl - Provable Julia ML (future formal verification)