Who it is for
Teams responsible for how AI is actually used, not just approved.
If you’re dealing with multiple tools, multiple models, or competing pressures from cost, speed and risk, this is the layer you’re already operating in, whether you’ve named it or not.
The real problem
Most firms don’t have a routing layer. They have:
- Default tools that get overused
- Workarounds that become standard practice
- “Trusted” individuals making judgement calls no one else can reproduce
That works until it doesn’t.
When something goes wrong, the question is simple: why did this task go through that path?
If the answer is “that’s what we usually do”, you don’t have a system. You have drift.
What a routing layer actually is
A routing layer is not model selection.
It is a policy system that decides:
- which tasks are allowed to run
- where they are allowed to run
- under what constraints
- with what level of oversight
- and who is accountable for the outcome
It sits between user intent and execution.
Done properly, it becomes the control point for cost, risk and consistency.
Start with task shape, not tools
Most implementations start with tools. That’s the mistake.
You need to define tasks in a way that reflects legal work as it actually happens:
- Clause extraction from known document types
- Open-ended research across uncertain sources
- Drafting with direct client reliance
- Internal summarisation with no downstream exposure
These are not the same problem. Treating them as interchangeable is how routing breaks down.
A usable taxonomy is:
- Deterministic vs interpretive
- Closed corpus vs open world
- Internal vs external consumption
- Low vs high reliance
If your taxonomy can’t distinguish these cleanly, your routing decisions will collapse under edge cases.
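One way to keep the taxonomy clean is to write it down as a data structure rather than a wiki page. A minimal sketch, with two of the task types above mapped onto the four axes; all names and the specific axis values chosen for each task are illustrative:

```python
from dataclasses import dataclass
from enum import Enum

class Mode(Enum):
    DETERMINISTIC = "deterministic"
    INTERPRETIVE = "interpretive"

class Corpus(Enum):
    CLOSED = "closed"
    OPEN = "open"

class Consumption(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"

class Reliance(Enum):
    LOW = "low"
    HIGH = "high"

@dataclass(frozen=True)
class TaskShape:
    mode: Mode
    corpus: Corpus
    consumption: Consumption
    reliance: Reliance

# "Clause extraction from known document types": deterministic,
# closed corpus, internal consumption, low reliance.
clause_extraction = TaskShape(Mode.DETERMINISTIC, Corpus.CLOSED,
                              Consumption.INTERNAL, Reliance.LOW)

# "Drafting with direct client reliance": interpretive, external,
# high reliance (closed corpus assumed here for illustration).
client_drafting = TaskShape(Mode.INTERPRETIVE, Corpus.CLOSED,
                            Consumption.EXTERNAL, Reliance.HIGH)
```

Two tasks that land on different corners of this structure are, by construction, not the same routing problem.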
Add policy dimensions that force trade-offs
Routing only becomes meaningful when it forces explicit trade-offs.
At a minimum:
- Sensitivity: What is the data exposure risk? Client confidential, restricted, public?
- Destination: Where does the output go? Internal note, client deliverable, system of record?
- Reliance: Will someone act on this without re-checking?
- Urgency: Is speed critical, or is there time for layered review?
These are not labels. They are constraints.
For example:
- High sensitivity + external destination + high reliance → eliminates most cloud models, introduces mandatory review, may require private inference
- Low sensitivity + internal + low reliance → allows cheaper models, faster paths, minimal oversight
This is where routing starts to control cost and risk in a real way.
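The two trade-off examples above can be expressed as a constraint function. A sketch, assuming illustrative string labels and a default-deny fallback for anything the policy has not explicitly decided:

```python
def constraints(sensitivity: str, destination: str, reliance: str) -> dict:
    """Map policy dimensions to execution constraints.

    Labels ("high", "external", ...) and field names are illustrative,
    not a prescribed schema.
    """
    if sensitivity == "high" and destination == "external" and reliance == "high":
        return {
            "allow_public_cloud": False,  # eliminates most cloud models
            "mandatory_review": True,     # review before anything leaves the firm
            "private_inference": True,    # may require private inference
        }
    if sensitivity == "low" and destination == "internal" and reliance == "low":
        return {
            "allow_public_cloud": True,   # cheaper models, faster paths
            "mandatory_review": False,    # minimal oversight
            "private_inference": False,
        }
    # Any combination without an explicit rule is blocked, not defaulted:
    # a routing layer that never says "no" is not a routing layer.
    return {"blocked": True}
```

The point of the fallback is the trade-off itself: every new combination forces someone to make a policy decision instead of inheriting a default.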
Define allowed paths, not preferred tools
Once tasks and constraints are clear, define paths, not tools.
A path is a combination of:
- Model tier (small local, mid-tier hosted, frontier)
- Execution environment (on-device, private cloud, public API)
- Retrieval approach (none, constrained RAG, open search)
- Oversight (none, sampling, mandatory human review)
- Block conditions (when the task should not proceed)
Example:
“Clause extraction, low sensitivity, internal use” → Small model, local or low-cost hosted, no retrieval, no review
“Drafting client-facing advice, high sensitivity, high reliance” → Restricted model set, private environment, structured inputs, mandatory human sign-off
This framing removes ambiguity. It also makes it auditable.
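The two example paths can be written down as explicit, comparable records keyed by a routing matrix. A sketch with hypothetical field names and keys; the retrieval setting on the second path is an assumption for illustration, not stated in the examples:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    model_tier: str    # "small-local" | "mid-hosted" | "frontier"
    environment: str   # "on-device" | "private-cloud" | "public-api"
    retrieval: str     # "none" | "constrained-rag" | "open-search"
    oversight: str     # "none" | "sampling" | "mandatory-review"
    blocked: bool = False  # block condition: the task should not proceed

# Routing matrix keyed on (task, sensitivity, destination) -- illustrative keys.
MATRIX = {
    ("clause-extraction", "low", "internal"):
        Path("small-local", "on-device", "none", "none"),
    ("client-drafting", "high", "external"):
        Path("frontier", "private-cloud", "constrained-rag", "mandatory-review"),
}

BLOCKED = Path("", "", "", "", blocked=True)

def route(task: str, sensitivity: str, destination: str) -> Path:
    # Anything outside the matrix does not proceed until a path is defined.
    return MATRIX.get((task, sensitivity, destination), BLOCKED)
```

Because every route is a lookup against a named entry, "why did this task go through that path?" has a checkable answer.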
Decision rights are where most systems fail
Even well-designed routing matrices fail because no one defines who can change them.
You need explicit answers to:
- Who owns the routing policy?
- Who can approve a new path?
- Who can grant an exception?
- Who signs off on high-risk categories?
Without this, exceptions become the default.
In practice:
- Policy ownership should sit across legal, risk and engineering
- Exceptions should be time-bound and named
- High-risk changes should require dual approval
If a partner can override routing without traceability, your routing layer is cosmetic.
Versioning is not optional
Routing decisions change over time. Models improve, costs shift, regulations tighten.
If you don’t version your routing policy:
- You can’t explain past decisions
- You can’t demonstrate improvement
- You can’t isolate where something went wrong
Each version should capture:
- Task definitions
- Policy dimensions
- Allowed paths
- Decision rights
And critically:
- What changed, and why
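A version record covering those items can be as simple as one structure per release, with the changelog carried alongside the policy content. A sketch; the field names, version numbers and changelog text are all illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class PolicyVersion:
    version: str
    changelog: str  # what changed, and why
    task_definitions: dict = field(default_factory=dict)
    policy_dimensions: dict = field(default_factory=dict)
    allowed_paths: dict = field(default_factory=dict)
    decision_rights: dict = field(default_factory=dict)

history = [
    PolicyVersion("1.0", "Initial routing policy."),
    PolicyVersion("1.1", "Added mandatory review to client-facing drafting "
                         "after a model pricing change."),
]

def explain(version: str) -> str:
    # "Why did v1.1 differ from v1.0?" should be answerable from the log alone.
    for v in history:
        if v.version == version:
            return v.changelog
    raise KeyError(version)
```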
Minimum artefacts
If you don’t have these, you don’t have a routing layer:
- Routing matrix: Task × sensitivity × destination × reliance → allowed paths
- Exception register: Owner, scope, justification, expiry
- Policy change log: Versioned updates with rationale
- Execution logs tied to decisions: Not just prompts and outputs, but which route was taken and why
30-day rollout that actually works
Week 1: Map the top five task types that generate real volume or risk. Ignore edge cases.
Week 2: Draft a routing matrix with legal, risk and engineering in the same room. Force decisions.
Week 3: Run scenarios. Break it deliberately. Find where the policy is vague or contradictory.
Week 4: Publish v1 with named owners, explicit constraints and an exception process.
Do not aim for completeness. Aim for clarity on the highest-impact paths.
Where most teams go wrong
- They optimise for flexibility instead of control
- They treat routing as a UX feature rather than a governance layer
- They log outputs but not decisions
- They assume cheaper models will stay cheap
- They avoid defining “blocked” states
A routing layer that never says “no” is not a routing layer.
Checklist
- Task taxonomy reflects real legal work, not tool categories
- High-risk combinations are explicitly constrained or blocked
- Private inference conditions are defined, not implied
- Decision rights are named and enforced
- Exceptions are time-bound and auditable
- Routing decisions are logged alongside outputs
Related
Use the Routing Simulator to pressure-test policy choices before they are adopted. Treat it as a safe environment to explore failure modes, not just validate happy paths.