The risk in legal AI often appears after the model has finished. A draft summary sitting inside a workspace is not the same thing as a clause amendment sent to a client, a research note relied on in advice, or a workflow update that changes the live position of a matter.
Most governance discussions still focus heavily on the moment of generation: what prompt was used, which model responded, whether the answer looked plausible and whether the system included suitable warnings. Those things matter, but they do not tell the full story. The more important question is what happens next.
The output is not the risk event
An AI output can exist without being relied on. It may be a rough first pass, a private note, a draft suggestion or a discarded alternative that never leaves the workspace.
The output becomes legally and professionally significant when someone uses it. That may mean sending it to a client, inserting it into a contract, relying on it in advice, allowing it to update a workflow, triggering a notification, escalating a risk rating or moving a matter forward.
The risk is not only that the model generated something wrong. It is that weak, unchecked or unsupported output becomes part of live legal work without the evidence, review or authority needed to justify that use.
That is the practical problem last mile liability is intended to capture.
Risk starts when output becomes action
Last mile liability is the risk that arises when AI output is relied on, sent, inserted into a live document or allowed to trigger action without the right controls around it.
This matters because legal work is full of destinations, each carrying a different level of reliance and consequence. An internal note, a client email, a draft agreement, a board paper, a court filing, a regulatory response, a matter update and a task completion do not create the same professional exposure, even where the underlying text looks similar.
A system that treats those destinations as equivalent is not governing legal work. It is only governing text generation.
Agentic AI makes this harder to ignore. If a system can take the next step, not just draft the next answer, then release controls need to govern actions, tool calls and state changes as well as prose.
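One way to picture a release control that governs actions as well as prose is a wrapper that refuses to run a state-changing step without a named approver. The sketch below is illustrative only: `ProposedAction`, `ReleaseBlocked`, `execute` and the `changes_state` flag are hypothetical names, not part of any particular product or framework.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ReleaseBlocked(Exception):
    """Raised when an action is attempted without the approval it needs."""


@dataclass
class ProposedAction:
    # Hypothetical fields; a real system would also carry matter and user context.
    name: str
    changes_state: bool      # True for tool calls that send, file or update something
    run: Callable[[], str]   # the side effect itself, deferred until release


def execute(action: ProposedAction, approved_by: Optional[str] = None) -> str:
    """Run a proposed action only if its release condition is met."""
    if action.changes_state and not approved_by:
        raise ReleaseBlocked(f"{action.name} changes live state and needs approval")
    return action.run()


draft = ProposedAction("draft_summary", changes_state=False, run=lambda: "drafted")
update = ProposedAction("update_matter_status", changes_state=True, run=lambda: "updated")

print(execute(draft))                             # drafting runs without a gate
print(execute(update, approved_by="reviewer-1"))  # the state change runs once approved
```

Calling `execute(update)` with no approver raises `ReleaseBlocked`, so the tool call is held rather than logged after the fact.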
The destination changes the duty
A summary used by a lawyer to understand a file may need traceability back to the source material, while a client-facing advice note may also need review by the right person, evidence of approval and a clear record of what was sent.
A proposed clause amendment may need a different gate again, because it may have to be checked against negotiation position, client instructions, risk tolerance, precedent policy and the surrounding drafting.
The same output can carry different risk depending on where it goes and how it is used, which is why governance should not stop at model selection or prompt control. It has to account for the destination.
Prompt logs are not enough
Prompt logs can show what was asked and what came back, but they do not always show whether the output was used appropriately or whether the final action was justified.
They may not answer the questions that matter most:
- Was the output used internally or externally?
- Was it relied on as advice or treated as a draft?
- Was the reviewer suitably qualified?
- Was the relevant source material checked?
- Did the output reflect the client’s latest instructions?
- Was a policy exception approved?
- Was the final version materially different from the AI-generated version?
A legal AI audit trail needs to be connected to action. Otherwise, it becomes a record of generation rather than a record of accountability.
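As a rough illustration of the difference, the sketch below pairs a generation log with a separate record of use, then answers two of the questions above from that pairing. The class names, the field names and the 0.9 similarity threshold are all assumptions made for the example. A text-similarity ratio is only a crude proxy for a "material" change, which in practice is a legal judgement.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class GenerationLog:
    # What a prompt log typically holds.
    generation_id: str
    prompt: str
    response: str


@dataclass
class UseEvent:
    # Hypothetical record of what was actually done with the output.
    generation_id: str
    external: bool
    treated_as_advice: bool
    reviewer: Optional[str]
    final_text: str


def materially_changed(log: GenerationLog, event: UseEvent, threshold: float = 0.9) -> bool:
    """Flag a use whose final text diverges from the generated text."""
    similarity = SequenceMatcher(None, log.response, event.final_text).ratio()
    return similarity < threshold


def accountability_gaps(event: UseEvent) -> List[str]:
    """List the gaps that the generation log alone could not reveal."""
    gaps = []
    if event.external and event.reviewer is None:
        gaps.append("sent externally without a recorded reviewer")
    if event.treated_as_advice and event.reviewer is None:
        gaps.append("relied on as advice without a recorded reviewer")
    return gaps
```

Neither function can be answered from `GenerationLog` on its own, which is the point: the record of use has to exist and be linked by `generation_id`.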
Execution gates belong at the point of use
Execution gates are the practical response to last mile liability because they create a required pause before output reaches a destination. That pause may require review, approval, evidence, escalation or a block, depending on the consequence of use.
The point is not to slow every workflow down. The point is to match control to consequence. A low-risk internal note may need no gate at all, while a client-facing email may require lawyer review, a draft clause amendment may need approval against negotiation policy, and a regulatory response may require senior sign-off with a fuller evidence record.
The gate should be triggered by intended use, not only by the prompt that created the text.
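A minimal sketch of that idea follows, using the four examples above as the mapping. The enum members and the choice to block any unmapped destination are assumptions for illustration; a firm's own policy would define the real table.

```python
from enum import Enum


class Destination(Enum):
    INTERNAL_NOTE = "internal_note"
    CLIENT_EMAIL = "client_email"
    CLAUSE_AMENDMENT = "clause_amendment"
    REGULATORY_RESPONSE = "regulatory_response"
    COURT_FILING = "court_filing"  # deliberately left unmapped below


class Gate(Enum):
    NONE = "none"
    LAWYER_REVIEW = "lawyer_review"
    POLICY_APPROVAL = "policy_approval"
    SENIOR_SIGN_OFF = "senior_sign_off"
    BLOCK = "block"


# Hypothetical policy table mirroring the examples in the text.
GATE_BY_DESTINATION = {
    Destination.INTERNAL_NOTE: Gate.NONE,
    Destination.CLIENT_EMAIL: Gate.LAWYER_REVIEW,
    Destination.CLAUSE_AMENDMENT: Gate.POLICY_APPROVAL,
    Destination.REGULATORY_RESPONSE: Gate.SENIOR_SIGN_OFF,
}


def required_gate(destination: Destination) -> Gate:
    """Choose the gate from the intended use, not from the prompt.

    Destinations with no policy entry are blocked (an assumed default).
    """
    return GATE_BY_DESTINATION.get(destination, Gate.BLOCK)


print(required_gate(Destination.CLIENT_EMAIL).value)  # lawyer_review
print(required_gate(Destination.COURT_FILING).value)  # block
```

Note that the function takes no prompt or model argument at all. The same text routed to two destinations gets two different gates.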
What the system should record
A useful system should record more than the model response. It should capture the route from output to use, including:
- what the task was
- what context was used
- which model, tool or human route handled it
- what output was produced
- where the output was intended to go
- what reliance level applied
- which gate was required
- who reviewed or approved it
- what evidence supported release
- whether the output was changed before use
- whether the matter state changed as a result
This does not need to become a heavy compliance exercise for every small action. The point is proportionality: higher reliance and higher consequence should create a stronger record.
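The list above could be held as a single record, with the required fields scaling by reliance level. The sketch below is one possible shape, not a prescribed schema: the field names, the three reliance levels and the `REQUIRED_BY_RELIANCE` table are assumptions chosen to illustrate proportionality.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReleaseRecord:
    task: str
    context_refs: List[str]
    route: str                       # model, tool or human route
    output_text: str
    destination: str
    reliance_level: str              # "low", "medium" or "high" (assumed scale)
    gate_required: str
    reviewer: Optional[str] = None
    evidence_refs: List[str] = field(default_factory=list)
    changed_before_use: Optional[bool] = None
    matter_state_changed: Optional[bool] = None


# Hypothetical proportionality rule: higher reliance demands a fuller record.
REQUIRED_BY_RELIANCE = {
    "low": (),
    "medium": ("reviewer",),
    "high": ("reviewer", "evidence_refs", "changed_before_use", "matter_state_changed"),
}


def missing_fields(record: ReleaseRecord) -> List[str]:
    """Return the fields still needed before this record supports release."""
    required = REQUIRED_BY_RELIANCE[record.reliance_level]
    # None, an empty list or an empty string count as missing; False is a valid answer.
    return [name for name in required if getattr(record, name) in (None, [], "")]


record = ReleaseRecord(
    task="respond to regulator query",
    context_refs=["doc-17"],
    route="model",
    output_text="Draft response text",
    destination="regulatory_response",
    reliance_level="high",
    gate_required="senior_sign_off",
    reviewer="A. Partner",
)
print(missing_fields(record))
# ['evidence_refs', 'changed_before_use', 'matter_state_changed']
```

A low-reliance internal note passes with no extra fields, so the record stays light where the consequence is light.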
The practical test
The practical test is not simply whether the AI produced a good answer. The better test is whether the firm could explain why that output was allowed to be used in that way, for that matter, by that person, at that point in the workflow.
That question forces the system to connect generation, review, destination and accountability. It also exposes where many current tools are weak, because they can produce, summarise and draft, but often do not know enough about the matter, the user, the destination or the required authority to decide whether the output should move forward.
That is the gap last mile liability points to. Legal AI governance should not be designed only around preventing bad outputs. It should be designed around controlling the moment those outputs become action.