Most legal AI still assumes a simple shape: document in, prompt in, answer out.
It works, up to a point. You get something coherent, often convincing, sometimes genuinely useful. That surface quality has been enough to drive adoption.
But it also hides where legal work actually succeeds or fails.
The real work is not answering a question about a document. It is deciding what needs to happen next, who is responsible, what can be relied on, and what must be checked before anything moves forward.
That is a planning problem.
The illusion of completeness
A well-formed answer creates a sense that the task is done.
In practice, it rarely is.
A clause summary does not tell you:
- whether the clause matters in the current phase of the deal
- whether it conflicts with another obligation
- whether it has already been addressed elsewhere
- whether it needs escalation before being relied on
The answer feels complete because it is self-contained. Legal work is not.
Most problems in legal matters are not caused by misunderstanding a clause. They come from missed steps, unclear ownership, or decisions made without full context.
Optimising for better answers does not fix that.
Where things actually break
Take a typical workflow:
A document is reviewed. Risks are identified. A summary is produced. Someone reads it and moves on.
What is missing is not more detail in the summary.
It is everything around it:
- Who owns the risk that was flagged?
- Has it been resolved, accepted, or deferred?
- Does it affect other parts of the matter?
- Should it block the next step?
- Has anything changed since the summary was generated?
None of that sits inside the answer.
It sits in the coordination of the work.
This is where most systems are silent.
Documents are inputs, not the system
Legal tech has historically treated documents as the centre of gravity.
That made sense when documents were the primary artefact being worked on.
Once AI is introduced, that assumption becomes limiting.
Documents become inputs into a broader system:
- tasks are created from them
- obligations are tracked beyond them
- decisions are made in response to them
- state evolves independently of them
If the system only understands documents, it cannot understand the matter.
That gap is where risk accumulates.
The missing layer
What is missing is a system of record for the work itself.
Not just:
- what documents exist
- what answers have been generated
But:
- what stage the matter is in
- what decisions are pending
- what has been agreed
- what is still at risk
- what is allowed to happen next
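To make that concrete, here is a minimal sketch of what such a record might hold. It is illustrative only; every name in it (MatterState, Decision, Risk, canAdvance) is an assumption made for the example, not a reference to any existing product or schema.

```typescript
// A minimal sketch of a matter-level system of record.
// Every name and field is an illustrative assumption, not a
// reference to any existing product or schema.

type Stage = "intake" | "diligence" | "negotiation" | "signing" | "closed";

interface Decision {
  id: string;
  question: string;          // what is being decided
  owner: string;             // who is responsible for deciding
  status: "pending" | "approved" | "rejected" | "deferred";
}

interface Risk {
  id: string;
  description: string;
  owner: string;
  status: "open" | "accepted" | "resolved" | "deferred";
  blocksNextStage: boolean;  // should this gate progress?
}

interface MatterState {
  stage: Stage;
  documents: string[];       // inputs, not the system itself
  decisions: Decision[];
  agreedPoints: string[];    // what has been agreed
  risks: Risk[];
}

// "What is allowed to happen next" becomes a question the state
// can answer, rather than something held in someone's head.
function canAdvance(matter: MatterState): boolean {
  const openBlockers = matter.risks.filter(
    (r) => r.status === "open" && r.blocksNextStage
  );
  const pending = matter.decisions.filter((d) => d.status === "pending");
  return openBlockers.length === 0 && pending.length === 0;
}
```

The specific fields matter less than the shape: state that outlives any single document or answer.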
Without that, every interaction with AI starts from a partial view.
The model may be accurate within that slice, but the slice itself is incomplete.
Orchestration as a first-class concern
Orchestration is not a technical detail. It is the structure of the work.
It answers questions like:
- what needs to happen before this output can be used?
- who is allowed to approve it?
- what context must be present for it to be valid?
- what changes once it is accepted?
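As a sketch, and assuming the MatterState record above, those questions can become explicit checks rather than conventions. Everything here (AiOutput, APPROVERS, the role strings) is hypothetical, not a real API.

```typescript
// Illustrative only. Assumes the MatterState sketch above.
// AiOutput, APPROVERS, and the role strings are hypothetical.

interface AiOutput {
  id: string;
  requiredContext: string[]; // ids of documents it was generated from
  approvedBy?: string;
}

const APPROVERS = new Set(["partner", "supervising-associate"]);

// Returns the reasons an output cannot yet be used.
// An empty list means it is usable.
function gateOutput(output: AiOutput, matter: MatterState, role: string): string[] {
  const problems: string[] = [];

  // What context must be present for it to be valid?
  for (const ctx of output.requiredContext) {
    if (!matter.documents.includes(ctx)) {
      problems.push(`missing context: ${ctx}`);
    }
  }

  // Who is allowed to approve it?
  if (!APPROVERS.has(role)) {
    problems.push(`role "${role}" cannot approve this output`);
  }

  // What needs to happen before this output can be used?
  if (!canAdvance(matter)) {
    problems.push("matter has open blocking risks or pending decisions");
  }

  return problems;
}

// What changes once it is accepted? Acceptance is a recorded
// state transition, not an informal sign-off.
function acceptOutput(output: AiOutput, role: string): void {
  output.approvedBy = role;
}
```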
This sits alongside model choice, not beneath it.
A more capable model does not solve a coordination problem. It often makes it harder to spot, because the outputs look better.
Why this matters now
As AI becomes embedded in workflows, the failure modes change.
You move from incorrect answers to:
- correct answers used in the wrong way
- incomplete outputs relied on too early
- decisions made without visibility of the full state
These are harder to detect and harder to unwind.
They are also where liability sits.
Firms that continue to optimise for answer quality alone will see diminishing returns.
The gains are real, but they plateau.
The risks continue to compound.
A different way to think about it
If you model legal work as a planning problem, a different set of priorities emerges.
You start to focus on:
- how tasks are defined and sequenced
- how state is captured and updated
- how decisions are gated and recorded
- how context is carried across steps
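A hedged sketch of what the first and last of those might look like, again with assumed names (Task, runnable, propagateContext):

```typescript
// A sketch of task sequencing and context carry-over, with
// assumed names; a real system would persist all of this.

interface Task {
  id: string;
  dependsOn: string[];             // tasks that must complete first
  status: "todo" | "done";
  context: Record<string, string>; // carried forward to dependents
}

// How tasks are sequenced: only tasks whose dependencies are
// done are allowed to run.
function runnable(tasks: Task[]): Task[] {
  const done = new Set(
    tasks.filter((t) => t.status === "done").map((t) => t.id)
  );
  return tasks.filter(
    (t) => t.status === "todo" && t.dependsOn.every((d) => done.has(d))
  );
}

// How context is carried across steps: a completed task's context
// is merged into every task that depends on it.
function propagateContext(tasks: Task[], completed: Task): void {
  for (const t of tasks) {
    if (t.dependsOn.includes(completed.id)) {
      Object.assign(t.context, completed.context);
    }
  }
}
```

None of this is novel engineering. That is the point: it is the structure most systems skip.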
AI still plays a role, but as part of a system rather than the centre of it.
The question shifts from: "Can the model answer this?"
to: "Should this answer be used, by whom, and what happens next?"
That is a harder question.
It is also the one that determines whether the system can be trusted.
Where this leads
Once planning becomes the focus, several things follow naturally:
- matter state becomes a first-class concept, not an afterthought
- evaluation moves from prompt testing to scenario testing
- governance shifts from policy documents to enforced controls
- routing decisions consider context, not just cost or capability
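Routing is the easiest of these to sketch. Building on the MatterState example earlier, and with routes and thresholds that are pure assumptions:

```typescript
// Context-aware routing, building on the MatterState sketch
// earlier. The routes and thresholds are pure assumptions.

type Route = "auto" | "model-with-review" | "human-only";

function routeRequest(matter: MatterState, risk?: Risk): Route {
  // High-stakes state forces a human, however capable the model.
  if (matter.stage === "signing" || (risk && risk.blocksNextStage)) {
    return "human-only";
  }
  // Open blockers or pending decisions mean review before use.
  if (!canAdvance(matter)) {
    return "model-with-review";
  }
  return "auto";
}
```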
At that point, the system starts to resemble other mature engineering disciplines.
Not because legal work becomes software.
But because coordination, sequencing, and control start to matter more than individual outputs.
Closing thought
Better answers will continue to improve legal workflows.
They are no longer the limiting factor.
The constraint is how those answers are integrated into the work.
Until that is addressed, most systems will remain impressive in isolation and fragile in practice.