You Don’t Have a Prompt Problem—You Have a Memory Problem

Marcia Coulter

4/29/2026 · 2 min read


Most teams working with AI assume they have a prompt problem.

If the output is off:

  • tweak the prompt

  • add constraints

  • add examples

Repeat until it “works.”

And sometimes it does.

But then you start seeing things like:

  • The same prompt behaves differently across sessions

  • A fix works, then quietly stops working

  • Outputs drift as you expand the system

  • Teammates can’t reproduce what you’re seeing

At that point, it’s worth asking:

Is this really a prompt problem?

Where It Starts to Break

Take something simple: a support ticket classifier.

You define categories. You write a prompt. You refine it as edge cases come in.

It feels like normal engineering work.
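
For concreteness, here is roughly what that first version might look like. Everything below is a sketch: the category names, the classify_ticket helper, and call_llm (a stand-in for whatever model client you actually use) are all hypothetical.

```python
# Hypothetical starting point for a prompt-based ticket classifier.
CATEGORIES = ["billing", "bug", "account", "feature_request"]

PROMPT_TEMPLATE = """You are a support ticket classifier.
Assign the ticket to exactly one category: {categories}.

Ticket:
{ticket}

Category:"""

def call_llm(prompt: str) -> str:
    """Stand-in for whatever model client you actually use."""
    raise NotImplementedError

def classify_ticket(ticket: str) -> str:
    prompt = PROMPT_TEMPLATE.format(
        categories=", ".join(CATEGORIES),
        ticket=ticket,
    )
    return call_llm(prompt).strip()
```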

Until you hit conflicts.

A ticket mentions two categories.
You add a rule to prioritize one.

That works—until a slightly different phrasing shows up.
Then it flips again.

So you add another rule.

Now you have rules layered on top of rules…
but no way to tell:

  • which rule actually applied

  • whether the model is following them consistently

  • or why the same input produces different results later
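
A few incidents later, the prompt tends to look like this. The rules here are invented for illustration, but the shape will be familiar: each one was added to fix a single case, and nothing records which one fires at runtime.

```python
# Illustrative only: a prompt after several rounds of patching.
# Each rule exists because of one past incident.
PROMPT_TEMPLATE = """You are a support ticket classifier.
Assign the ticket to exactly one category: {categories}.

Rules:
1. If a ticket mentions billing AND a bug, prefer billing.
2. Unless the bug blocks payment entirely; then prefer bug.
3. "Charged twice" is always billing, even if phrased as a bug.
4. Rule 3 does not apply to refund requests (see rule 1).

Ticket:
{ticket}

Category:"""
```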

The Debugging Problem

At some point, you try to debug it properly.

You log inputs and outputs.
Maybe you store a few examples.

But that only gets you so far.
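
Concretely, a log record like this captures the transaction but not the decision. The record shape is made up, but it is representative:

```python
import json
import time

def log_call(ticket: str, category: str) -> None:
    """Hypothetical logging helper: input and output, nothing else."""
    record = {
        "ts": time.time(),
        "input": ticket,     # what went in
        "output": category,  # what came out
        # Not captured: which rule fired, what the model assumed,
        # or how a two-category conflict was resolved.
    }
    print(json.dumps(record))
```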

Because what you don’t have is:

  • the assumptions the model used

  • the constraints it actually applied

  • the path it took to reach a decision

So when something breaks, you’re stuck with:

“It worked before. Now it doesn’t.”

And no clear reason why.

Prompts Don’t Give You State

In traditional systems, you don’t rely on instructions alone.

You rely on state:

  • decisions persist

  • behavior is traceable

  • results are reproducible

With most AI systems, none of that is guaranteed.

You have:

  • prompts

  • responses

But not the reasoning in between.

And that’s the problem.

What This Leads To

Without persistent reasoning:

  • Fixes don’t reliably stick

  • Edge cases reappear

  • Behavior changes across sessions

  • Debugging becomes guesswork

You end up re-solving the same problems over and over.

The Shift

At some point, the question changes.

Instead of:

“What’s the right prompt?”

It becomes:

“What would it take for this system to remember how it got here?”

Not the output.

The reasoning.

What’s Missing

If you wanted to stabilize this, you’d need to capture things like:

  • the rules you introduced

  • the assumptions behind them

  • how conflicts were resolved

  • what changed over time

And you’d need to be able to:

  • revisit that

  • inspect it

  • reuse it

  • extend it

Across sessions.

Across people.
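
One way to picture it is as a record that persists alongside each output. The schema below is purely a sketch with hypothetical names; what matters is what gets captured, not this exact shape.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical shape for a persisted reasoning record.
@dataclass
class DecisionRecord:
    ticket_id: str
    category: str
    rules_considered: list[str]      # the rules you introduced
    rule_applied: str                # which one actually decided it
    assumptions: list[str]           # what was taken as given
    conflict_resolution: str | None  # how a tie was broken, if any
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```

With records like this stored per decision, "it worked before, now it doesn't" becomes a diff between two records instead of a guess.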

Where This Points

That’s not a prompt tweak.

That’s a missing layer.

A layer where reasoning doesn’t disappear after each interaction—
where it can be carried forward, inspected, and reused.

In other words:

a Durable Reasoning Layer™.