Inner Sage
Case Study


Designing a therapeutic AI from the inside out

Healer Jenn Morse has spent decades developing a four-stage therapeutic framework centered on emotional regulation, belief reframing, and somatic awareness. Inner Sage was the attempt to bring that methodology to scale — an AI product for high-functioning individuals who want the depth of a therapeutic process without the barrier of 1:1 sessions.

The founding team had the methodology, the technical infrastructure, and the vision. What they didn't have was a way to turn a nuanced human framework into something an LLM could faithfully inhabit without flattening it.

Role: Conversation Designer & AI Product Architect
Scope: Architecture, persona, state machine, eval strategy
Timeline: Ongoing (alpha May, beta June 2026)
Model: GPT o4-mini with structured prompt layering

A framework that had never been made machine-legible

Jenn Morse's four-stage framework, built on emotional regulation, belief reframing, and somatic awareness, works because it's sequenced and because it demands real presence. You can't skip stages. You can't rush the body. Inner Sage was the bet that this could scale: an AI product for high-functioning people who want something with genuine depth, not a mood journal with a chatbot bolted on.

The team came in with the methodology, the infrastructure, and a clear vision. What nobody had figured out yet was the translation layer — how to take something that lives in the relational, embodied space of human therapeutic practice and make it operational inside a language model without losing what made it work.

"The risk wasn't that Madeline would say something wrong. It was that she'd say something warm, well-phrased, and completely unrecognizable to Jenn's methodology. Good-sounding and therapeutically meaningless aren't the same failure. Only one shows up in testing."

My job was to sit between Jenn's framework and the model and translate it — not summarize it, not flatten it into bullet points the engineers could hand off. Actually translate it, in the linguistic sense: find the equivalent structure, not the nearest approximation.

Everyone was thinking at the prompt level. That was the problem.

Early conversations kept circling the same questions: what does Madeline ask, what does she say when someone discloses something hard, how does she handle silence. Legitimate questions, all of them — but downstream of something nobody had named yet. What does this system actually know about where a user is? And what is it allowed to do from there?

Problem

No state awareness

Without state awareness, the AI had no way to tell whether to explore, consolidate, or transition. So it guessed — and the guesses were different every session.

Problem

Methodology as vibes

The framework existed as documents — detailed, clinically precise, completely un-operationalizable. There was no mechanism to say: in this phase, these moves are permitted. In this one, they're not.

Solution

12-phase state machine

Twelve named phases, sequenced, each with defined entry conditions and constrained exits. Madeline always knows where she is in the arc — and what she's allowed to do from there.

Solution

Three-layer document library

Hard-coded instructions, vectorized methodology, and team strategy each live in separate layers. Jenn can revise the clinical content without touching behavioral rules. Engineering can update the logic without touching the content. Neither has to ask the other's permission.
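
The layer separation can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the team's implementation: the `DocumentLibrary` class, its field names, the `[RULES]`/`[METHODOLOGY]` prompt markers, and the keyword-overlap `retrieve` method (a stand-in for real vector search) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentLibrary:
    """Hypothetical three-layer library; all names here are illustrative."""
    hard_coded: list[str] = field(default_factory=list)   # behavioral rules
    methodology: list[str] = field(default_factory=list)  # clinical content
    strategy: list[str] = field(default_factory=list)     # team strategy notes

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Stand-in for vector search: rank passages by shared lowercase words.
        words = set(query.lower().split())
        ranked = sorted(
            self.methodology,
            key=lambda passage: len(words & set(passage.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def build_prompt(self, user_message: str) -> str:
        # Rules are always included verbatim; methodology is retrieved per turn.
        # Assumption: the strategy layer is for the team and is not sent to the model.
        rules = "\n".join(self.hard_coded)
        context = "\n".join(self.retrieve(user_message))
        return (
            f"[RULES]\n{rules}\n\n"
            f"[METHODOLOGY]\n{context}\n\n"
            f"[USER]\n{user_message}"
        )
```

Because the rules and the methodology live in different fields, editing one list leaves the other section of the assembled prompt byte-for-byte unchanged, which is the property the separation is meant to buy.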

The 12-phase Master Conversation Arc

The state machine is what gives Madeline her spine. Every phase has a name, a therapeutic purpose, defined conditions for entering it, and constrained paths out. An AI without this just drifts — producing responses that feel coherent in isolation and make no sense as a sequence. With it, Madeline always knows where she is. She knows what she's allowed to do next. She can't improvise her way past a gate she hasn't cleared.

State Machine: Master Conversation Arc

Inner Sage 12-Phase State Machine: the 12 phases of Madeline's therapeutic conversation arc, organized into four named clusters.

Clusters:
Sanctuary: regulation
Anchor: inquiry
Alchemist: transformation
Celebration: integration

Phases:
P01 Welcome: Orient & attune the user
P02 Presence: Establish felt safety
P03 Somatic Check-In (Regulation Gate): Assess nervous system state. Consent gate: C_state
P04 Grounding: Body-based stabilization
P05 Inquiry: Open exploration of what's arising
P06 Pattern Recognition (Clarity Gate): Surface underlying belief. Consent gate: C_belief
P07 Reframe: Challenge limiting belief
P08 Integration: Embody the new perspective
P09 Action Readiness (Readiness Gate): Is change felt, not just known? Consent gate: C_readiness
P10 Commitment: Anchor new behavior
P11 Acknowledgment: Name and witness growth
P12 Bridge & Return (Session Close): Set intent; close the loop; next session seeding

Other diagram labels: re-reg; AM / PM logic; Onboarding (deterministic, no AI); next session begins at P01 with session memory.
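
The phases and gates named above can be sketched as a small state machine. The phase names and gate identifiers come from the diagram; the `Session` class, the `advance` method, and the choice to simply stay in place at an uncleared gate are assumptions made for illustration, not the production logic.

```python
from dataclasses import dataclass, field

# Phase ids and names as listed in the diagram.
PHASES = {
    1: "Welcome", 2: "Presence", 3: "Somatic Check-In", 4: "Grounding",
    5: "Inquiry", 6: "Pattern Recognition", 7: "Reframe", 8: "Integration",
    9: "Action Readiness", 10: "Commitment", 11: "Acknowledgment",
    12: "Bridge & Return",
}

# Gate phases and the binary consent flag each requires before moving on.
GATES = {3: "C_state", 6: "C_belief", 9: "C_readiness"}


@dataclass
class Session:
    """Hypothetical session holder: current phase plus recorded consents."""
    phase: int = 1
    consents: dict[str, bool] = field(default_factory=dict)

    def advance(self) -> int:
        """Move to the next phase unless the current phase is an uncleared gate."""
        gate = GATES.get(self.phase)
        if gate is not None and not self.consents.get(gate, False):
            return self.phase  # assumed behavior: hold at the gate
        if self.phase < 12:
            self.phase += 1
        return self.phase
```

In this sketch a gate is a plain boolean lookup, so a blocked transition can be explained in one sentence: the named consent flag was not set.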

The calls that actually shaped the product

The choices that shaped this product were mostly about permission — what Madeline's allowed to do, under what conditions, and what the system does when those conditions aren't met. A few of those calls defined everything that came after.

Decision: Confidence scoring model
Options considered: Continuous multi-dimensional score (regulation × clarity × readiness)
What we chose: Three binary consent gates (C_state, C_belief, C_readiness)
Why: A continuous score sounds more rigorous. It's also nearly impossible to test, explain to a clinician, or debug when it misbehaves. Binary gates are blunt by design, and they work. False precision at MVP stage costs more than it buys.

Decision: Methodology encoding
Options considered: Full framework in the system prompt
What we chose: Vectorized content separated from hard-coded instructions
Why: When instruction and methodology live in the same document, every content update is a potential behavior change, and you won't always know which one you triggered. Separation means Jenn can evolve the clinical framework without touching the behavioral layer, and vice versa.

Decision: Onboarding flow
Options considered: AI-guided onboarding from session one
What we chose: Deterministic non-AI onboarding
Why: Onboarding is where consent gets captured, safety signals get recorded, and baseline preferences get set. None of that should be improvised. A language model that hallucinates through onboarding isn't creating a UX problem; it's creating a clinical one.

Decision: Re-regulation handling
Options considered: Complete session restart on dysregulation
What we chose: Return arc to P01-P03 with session context retained
Why: Throwing away a user's context the moment they get activated isn't protecting them; it's abandoning them at the worst possible moment. The model loops back, but it doesn't forget. The thread stays intact.

Decision: Check-in timing
Options considered: Single daily check-in cadence
What we chose: AM/PM differentiated logic
Why: A nervous system at 7am isn't doing what it's doing at 9pm. Morning sessions call for intention-setting; evening ones for processing. Madeline needed different behavioral defaults for each, not just different language.
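
Two of these decisions, re-regulation with retained context and AM/PM differentiated logic, are simple enough to sketch in code. Everything named here is hypothetical: `SessionState`, `re_regulate`, `check_in_mode`, and especially the noon cutoff, which is an assumed placeholder and not a value from the project.

```python
from dataclasses import dataclass, field
from datetime import time


@dataclass
class SessionState:
    """Hypothetical session state: current phase plus what the user has shared."""
    phase: int = 5
    context: list[str] = field(default_factory=list)


def re_regulate(state: SessionState, target_phase: int = 1) -> SessionState:
    """Loop back to an early phase (P01-P03) while keeping the session context."""
    if not 1 <= target_phase <= 3:
        raise ValueError("re-regulation returns to P01-P03")
    state.phase = target_phase
    return state  # context is deliberately left untouched


def check_in_mode(now: time, cutoff: time = time(12, 0)) -> str:
    """Pick behavioral defaults by time of day; the noon cutoff is an assumption."""
    return "intention-setting" if now < cutoff else "processing"
```

The contrast with a full restart is that only `phase` changes; a restart would replace the whole state object and lose `context` with it.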

What the work actually produced

The deliverable wasn't a prompt library. It was a behavioral operating system — something the engineering team could build against with confidence, and the clinical team could actually read and trust.

12 named therapeutic phases with defined entry conditions, exit gates, and permitted conversational moves
3 document library layers (hard-coded, vectorized, and strategy), each independently maintainable
3 binary consent gates replacing a complex continuous scoring system, without losing clinical fidelity
1 AI persona, Madeline, with defined voice, values, edge case behavior, and explicit refusal logic
Re-regulation paths designed so dysregulation during a session never discards a user's context
0 AI involvement in onboarding, a deliberate choice to protect consent integrity and safety data reliability

What I'd do differently

Earlier evaluation scaffolding

I built the architecture before I built any way to test it. A rough LLM-as-judge eval framework from the start — even a scrappy one — would've grounded architectural decisions that stayed theoretical for longer than they needed to.
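
For a sense of how small a scrappy version could have been, here is a hedged sketch of an eval loop with a pluggable judge. The `run_eval` function, the case dictionary keys, and `keyword_judge` are all hypothetical; in a real LLM-as-judge setup the judge callable would wrap a model call, which is omitted here so the example runs on its own.

```python
from typing import Callable

# A judge takes (rubric, reply) and returns True if the reply passes.
# In practice this would wrap a model call; here it is a plain callable.
Judge = Callable[[str, str], bool]


def run_eval(cases: list[dict], judge: Judge) -> dict:
    """Score each case against its rubric and report the pass rate and failures."""
    failures = [c["id"] for c in cases if not judge(c["rubric"], c["reply"])]
    passed = len(cases) - len(failures)
    pass_rate = passed / len(cases) if cases else 0.0
    return {"pass_rate": pass_rate, "failures": failures}


def keyword_judge(rubric: str, reply: str) -> bool:
    """Toy stand-in for a model judge: fail replies that contain the banned phrase."""
    return rubric.lower() not in reply.lower()
```

Keeping the judge behind a plain callable means the same harness can start with a toy check like this one and swap in a model-backed judge later without changing the cases.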

The confidence scoring pivot

I spent weeks designing the continuous multi-dimensional scoring model before landing on binary gates. In retrospect, the simplification was always the answer. I should've started there and made the case for complexity only if the simple version failed.

Making the state machine visual earlier

A diagram like this one, produced in week two instead of month four, would've saved weeks of misaligned back-and-forth between clinical and technical stakeholders. People argue less about phase boundaries when they're both looking at the same picture.

Documentation as a product

The Master Document Library ended up as the shared contract between design, engineering, and clinical — the artifact everyone pointed at when they disagreed. It only got there because documentation was scoped as a deliverable from day one, not assembled after the fact from notes and Slack threads.
