Case Study

Inner Sage

Designing a therapeutic AI from the inside out

Healer Jenn Morse has spent decades developing a four-stage therapeutic framework centered on emotional regulation, belief reframing, and somatic awareness. Inner Sage was the attempt to bring that methodology to scale — an AI product for high-functioning individuals who want the depth of a therapeutic process without the barrier of 1:1 sessions.

The founding team had the methodology, the technical infrastructure, and the vision. What they didn't have was a way to turn a nuanced human framework into something an LLM could faithfully inhabit without flattening it.

Role

Conversation Designer & AI Product Architect

Scope

Architecture, persona, state machine, eval strategy

Timeline

Ongoing — alpha May, beta June 2026

Model

OpenAI o4-mini with structured prompt layering

01 — Context

A framework that had never been made machine-legible

Jenn Morse's four-stage framework, centered on emotional regulation, belief reframing, and somatic awareness, works because it's sequenced and because it demands real presence. You can't skip stages. You can't rush the body. Inner Sage was the bet that this could scale: an AI product for high-functioning people who want something with genuine depth, not a mood journal with a chatbot bolted on.

The team came in with the methodology, the infrastructure, and a clear vision. What nobody had figured out yet was the translation layer — how to take something that lives in the relational, embodied space of human therapeutic practice and make it operational inside a language model without losing what made it work.

"The risk wasn't that Madeline would say something wrong. It was that she'd say something warm, well-phrased, and completely unrecognizable to Jenn's methodology. Good-sounding and therapeutically meaningless aren't the same failure. Only one shows up in testing."

My job was to sit between Jenn's framework and the model and translate it — not summarize it, not flatten it into bullet points the engineers could hand off. Actually translate it, in the linguistic sense: find the equivalent structure, not the nearest approximation.

02 — Design Challenge

Everyone was thinking at the prompt level. That was the problem.

Early conversations kept circling the same questions: what does Madeline (the product's AI persona) ask; what does she say when someone discloses something hard; how does she handle silence. Legitimate questions, all of them, but downstream of something nobody had named yet. What does this system actually know about where a user is? And what is it allowed to do from there?

Problem

No state awareness

Without state awareness, the AI had no way to tell whether to explore, consolidate, or transition. So it guessed — and the guesses were different every session.

Problem

Methodology as vibes

The framework existed as documents — detailed, clinically precise, completely un-operationalizable. There was no mechanism to say: in this phase, these moves are permitted. In this one, they're not.

Solution

12-phase state machine

Twelve named phases, sequenced, each with defined entry conditions and constrained exits. Madeline always knows where she is in the arc — and what she's allowed to do from there.

Solution

Three-layer document library

Hard-coded instructions, vectorized methodology, and team strategy each live in separate layers. Jenn can revise the clinical content without touching behavioral rules. Engineering can update the logic without touching the content. Neither has to ask the other's permission.
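A minimal sketch of how that separation could look in code, assuming each layer is stored on its own and only combined when a prompt is assembled. The class name, the section labels, and the word-overlap `retrieve` function (a toy stand-in for real vector search) are all hypothetical, not the production implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentLibrary:
    """Three independently maintained layers (contents are placeholders)."""
    hard_coded: list[str] = field(default_factory=list)        # behavioral rules
    methodology: dict[str, str] = field(default_factory=dict)  # clinical passages
    strategy: list[str] = field(default_factory=list)          # team strategy notes


def retrieve(library: DocumentLibrary, query: str, k: int = 1) -> list[str]:
    """Toy stand-in for vector search: rank passages by words shared with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        library.methodology.items(),
        key=lambda item: len(query_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in ranked[:k]]


def build_prompt(library: DocumentLibrary, user_message: str) -> str:
    """Assemble the prompt; rules always come first and never depend on retrieval."""
    sections = [
        "## Rules\n" + "\n".join(library.hard_coded),
        "## Methodology\n" + "\n".join(retrieve(library, user_message)),
        "## Strategy\n" + "\n".join(library.strategy),
        "## User\n" + user_message,
    ]
    return "\n\n".join(sections)


library = DocumentLibrary(
    hard_coded=["Never skip a consent gate."],
    methodology={
        "breath": "Invite slow breath before any inquiry.",
        "belief": "Name the belief before reframing the belief.",
    },
    strategy=["Audience: high-functioning adults."],
)
prompt = build_prompt(library, "I keep repeating the same belief about myself")
```

Because the rules section is built only from `hard_coded`, editing a methodology passage changes what gets retrieved but cannot alter the behavioral layer.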

03 — Architecture

The 12-phase Master Conversation Arc

The state machine is what gives Madeline her spine. Every phase has a name, a therapeutic purpose, defined conditions for entering it, and constrained paths out. An AI without this just drifts — producing responses that feel coherent in isolation and make no sense as a sequence. With it, Madeline always knows where she is. She knows what she's allowed to do next. She can't improvise her way past a gate she hasn't cleared.
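The gating idea can be sketched as a small state machine. The four stage names and the three gate names come from this case study. The even three-phases-per-stage split, the `P01` to `P12` identifiers for every phase, and where each gate sits are assumptions made for illustration.

```python
from dataclasses import dataclass

STAGES = ["Sanctuary", "Anchor", "Alchemist", "Celebration"]
PHASES = [f"P{n:02d}" for n in range(1, 13)]  # P01 .. P12

# Hypothetical placement: the gate that must be cleared before entering a phase.
ENTRY_GATES = {"P04": "C_state", "P07": "C_belief", "P10": "C_readiness"}


def stage_of(phase: str) -> str:
    """Map a phase to its stage, assuming three phases per stage."""
    return STAGES[PHASES.index(phase) // 3]


@dataclass
class Arc:
    phase: str = "P01"

    def allowed_next(self, cleared_gates: set[str]) -> list[str]:
        """Stay in place, or advance one phase if its entry gate (if any) is cleared."""
        options = [self.phase]
        index = PHASES.index(self.phase)
        if index + 1 < len(PHASES):
            candidate = PHASES[index + 1]
            gate = ENTRY_GATES.get(candidate)
            if gate is None or gate in cleared_gates:
                options.append(candidate)
        return options

    def advance(self, target: str, cleared_gates: set[str]) -> None:
        """Move to the target phase, refusing any transition that is not permitted."""
        if target not in self.allowed_next(cleared_gates):
            raise ValueError(f"{self.phase} -> {target} is not permitted")
        self.phase = target


arc = Arc()
arc.advance("P02", set())
arc.advance("P03", set())
blocked = arc.allowed_next(set())        # the gate into P04 is not cleared yet
arc.advance("P04", {"C_state"})          # cleared, so the arc moves into Anchor
```

The model never picks the next phase freely: it can only choose from `allowed_next`, which is how "can't improvise past a gate" becomes something testable.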

State Machine: Master Conversation Arc (interactive diagram)

Legend: Sanctuary (regulation), Anchor (inquiry), Alchemist (transformation), Celebration (integration), consent gate.

04 — Key Decisions

The calls that actually shaped the product

The choices that shaped this product were mostly about permission — what Madeline's allowed to do, under what conditions, and what the system does when those conditions aren't met. A few of those calls defined everything that came after.

Decision	Options Considered	What We Chose	Why
Confidence scoring model	Continuous multi-dimensional score (regulation × clarity × readiness)	Three binary consent gates (C_state, C_belief, C_readiness)	A continuous score sounds more rigorous. It's also nearly impossible to test, explain to a clinician, or debug when it misbehaves. Binary gates are blunt by design — and they work. False precision at MVP stage costs more than it buys.
Methodology encoding	Full framework in the system prompt	Vectorized content separated from hard-coded instructions	When instruction and methodology live in the same document, every content update is a potential behavior change — and you won't always know which one you triggered. Separation means Jenn can evolve the clinical framework without touching the behavioral layer, and vice versa.
Onboarding flow	AI-guided onboarding from session one	Deterministic non-AI onboarding	Onboarding is where consent gets captured, safety signals get recorded, and baseline preferences get set. None of that should be improvised. A language model that hallucinates through onboarding isn't creating a UX problem — it's creating a clinical one.
Re-regulation handling	Complete session restart on dysregulation	Return arc to P01-P03 with session context retained	Throwing away a user's context the moment they get activated isn't protecting them — it's abandoning them at the worst possible moment. The model loops back, but it doesn't forget. The thread stays intact.
Check-in timing	Single daily check-in cadence	AM/PM differentiated logic	A nervous system at 7am isn't doing what it's doing at 9pm. Morning sessions call for intention-setting; evening ones for processing. Madeline needed different behavioral defaults for each — not just different language.
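Two of the decisions above, re-regulation and check-in timing, are simple enough to sketch. This assumes a session object that carries the current phase and the accumulated context. The `resume_phase` field, re-entry at `P01` specifically, and the noon cutoff are hypothetical details added for the example.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Optional

REGULATION_PHASES = ("P01", "P02", "P03")


@dataclass
class Session:
    phase: str = "P01"
    context: list[str] = field(default_factory=list)  # what the user has shared so far
    resume_phase: Optional[str] = None                # where the arc was before looping back


def handle_dysregulation(session: Session) -> Session:
    """Loop back to the regulation phases without clearing the session context."""
    if session.phase not in REGULATION_PHASES:
        session.resume_phase = session.phase
        session.phase = "P01"  # assumption: re-entry starts at the first phase
    return session


def check_in_mode(local_time: time) -> str:
    """Pick behavioral defaults by time of day (hypothetical noon cutoff)."""
    return "intention_setting" if local_time < time(12, 0) else "processing"


session = Session(phase="P08", context=["belief: I must earn rest"])
handle_dysregulation(session)
morning = check_in_mode(time(7, 0))
evening = check_in_mode(time(21, 0))
```

The key property is that `handle_dysregulation` changes only the phase: the context list is untouched, so the thread survives the loop back.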

05 — What It Produced

What the work actually produced

The deliverable wasn't a prompt library. It was a behavioral operating system — something the engineering team could build against with confidence, and the clinical team could actually read and trust.

12

Named therapeutic phases with defined entry conditions, exit gates, and permitted conversational moves

3

Document library layers (hard-coded, vectorized, and strategy), each independently maintainable

3

Binary consent gates replacing a complex continuous scoring system, without losing clinical fidelity

1

AI persona, Madeline, with defined voice, values, edge case behavior, and explicit refusal logic

∞

Re-regulation paths designed so dysregulation during a session never discards a user's context

0

AI involvement in onboarding: a deliberate choice to protect consent integrity and safety data reliability

06 — What I'd Do Differently

What I'd do differently

Earlier evaluation scaffolding

I built the architecture before I built any way to test it. A rough LLM-as-judge eval framework from the start — even a scrappy one — would've grounded architectural decisions that stayed theoretical for longer than they needed to.
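For a sense of scale, a scrappy version could be as small as the sketch below. It assumes the judge is any callable that takes a phase, a reply, and a rubric and returns pass or fail. `keyword_judge` is an offline stand-in for a real model call, and its rule (no reframing during P01 to P03) is an illustrative assumption rather than the actual clinical rubric.

```python
from typing import Callable

# A judge takes (phase, reply, rubric) and returns True if the reply passes.
Judge = Callable[[str, str, str], bool]

RUBRIC = "The reply stays within the moves permitted for the current phase."


def keyword_judge(phase: str, reply: str, rubric: str) -> bool:
    """Offline stand-in for a model call: fail early-phase replies that jump to reframing."""
    early = phase in ("P01", "P02", "P03")
    return not (early and "reframe" in reply.lower())


def run_eval(cases: list[tuple[str, str]], judge: Judge) -> float:
    """Return the fraction of (phase, reply) cases that the judge passes."""
    if not cases:
        return 0.0
    passed = sum(judge(phase, reply, RUBRIC) for phase, reply in cases)
    return passed / len(cases)


cases = [
    ("P01", "Let's take one slow breath together."),
    ("P02", "Let's reframe that belief right now."),
    ("P08", "Would you like to reframe that belief?"),
]
pass_rate = run_eval(cases, keyword_judge)  # two of the three cases pass
```

Swapping `keyword_judge` for a function that calls a model keeps `run_eval` and the case list unchanged, which is what makes it cheap to stand up early.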

The confidence scoring pivot

I spent weeks designing the continuous multi-dimensional scoring model before landing on binary gates. In retrospect, the simpler design was always the answer. I should've started there and made the case for complexity only if the simple version failed.

Making the state machine visual earlier

A diagram like this one, produced in week two instead of month four, would've saved weeks of misaligned back-and-forth between clinical and technical stakeholders. People argue less about phase boundaries when they're both looking at the same picture.

Documentation as a product

The Master Document Library ended up as the shared contract between design, engineering, and clinical — the artifact everyone pointed at when they disagreed. It only got there because documentation was scoped as a deliverable from day one, not assembled after the fact from notes and Slack threads.
