Manifesto

War Rooms

A measured substrate for the practice of strategic preparedness. Vision is humanity's edge over AI; the room is where that vision is rehearsed. Build the priors. Convene the room. Play the future before it arrives.

40 min read

Abstract

The decisions that matter most — a capital allocation, a political move, a career pivot, a competitive response — are taken against a future that hasn't arrived yet. Surprise is expensive: it unseats leaders, buries incumbents, breaks portfolios. Good strategists already know what to do about this. They rehearse. Militaries wargame. Hedge funds pre-mortem. Campaigns red-team the opposition. Boards stress- test the scenarios they expect to face. The practice is old. What has been missing is a substrate that lets the rehearsal compound across sessions instead of fading the moment the meeting ends.

Meridians builds War Rooms. A role-played game with private hands and public moves, run on a measured substrate, where a team sits around the same board and plays one phased turn at a time. The product is built for teams that institute the room on a regular cadence — weekly for what moves fast, monthly for what moves slow. Cards signal intent in public; private logs hold actual intent. Cooperation and defection are first-class moves, not side effects. Most operator time lives on two surfaces — the map / board the room is playing on, and the raw graph substrate underneath. The room is three things at once: a role-play simulator for rehearsing the future, a strategy table for deciding on it, and a living expert system the team owns and trains against via question banks generated from its own world view. The full war game is the heaviest mode of engagement; the quiz bank and the expert-system query surface are lighter cadences that share the same substrate. You do not need to play the games to benefit — see The Loop.

The category, plainly: an evolving game that codifies reality. Meridians doesn't sit in consulting, prediction markets, fantasy sports, or educational software cleanly — it sits in the unclaimed segment where gaming, education, and strategy converge. Not a strategy tool dressed up as a game; not a serious game pretending to be entertainment. An evolving game that adapts to its players' scenarios and turns their reality into playable worlds. The expert system that emerges sharpens because the team keeps updating its world; the world stays alive because the team keeps playing it. Maintenance is the practice, and the practice is the value.

Vision is humanity's edge over AI. Models scale prediction, language, search, and optimisation faster than any operator. What they don't originate is the act of choosing which future to play toward, who deserves a seat, and which moves are worth making. The room hands that act back to humans. The engine handles the supporting work — arbitrating rules, holding private state, scoring every committed move.

The substrate is a world view: a typed, continuously mutating knowledge structure extracted from any coherent text — a market brief, a memoir, a paper, a doctrine, a novel — and enriched by every decision the room commits. Three force fields measure how hard it is working. System is the rules. World is the actors. Fate is the way reality lands on what the world view believed. LLM extraction at low temperature; deterministic formulas turn deltas into scores. Same input, same score.

Private first; public is the open question. Private rooms are the product today: closed tables on the local data model, compounding a single team's edge with no vendor in the middle. Public rooms — community games on substrates of broad interest, with stakes as an optional layer (fictional, reality-anchored, or real) — are the second-phase ambition. We think the public surface could be a more agentful alternative to prediction markets: you play the actor that produces the outcome instead of pricing it. We are also clear-eyed that community games, network effects, and a stakes layer are unproven terrain. Private is where the conviction lives; public is where the curiosity is.

The skill ceiling is your priors. The math is fixed and cheap. The depth and freshness of what you feed the substrate decides the result. Hold one world view deeply — the hedgehog — or hold many at once, calibrating as evidence comes in — the fox^{Berlin 1953}^{Tetlock 2005}^{Tetlock & Gardner 2015}. The substrate serves both. The deeper sections build it; the War Rooms and Practice sections close the loop. Institute the room.

Why Practice

Two claims, both well-established. The first: good strategists rehearse the future. Every serious general staff, hedge fund, and political shop already does this — informally, often inconsistently, but wherever the cost of being wrong is asymmetric, somebody is running the scenario before it arrives.^{Perla 1990} The second: game-like environments are how skill compounds. Explicit rules, repeated play, scored outcomes, adversarial pressure — the conditions deliberate practice requires. Where those conditions hold, skill accumulates. Where they don't, judgment drifts on first impressions and survives only because nothing tested it.

The gap is not analysis — it is structured rehearsal with a substrate. The boardroom turns into a wargame the morning the competitor announces a price cut, the regulator opens a file, the market opens against you. By then the room is unrehearsed, the priors are thin, and the seat of the adversary is empty. Unaided executive judgment loses to systematic biases: overconfidence, anchoring, confirmation^{Kahneman et al. 2011}^{Lovallo & Kahneman 2003}. Strategy decks don't commit structurally. Forecasts chase precision and lose calibration. Long memos go unchallenged. Foundation models give scale and fluency, then forget what they wrote three sections back. None of these create a feedback loop. A war room does.

A room that meets weekly to play the next quarter builds reflexes the unprepared room can't improvise. A room that meets monthly to play the next year builds doctrine the unprepared firm can't copy. A solo operator running the same practice on their own portfolio or political bet earns the same compounding edge. Meridians gives the habit a substrate.

Why now. Three things changed at once. Long-context LLMs are cheap enough that a war-room session costs cents. Scheduled remote meetings made weekly cadence operationally trivial. And the labour-displacement conversation around AI put a premium on the one thing models don't do — originate vision. The room itself is not new; what changed is the cost of running it and the permission to bring it to anyone outside the general-staff tradition.

Lead buyer: the investment team. An investment committee, hedge-fund analyst pod, or family-office strategy table that already runs pre-mortems and red-team sessions, for whom calibrated forward play against named scenarios is the everyday work. They have the cadence, the budget, the appetite for being wrong sharply instead of vaguely, and the institutional vocabulary the room is built to fit. One credible pilot here is what earns the next ten sales; this is where we are pointing the wedge.

Also possible, not the wedge. Defence and policy consultancies running wargames. Political shops war-gaming opposition response. Solo operators currently substituting coaches and reading. Tabletop strategy groups looking for a high-fidelity sandbox. These are real markets, the engine fits them in principle, and one of them may well become the lead segment once the first pilot lands. We are not pretending all five are equally ready, and we are not citing a top-down market figure for any of them — the question that matters is whether the investment-team pilot earns the case study. Earn the morning the surprise lands.

The Substrate

A room can't play forward without a shared ledger of where it stands. The substrate is that ledger. Every world the room cares about — a market regime, a campaign theatre, a portfolio, a novel, an alternate-history timeline — is modelled as a knowledge graph^{Hogan et al. 2021} that updates step by step: one page per actor, location, rule, or open question, updated only when a session reveals something new. An LLM writes down what changed; deterministic formulas compute how much was revealed. Reading and measurement stay separate — the LLM interprets, the math scores, and the score stays reproducible. Changes come in two kinds — encyclopedic (new facts) and possibility (outcomes becoming alive or dying) — captured by three delta layers:

1.System graph deltas — the encyclopedic kind. New entries in the world's rulebook: principles, systems, concepts, tensions, events, structures, conventions, constraints. Each entry is a node; connections between them are typed edges. Depth emerges from connectivity, not lexical volume.
2.World deltas — also encyclopedic, but about the people. New entries on the pages of specific characters, locations, and artifacts: learns, loses, becomes, realises, plus relationship valence shifts. These accumulate as persistent state attached to the entity whose page was written on.
3.Thread deltas — the possibility kind. Every open question (rivalry, secret, quest, unresolved claim) carries a belief — a live distribution over outcomes. Together they form the world view's Belief System: its current stance on everything undecided, always in flux. Each scene is reality landing, asking the stance to revise. Deltas emit integer evidence in [−4, +4] plus one of nine log-types; the math handles log-odds, decay, volatility, closure, and abandonment. Fate is what reality just exacted on a high-attention belief.

Three forces follow, one per delta layer. Together they read how hard the world view is working this scene:

System — the rules that govern. Grows as new principles, structures, and constraints accumulate.
World — the lived layer. Grows as characters, locations, and artifacts reveal who they're becoming.
Fate — reality landing on belief. Grows as the odds move on open questions: rivalries, secrets, quests, unresolved claims.

The mix of these three forces is a work's signature — a point on the unit 3-simplex recovered from the dominant principal component of . Archetypes name its neighbourhoods: Paper System-dominant, Stage World-dominant, Classic Fate-dominant, Opus balanced. Each force is rank-transformed to a standard normal first — distribution-free and bounded — so length, genre, and outliers don't bias the comparison. The cumulative network — every entity, thread, and system node weighted by cross-graph attribution count — surfaces the load-bearing hubs and bridges without touching the deltas.

A note to the reader. The sections that follow (Hierarchy through Markov Chains) describe how the substrate is built and measured. The buyer doesn't need to follow the math to use the room — the math is what keeps scoring reproducible and generation principled. If you trust the practice, the War Rooms section is where the product picks back up.

Hierarchy

Before the room sits down, the substrate has to know what it's sitting on. Long-form worlds — narratives, papers, simulations, market regimes, campaigns — decompose into five nested layers. Structure generation (scenes with deltas) runs independently of prose generation (beats and propositions), enabling parallel processing and precise attribution. The same layered view holds when a War Room session is treated as a scene with its own deltas: the cards played become the structural moves, the negotiation log becomes the prose.

Narrative — The full knowledge graph: all characters, locations, threads, relationships, and system knowledge. Persists and grows across the entire timeline.HP: Harry, Hogwarts, the Philosopher's Stone quest, Snape's ambiguous loyalty, the rules of wand magic — all as graph nodes and edges.

Arcs — Thematic groupings of 5–8 scenes with directional objectives. Direction vectors recompute after each arc based on thread tension and momentum.HP: “Arrival at Hogwarts” (Sorting Hat through first classes) — establishing threads, expanding the world, seeding rivalries.

Scenes — Atomic units of structural delta. Each scene records thread transitions, world deltas, and knowledge graph additions. Forces derive from these deltas, not from prose.HP: The troll fight — “friendship with Hermione” thread jumps latent → seeded, relationship delta between Harry/Ron/Hermione, knowledge node for troll vulnerability.

Beats — Typed prose segments with a function (breathe, inform, advance, turn, reveal, etc.) and delivery mechanism (dialogue, thought, action, etc.). Generated as blueprints before prose is written.HP troll scene: breathe:environment (bathroom, troll stench) → advance:action (Ron levitates the club) → bond:dialogue (“There are some things you can't share”).

Propositions — Atomic prose units (20–60 words) that execute beat intentions. The smallest embeddable unit for semantic search.“The troll's club clattered to the floor. In the silence, Ron was still holding his wand in the air.”

Forces are computed from deltas without examining prose. Revision edits beats without modifying scene structure. Every layer is independently auditable.

Forces

The three forces are how the room reads where its world is right now. Abstract — the rules. Physical — the entities acting under them. Possibility — what could still happen. System, World, and Fate score each one. Fate is possibility, not probability: what could happen, not what will.

Modes weight the fields differently: papers grow mostly System (stating and connecting rules); simulations mostly observe Fate (exploring outcomes under a ruleset); narratives fire all three. Same formulas; different signatures.

System

System is the abstract field — the rules, structures, and concepts that form the substrate. Every scene can add a new entry (a magical law, a political system, a social convention) or a new cross-reference between entries. The world's physics — what's possible, what costs what — grows by accumulation.

counts new nodes (principles, concepts, structures); counts new typed edges. Nodes scale linearly — each is genuinely new ground. Edges scale sub-linearly — the first connections into an entry do most of the interpretive work; bulk additions shouldn't dominate.

World

World is the physical field — the entities who act within the world's rules. If System is the encyclopedia of how the world works, World is the dossier on each entity: a separate page for every character, location, and artifact, updated whenever a scene reveals something about them.

Symmetric to System. counts continuity nodes added to entity dossiers (traits, opinions, goals, secrets, capabilities); counts continuity edges between them. System tracks entries about the world; World tracks entries about specific entities.

Fate

Fate is the possibility field — reality manifesting on the world view's Belief System, reshaping its stance scene by scene. Where System and World measure what the world view has accumulated, Fate measures what reality does to those holdings: trials, reversals, resolutions. The unifying force across the other two — without fate, the abstract has no reason to deepen and the physical has no destiny to bend toward; the world view would hold a stance but never be answered for it.

Picture an election-night needle, or the live win-probability line during a football game. A stance rendered as a bouncing line — flat for stretches, nudged by small evidence, lurching on decisive plays, converging at the finish. Every thread carries such a line, and the world view holds them all at once. “Will Frodo destroy the ring?” has one between yes and no; “Who claims the Iron Throne?” has one per contending house. Fate is the total movement on those lines this scene — the price reality has just exacted on what the world view thought it knew.

Made rigorous, each thread carries a stance — a probability distribution over named outcomes — priced as softmax over a per-outcome logit vector. Threads are the questions through which reality reaches the world view; stances are the bearings it currently holds in answer. Aggregated, they form the world view's Belief System: a working model of everything still undecided, always in flux. Scenes shift each stance by emitting bounded integer evidence on affected outcomes. Fate is the attention-weighted information gain across every stance touched:

are pre/post distributions over thread 's outcomes; is pre-scene volume; is Kullback–Leibler divergence^{Kullback & Leibler 1951}^{Cover & Thomas 2006}. No tunable constants — no log-type multipliers, no closure bonuses, no scene-level denominators. Fully specified by the per-thread evidence vector and pre-scene attention.

Every behaviour falls out of this one form. Pulses leave so KL is zero — a vivid scene earns no fate if no stance moved. Confirmations of the favourite keep KL small. Twists land mass on an outcome the prior assigned little weight; the per-outcome contribution spikes exactly where the prior was small — a swerve onto an unlikely outcome scores disproportionately higher than a symmetric step toward the favourite. Closures concentrate the distribution onto a single outcome; resolution scenes dominate their arcs without explicit bonus. Attention falls out of the multiplier: same stance movement weighs more on a tracked thread than on a forgotten side-thread.

In narratives, threads are rivalries, quests, secrets. In papers, open questions, contested claims. In simulations, the branching outcomes a scenario is designed to observe. Every world view carries a Belief System over its threads; the framing works universally.

Measurement, not target. Unlike World and System, Fate has no per-scene floor. Evidence in [−4, +4] reads what a neutral observer would update on given the scene's concrete events — not a knob tuned toward a target. Reality lands as hard as it lands. Routine scenes emit pulses () and earn fate near zero — the stance survives untested; pivotal scenes emit committal evidence () and earn it — trials and tribulations the Belief System has to answer for. The math recovers the work's shape only when extraction is faithful to the page. The Fate Engine covers how the inputs get priced.

Activity

A work reveals in two kinds — encyclopedic (World, System) and possibility (Fate) — summed on a common scale they give a single per-scene reading. The activity curve records the total rate at which the revelation machine is working.

Each force is first rank→Gaussian normalised, , placing all three on a common axis independent of natural units. The weighted sum expresses activity level in units of standard deviation from the work's own mean.

The weights are the work's signature. Recovered by principal-component analysis on the three normalised force curves: PC1 — the direction of maximum variance in space — identifies the axis the work moves along most; its absolute loadings, renormalised to the unit simplex, give the weights. Signature is a property of the text, recovered from its variance.

Reading the curve. A peak () is a moment where the forces fire together in the work's own vocabulary. A valley () is a quiet stretch setting up what follows. Peaks and valleys map rhythm, not merit.

Fate Engine

A world view doesn't hold a fixed picture of itself; it holds a Belief System, and that belief shifts as reality tests it. Threads are the units of that reckoning — each carries a stance, a live probability distribution over named outcomes. Each thread poses a question ("Will Harry claim the Stone?") and lists two or more outcomes (binary default; multi-outcome enumerates). The stance is priced as softmax over a per-outcome logit vector:

Three state variables: logits price the distribution; volume tracks accumulated attention; volatility (EWMA of recent logit shifts) flags recent movement.

Evidence updates

The LLM emits bounded integer evidence per affected outcome plus a logType from nine primitives (pulse, transition, setup, escalation, payoff, twist, callback, resistance, stall). Evidence shifts logits via log-odds arithmetic:

Sensitivity means a saturating +4/−4 split shifts the margin by 4 logit-units — exactly enough for base closure. The grammar matches the game-theory stake-delta scale used elsewhere; one mental model spans both. logType must agree with magnitude (setup +0..+1, escalation +2..+3, payoff +3..+4, twist ±3 against prior trend).

Volume decay — natural selection

Threads not touched by a delta lose volume geometrically:

Threads with are abandoned — out of the active Belief System without being closed. The Belief System self-organises: threads that matter accumulate volume; ignored threads slide off. Resurrection costs — deliberate attention only.

Outcome expansion

Stances can grow mid-story via addOutcomes — when a scene opens a possibility that didn't exist before (new contender, unexpected option). New outcomes enter at ; same-scene evidence can shift them. Closed stances reject expansion. A delta that expands outcomes cannot also close.

Closure — meaningful resolution for meaningful outcomes

A thread closes when the top-outcome margin exceeds a volume-scaled threshold AND the closing scene emits a committal logType (payoff or twist) with :

is opening volume (default 2). Heavy-attention threads need proportionally more decisive finishes; side threads close on the base threshold. Saturation alone doesn't trigger closure — pseudoclose is explicitly prevented.

On close, resolution quality is the geometric mean of four factors: peak evidence at close, margin over threshold, volume, and probability concentration. Bare-minimum evidence with low volume scores ~0.3; heavy stances closed on saturating two-sided evidence score above 0.75.

Focus window — what generation sees

Each scene, the top-K threads by focus score surface to the generator:

is normalised entropy; is scenes since last touched. High focus = high volume + genuinely contested + recently moved. Saturating, closed, and abandoned threads score zero. .

Belief system as narrative prior

Beyond measurement, the Belief System shapes generation. Current stances surface to the generator as a soft prior, not a constraint. Committed threads () lean the next scene toward that outcome unless the logType is twist; contested stances () signal a crossroads where either side is fair game; high volatility grants licence for a twist; low volatility + high probability is saturation, ripe for closure. Good works briefly spike uncertainty at key pivots: twists and reversals raise aggregate entropy and the reader re-engages. Flat entropy is mid-work drag; compounding entropy spikes followed by clean collapses are the rhythm of a gripping work.

The feedback loop with causal reasoning

Fate is one of three forces. The reasoning graph is where they converge — the Belief System exerts pressure, world entities carry agency, system rules impose constraints. Fate is a voice in the argument, not the conductor.

The reasoning graph does not force threads to resolve. It receives each active thread tagged — LEANS, ACTIVE, CONTESTED, VOLATILE, FADING — and treats it as pressure. Strong-LEANS threads with volume earn fate nodes that land; CONTESTED threads often earn nothing (a legitimate pivot-arc shape); FADING threads decay. Fate nodes are what the reasoning concludes, not what it was forced to serve.

The loop closes: scenes are reality landing → the Belief System revises → the next arc's reasoning graph sees a new stance → the graph lands what the updated stance can honestly earn → more reality. Threads that matter accrue volume and close with high resolution quality; threads that stop mattering decay into abandonment. No explicit horizon primitive is needed — natural selection through volume decay and focus-window ranking handles the lifecycle. What the world view is at any moment is just where this loop has carried it.

Validation

The room is only useful if the substrate underneath earns its keep. The claim is testable, and the test is concrete. The activity curve below was computed entirely from structural deltas extracted from Harry Potter and the Sorcerer's Stone — no prose scored, no scenes hand-ranked. The annotations land where they do because the formulas read the book the way a reader does, deterministically. Orange above zero: scenes where fate and world move together. Light blue below: the quieter stretches that set up the next peak. Same math runs on a campaign log, a portfolio quarter, or a War Room transcript.

Harry Potter and the Sorcerer's Stone — 73-scene smoothed activity curve. Orange above zero marks high-activity scenes; light blue below marks quieter setup stretches.

The peaks line up with scenes where HP's three channels fire together: Hagrid's reveal, the Gringotts vault, the first Hogwarts lessons, the Flamel hunt, the Quirrell-Voldemort confrontation. Threads commit, entities transform, and the world's rules snap into focus at once. The peaks are not chosen by taste; they emerge from the deltas.

The valleys are equally load-bearing. The Dursleys' opening normalcy, the three-headed-dog aftermath, the winter stretch before the Forbidden Forest, the denouement — none of these resolve a thread. They are turning points: tension is seeded, a boundary is crossed, a character glimpses the unknown. Structurally they contribute less to each force, so the curve dips; the energy they store is what makes the next peak feel earned.

Peaks are where the story commits; valleys are where it launches. The rhythm between them is the narrative's pulse, and both sides of the zero line carry weight.

The core claim: deterministic formulas applied to structural deltas recover the dramatic shape of a narrative. The LLM extracts deltas at low temperature; the math is fully deterministic. Cross-run validation confirms stable rankings, and the same formulas drive generation — the measurement is the objective function.

The implication runs past the proof of concept. Recovering Harry Potter's dramatic shape from delta arithmetic alone extends a small empirical tradition — emotional-arc and narrative-shape recovery from text^{Reagan et al. 2016}^{Boyd et al. 2020} — by reading not just sentiment but the three structural force-fields beneath it. Coherent text has measurable structure. The forces are domain-agnostic by construction: System counts rules and their connectivity, World counts entity-state changes, Fate counts information gain on open questions. A 73-turn campaign, a 73-paragraph paper, and a 73-step strategy plan all accumulate those same three things, and the same math reads them. What we have demonstrated is the narrative case; the cross-domain claim is the working hypothesis the rest of the engine is built against. The novel proves the math is well-formed and reproducible. Whether it produces equally legible readings of a market regime or a competitor's next move is the next thing to prove, not something we are claiming today.

Grading

Each story receives a score out of 100, with 25 points allocated to each of the three forces plus swing — the Euclidean distance between consecutive force snapshots, measuring dynamic contrast. The grading curve is piecewise, calibrated so published works land in the 85–92 range.

A single exponential with three constraints: floor of 8 at , dominance threshold of 21 at , and asymptote of 25. The rate constant is fully determined by these constraints. The curve naturally decelerates — early gains come easily, the last few points before the reference mean are harder to earn, and exceeding reference yields diminishing returns. Quality bands: bad (8–15), mediocre (15–20), good (21–25). At (matching the reference mean), the grade is 21 out of 25 — the dominance threshold used by the archetype classifier. Above reference, exponential saturation makes each additional point harder to earn, asymptoting toward 25. Reference works land between 85 and 92.

Each force is normalised against a reference mean so that scores are comparable across works of different signatures and lengths.

The overall score sums all four sub-grades: , where is swing. Swing values are already mean-normalised by the reference means during distance computation, so no separate reference mean is needed — is applied directly to the average swing magnitude.

Forecast calibration is a separate ledger from narrative grading. Probabilistic forecasts — thread-stance probabilities over named outcomes, scenario softmax cohorts — are scored against landed reality with a strictly proper scoring rule^{Brier 1950}^{Gneiting & Raftery 2007}. Strict propriety is the technical reason it's the right tool: the rule is uniquely minimised by reporting the operator's true belief, so honest reporting is the dominant strategy.

Embeddings

A room running for months acquires more committed text than any single operator can hold in mind. The substrate keeps it searchable. Forces operate at the scene level; readers and players experience prose, composed of propositions — atomic claims that must be accepted as true within the world. “Harry has a lightning-bolt scar.” “The wand chooses the wizard.” Forces measure what changes in the knowledge graph; propositions measure what is stated in the prose. Every proposition is embedded as a 1536-dimensional vector (OpenAI text-embedding-3-small^{OpenAI 2024}), transforming prose into a geometric space where similarity is distance^{Reimers & Gurevych 2019}.

Coherent writing behaves like a proof graph: each proposition introduces, derives from, or resolves prior content. A plot hole reads as a broken inference chain, a satisfying resolution as a deep tree closing. The honest caveat: cosine similarity is geometric approximation, not logical inference — two propositions can cluster tightly from shared subject matter alone. The proof graph we recover is therefore soft — a well-shaped prior surfacing probable dependencies, not a verdict.

Activation

The full pairwise similarity structure is computed via matrix multiplication — where is the L2-normalized embedding matrix — accelerated by TensorFlow.js. From this matrix, each proposition receives two scores: backward activation (how strongly it connects to prior content) and forward activation (how strongly future content builds upon it).

Hybrid activation score

The hybrid of maximum (depth — strongest single dependency) and mean-top- (breadth — cluster of strong connections) with produces a robust activation score. A proposition is HI if its score exceeds an absolute threshold of 0.65, determined by systematic parameter sweep maximizing cross-work distributional variance ( across four reference works). The backward/forward binary produces four structural categories — Anchor, Seed, Close, Texture — detailed in the Classification section.

Classification

Classification operates at two levels: propositions (the atomic claims within prose) and narratives (the overall structural profile). Proposition classification identifies load-bearing content for generation. Narrative classification categorizes works by force dominance for comparative analysis.

Propositions

Each proposition is classified along three axes: backward activation (does it resolve prior content?), forward activation (does it plant future content?), and temporal reach (how far its connections span). The hybrid activation score () is thresholded at 0.65, calibrated by parameter sweep across four structurally distinct works. Reach is local (within-arc) or global (cross-arc), with the threshold set at 25% of total scenes (minimum 5) so “global” means the same thing whether the narrative has 20 scenes or 200. The combination yields eight categories:

anchorHI back / HI fwd · local

Load-bearing within an arc. Immediate structural tension that connects what just happened to what comes next.

foundationHI back / HI fwd · global

Thematic spine. Load-bearing both directions with connections spanning the full narrative.

seedLO back / HI fwd · local

Short-range foreshadowing — the Remembrall leading to Harry becoming Seeker one scene later.

foreshadowLO back / HI fwd · global

Cross-arc Chekhov's gun — Harry's scar mentioned in chapter one, structurally active in the climax.

closeHI back / LO fwd · local

Resolves recent setups. Terminal within the arc — satisfying fate that doesn't seed further.

endingHI back / LO fwd · global

Resolves distant seeds — “Snape hated Harry's father” closing a thread from 46 scenes back.

textureLO back / LO fwd · local

Scene-level atmosphere and sensory grounding. Structurally inert but narratively essential.

atmosphereLO back / LO fwd · global

Ambient world-color across time. Recurring tonal motifs that persist without driving structure.

Causal Continuity

Classification transforms generation into causal continuity management. An LLM generating scene 45 receives not just recent context but the specific propositions from scene 3 that embedding similarity identifies as structurally connected — the foundations and foreshadows that new prose must not contradict. A foreshadow in chapter one constrains what can be validly stated in chapter twenty.

The resulting distributions align with structural expectations: Harry Potter yields 29% Anchor — consistent with a tightly plotted novel whose threads span the full narrative. Alice's Adventures in Wonderland shows 25% Anchor — lower, reflecting its episodic structure. LeCun's paper scores 14% Anchor and 53% Texture, characteristic of academic argumentation with section-local claims. A five-section methods paper (Quantifying Narrative Force) reaches 67% Texture. These distributions emerge from cosine similarity alone — the same threshold and the same formula applied uniformly across fiction, academic writing, and methods papers.

Archetypes

At the narrative level, each text is classified by which forces dominate its profile — a force is dominant if it scores ≥ 21 and falls within 5 points of the maximum. A “Chronicle” (World + System) and a “Stage” (World-driven) demand different pacing, thread management, and revision priorities.

Opus

All three balanced

Series

Fate + World

Atlas

Fate + System

Chronicle

World + System

Classic

Fate-driven

Stage

World-driven

Paper

System-driven

Emerging

Finding its voice

Narrative Shapes

Beyond archetypes, the Gaussian-smoothed activity curve is classified into one of six shapes using overall slope, peak count, peak dominance, peak position, trough depth, and recovery strength.

Climactic

Build, climax, release

Episodic

Multiple equal peaks

Rebounding

Dip then recovery

Peaking

Early peak, trails off

Escalating

Rising toward the end

Flat

Little variation

Scale

Scale classifies a narrative by structural length — scenes across all arcs. Thresholds are derived from empirical analysis of a reference corpus spanning short fiction (Alice's Adventures in Wonderland, 22 scenes) through novels (Harry Potter, 73 scenes) to epic-length serials.

Short

< 20 scenes

Story

20–50 scenes

Novel

50–120 scenes

Epic

120–300 scenes

Serial

300+ scenes

World Density

World density measures narrative richness relative to length: (characters + locations + threads + system knowledge nodes) / scenes. Tier thresholds are derived from the same reference corpus, spanning genre fiction, literary fiction, and academic texts.

Sparse

< 0.5 entities/scene

Focused

0.5–1.5 entities/scene

Developed

1.5–2.5 entities/scene

Rich

2.5–4.0 entities/scene

Sprawling

4.0+ entities/scene

Reasoning Graph Nodes

The causal reasoning graph classifies every node into eight typed roles across three tiers. Pressure (fate, warning) forces change. Substrate (character, location, artifact, system) is what’s changed. Bridge (reasoning, pattern) connects them.

fatea thread's gravitational pull — what must resolve, and in which direction

reasoninga logical step connecting what fate needs to what entities can supply

characteran active agent whose position, knowledge, or relationships move the arc

locationa setting that enables or constrains what can happen

artifactan object whose presence, transfer, or loss carries narrative weight

systema rule of the world — magic, economics, social norm — that shapes action

patternan expansion agent — unexpected collisions, emergent properties, creative surprise

warninga subversion agent — predictable trajectories or unpaid costs to disrupt

Edges carry equal semantic weight: requires (the workhorse), enables, constrains, risks, causes, reveals, develops, resolves. Edge type shapes both how the LLM walks the graph during scene generation and how the visual tree lays out.

Interrogation

Before the room sits down to play, the operators interrogate the world. Forces and embeddings measure what’s on the page; a knowledge graph becomes a usable world only when it is probed. Four instruments compose a four-layer diagnostic of a world’s interior — each revealing a structure the prose never summarises, and each feeding the room's pre-game briefing on who wants what and which seats matter:

surveys1 question · N respondents

cast-wide distribution. Eight lenses tilt the axis (Personality / Values / Knowledge / Trust / Allegiance / Threat / Predictions / Backstory). Fifteen respondents on “do you trust X?” produces a row of the trust matrix, not a number.

interviews1 subject · N questions

single-mind depth. AI generates 5–7 questions tuned to the subject's continuity, mixing binary / likert / estimate / choice / open. Compose: survey to find outliers, interview to probe them.

decision matrix2×2 game-theory decomposition per beat

strategic structure beneath the prose. Each game carries an axis (14 types — disclosure / trust / stakes…) and a shape (19 types — dilemma / stag-hunt / signaling…) with integer stake deltas in [−4, +4]. Additive: written to scene.gameAnalysis, never mutates deltas.

elocontinuous margin across games

strategic trajectory across the story. Margin score folds stake-differential into expected-vs-actual math — a +4/−4 crush = 1.0, a +1/0 edge = 0.56, a tie = 0.5. Behavioural tags (extractor / schemer / dominant / steady / rival:X) fall out of the trajectory.

Every respondent answers in-character from its own world- graph continuity, grounded in what that specific entity knows. ELO^{Elo 1978}^{Glickman 1999} uses a continuous margin rather than binary W/L:

Margin score from A's perspective

Surveys sample breadth, interviews profile one mind, game theory names the strategic shape of a beat, ELO tracks who accumulates advantage. Narrative and strategic structure are orthogonal: a force-balanced scene can contain an unresolved prisoner’s dilemma, and that orthogonality is what makes the fourth layer informative.

Reasoning Graphs

When the room asks what could happen next, and why? — the question scoring alone cannot answer — the substrate hands it a reasoning graph. An arc is four to eight scenes (or sessions) carrying a single chunk of work: advancing a thread, exposing an actor, planting a payoff. A thread escalates because an entity learned something, which required access to a location, which required an artifact to change hands, which was constrained by a system rule foreshadowed three scenes earlier. Consequence isn't a line. It's a graph.

The architecture preserves this graph explicitly. Before any scene of an arc is generated, a Causal Reasoning Graph (CRG) is built: a typed graph of what must happen and why. Scenes then execute the graph rather than improvising local transitions. A longer-lived Phase Reasoning Graph (PRG; the UI calls it the Mode Graph) sits beneath every arc — the working model of the world's patterns, conventions, attractors, agents, rules, pressures, and landmarks — so each CRG reasons within the same world-physics rather than re-deriving it. Loose observations and source fragments queue in an editable driver surface until they fold into one of these graphs and become canonical. The node and edge taxonomy — eight node types across pressure, substrate, and bridge tiers, plus eight edge types — is enumerated in the Classification section.

Thinking Modes

How the graph is built is as structural a choice as what’s in it. Four modes cover the 2×2 of direction (forward from a premise ↔ backward from an outcome) and scope (selective — commit to one ↔ expansive — keep many). The four map onto the classical epistemological typology: abduction ^{Peirce 1903} as inference to the best explanation, deduction and induction in their textbook senses, and divergent thinking as the named cognitive mode for expansive ideation^{Guilford 1967}. Click through the animation below to see each mode’s distinct shape; the prose then unpacks how each actually builds a graph.

Abductionbackward · selectivedefault

Start from what the arc must end at — a thread resolution, a character turn, a payoff — and ask which hypothesis, among competitors, best produces this? The engine generates several candidate causal chains in parallel, then commits to the strongest. Anchor discipline keeps the rejected lanes visible as it builds; once the first prior commits, abduction can silently flip into deduction and stop considering alternatives.

Divergentforward · expansive

Start from one source — an entity, event, or thread — and branch into many possibilities without committing. A final check asks which leaf-pairs are mutually exclusive. The mode for world expansion and collision discovery, when the goal is surprising adjacencies rather than a specific outcome.

Deductionforward · narrow

Given a premise, derive the single necessary consequence at each step. No branching, no alternatives. The mode for arcs where the premise fully determines the outcome — siege logistics, inheritance politics, the endgame of a trap already walked into. Branching means drift into divergent and must correct.

Inductionbackward · generalising

Many observations → inferred principle. The engine collects prior events and asks: what pattern underlies these? The answer is promoted to a principle-level claim that governs future scenes. At least one competing generalisation survives as a live alternative. Useful for backfilling worldbuilding or surfacing a thematic claim the prose has been enacting implicitly.

Two further knobs shape the palette: force preference (fate / world / system / chaos / freeform) weights the node-kind mix, and network bias (inside / neutral / outside) tilts activation toward hot-recurring or cold-fresh entities. Each new arc also receives the previous arc’s graph with a divergence directive — commitments must differ in kind, the reasoning chain must switch modes — so successive arcs don’t re-describe the same causal spine.

The Graph

Whatever the mode, the object produced is the same: a small typed graph. In the default abductive pass, generation does not start from the current scene asking “what happens next?” It starts from Fate — the threads the story owes the reader — and asks what would have to be true for these threads to advance? Each answer becomes a reasoning node, which pulls in the entities that can fulfil it. Pattern and warning nodes inject in parallel — patterns push for unexpected collisions, warnings flag the predictable path so the arc doesn't take it.

Edges carry equal semantic weight. Requires is the workhorse (“what must be true for this to happen”), joined by enables, constrains, risks, causes, reveals, develops, and resolves. The result is a small causal graph — typically 8–20 nodes per arc — that the LLM walks as it generates. Scenes execute the graph; threads advance because an entity was forced to decide, not because the prompt said so.

Below, a worked example: a causal reasoning graph built for a single arc. Fate nodes sit at the top (the threads the arc owes the reader), reasoning nodes bridge downward, and character / location / artifact / system nodes ground the chain in specifics. A pattern node and a warning node inject sideways — one pushing for unexpected collision, one flagging the predictable path to route around.

World Expansion

At phase boundaries, world expansion introduces new characters, locations, artifacts, and threads — each seeded with knowledge asymmetries that drive future conflict. Expansion produces its own reasoning graph justifying why each new entity exists, then hands them to the next arc's causal graph as substrate. Long-range phases provide structure; reasoning graphs provide short-range causality that evolves arc by arc.

Variable Scenarios

The room rarely needs one future. It needs a spread of them, weighted by how plausible each is. Reasoning graphs commit to a single chain — the arc's spine. Variable scenario modelling is the complement: a cohort of timelines with relative probabilities, presented to the room as a compass of decisions worth playing. The reasoning graph asks what must happen and why; variables ask what could happen, and how likely.

The arc decomposes into a small pool of variables — the load-bearing forces that most reshape trajectory if they shift. Each scenario is one coordination over that pool: a pattern of intensities (0 off → 4 extreme). A priorLogit ∈ [-4, +4] scores each scenario relative to its siblings; softmax across the cohort produces the displayed probability — the cohort reads as a compass over possible continuations.

46%Modal continuation

28%Slow consolidation

18%External disruption

8%Mechanism rupture

Parallel coordinates over five variable axes. Each scenario is a polyline; vertical position is intensity (0 off → 4 extreme). Probabilities are softmax over per-scenario priorLogits — most mass clusters on modal continuations, a thin tail covers rupture.

Two surfaces, power-law shape

Present — the arc's load-bearing variables right now. Future — a cohort of next-arc scenarios over a shared pool. Reality doesn't distribute uniformly: most futures cluster near modal continuation; a thin tail covers rupture. The cohort matches the shape it's drawn from — tight when the possibility space is tight, fat-tailed when a load-bearing mechanism could ignite. PriorLogit is independent of intensity: intensity carries magnitude, the logit carries rarity.

From scenarios to branches

Scenarios drive Scenarios: one parallel arc continuation per scenario. On commit, every scenario attaches as a sister branch; the softmax-top scenario's branch becomes active. Every committed run carries the variable fingerprint that produced it, so the substrate can compare what actually played out against the prior the Compass assigned.

Voice

The room ends each session with a record — negotiated agreements, commitments, reveals, the narration the substrate writes up. That record is prose, and it deserves the same craft as any authored work. Generation separates content (what is written) from accent (how it sounds). Content comes from beat plans — blueprints specifying the work each paragraph performs. Accent comes from prose profiles — statistical fingerprints of authorial voice reverse-engineered from published works the room can choose to write in. Each beat is classified by function (a 10-item taxonomy: breathe, inform, advance, bond, turn, reveal, shift, expand, foreshadow, resolve) and delivered through one of 8 mechanisms (dialogue, action, thought, narration, environment, memory, document, comic). The payoff: structural control without stylistic constraint. Swap the profile and the same session renders in a different accent.

Pacing is controlled by Markov chains over the same vocabulary^{Norris 1998}. Two layers, both derived the same way (classify each unit, count consecutive transitions, normalise rows). Layer 1 operates at the scene level — an 8-state matrix sampling force profiles for pacing. Layer 2 operates at the beat level — a 10-state matrix over beat functions controlling the texture of the written record.

Layer 1: Pacing Chains (Scene → Scene)

The eight cube corners form a finite state space. Each scene occupies one corner; consecutive scenes form an empirical Markov chain , where is the probability of moving from mode to mode . Raw forces are computed per scene, z-score normalised across the novel, and classified into corners.

Harry Potter and the Sorcerer's Stone — pacing chain. 73 scenes, 72 transitions, 38 unique edges.
Node size = visit frequency. Edge thickness = transition count.

Harry Potter's pacing chain is broadly distributed: entropy 2.78/3.00, self-loop rate 20.8%, fate-to-buildup ratio 35/38. Rest (16 visits) and Closure (15) lead, with Growth (12) and Epoch (11) close behind — the story spends most of its time either breathing or earning its peaks, with high-force scenes punctuating rather than dominating. The strongest transitions (Rest→Rest 5x, then Closure→Growth, Climax→Rest, and Epoch→Closure each 4x) show a rhythm of build, culminate, settle, build again.

Other works produce strikingly different fingerprints. Nineteen Eighty-Four is fate-heavy — 72% of scenes land in the top four corners, reflecting Orwell's sustained pressure rather than Rowling's pivoting. The Great Gatsby oscillates between Epoch and Rest with little middle ground — Fitzgerald's pendulum rhythm. Each work's transition matrix is a measurable authorial signature.

Before generating an arc, the engine walks the active matrix for N steps, producing a sequence like Growth → Lore → Climax → Rest → Growth. Each step becomes a per-scene force target. Users pick a rhythm profile derived from a published work: a story on Rowling's matrix pivots constantly between peaks, one on Orwell's sustains pressure then erupts. Whether Markov guidance beats unguided generation on composite score is a testable claim we have not yet run in controlled experiment.

Layer 2: Beat Chains (Beat → Beat)

Pacing chains control which force profile a scene hits. Within a scene, the prose itself has structure — a sequence of discrete beats, each a specific narrative function delivered through a specific mechanism. An LLM decomposes scenes into beats classified against a fixed taxonomy of 10 functions and 8 mechanisms.

The 10 beat functions describe what each section of prose does: breathe (atmosphere, grounding), inform (knowledge delivery), advance (forward momentum), bond (relationship shifts), turn (pivots and reversals), reveal (character nature exposed), shift (power dynamics invert), expand (world-building), foreshadow (plants for later fate), resolve (tension releases).

The 8 mechanisms describe how each beat is delivered as prose: dialogue, thought, action, environment, narration, memory, document, comic. A single beat function can be delivered through different mechanisms — a reveal can land through dialogue, action, or narration, each producing a different texture.

The methodology mirrors the pacing chain exactly: extract beat plans from every scene of a published work, tally consecutive function→function transitions, normalise rows, and produce a Markov matrix . Applied to Harry Potter and the Sorcerer's Stone, beat-plan extraction yielded 1,254 beats across the 73-scene novel — roughly 17 beats per scene:

Harry Potter and the Sorcerer's Stone — beat chain. 1,254 beats, 1,163 transitions, 92 unique edges.
Node size = beat frequency. Edge thickness = transition count.

The chain reveals advance as the dominant hub (329 beats, 26%) — momentum is Rowling's connective tissue. The strongest single transition is inform → advance (98x): knowledge delivery triggers action. Breathe feeds almost exclusively into inform (82x) and advance (56x) — atmosphere exists to launch the next movement. All 100 pairs appear at least once; the matrix is dense.

Other works shift the pattern. Nineteen Eighty-Four gives reveal unusual prominence — a mind trapped between inner world and outer surveillance. Gatsby leans on dialogue and narration — Fitzgerald's observer-narrator reporting. Alice is advance-dominant with minimal bonding — a protagonist propelled through episodes without deepening relationships.

Alongside the transition matrix, the analysis extracts a mechanism distribution. Harry Potter is dialogue-heavy (42% dialogue, 29% action, 16% environment, 6% thought, 5% narration) — a conversation-driven pedagogy where characters explain magic by arguing, teasing, and showing off.

Combining the Chains

Two independent chains, orthogonal axes: what happens (LLM from narrative logic), how intensely (scene-level pacing chain), and how it reads (beat-level prose chain). Both derived empirically from published works.

Reconstruction

First drafts are rough; first sessions are messy. Evaluation reads scene (or session) summaries and assigns per-scene verdicts; reconstruction creates a new versioned branch, applying verdicts in parallel — edits revise content, merges combine scenes, inserts fill gaps, moves reposition without any LLM call, cuts are omitted. World commits pass through at their original positions. The original branch is never modified. The room can replay an old session into a cleaner version of its own past without losing what actually happened.

okStructurally sound, continuity intact. Kept as-is.

editRevise content — may change POV, location, participants, deltas, and summary.

mergeAbsorbed into another scene. Both scenes' best elements combined into one denser beat.

insertNew scene generated to fill a pacing gap, missing transition, or stalled thread.

cutRedundant. Removed — the narrative is tighter without it.

moveContent correct but wrong position. Repositioned after a target scene using moveAfter. No LLM call — prose preserved exactly.

Evaluations can be guided with external feedback — from another AI, a human editor, or the author's own notes. Each reconstruction produces a versioned branch (v2, v3, v4); the loop converges in 2–3 passes. Structural branching uses git- like reference sharing so a 200-scene narrative with 10 branches stores far fewer than 2000 scene objects.

War Rooms

The War Room is the product. Give it a text corpus with enough depth across System / World / Fate (SWF), and it deploys into a playable room: a vision-rendered board where human operators and AI agents sit around the same state, take turns, and move pieces against a shared rulebook the engine arbitrates. Narrative was the validation substrate; the room is what the substrate is for. Same engine, same forces, same fork-and-commit math underneath — nothing in this section is speculative architecture. It is the product running today. ^{Perla 1990}^{Schelling 1960}

The board picks the geometry. A War Room renders the world view as a board the operators can read at a glance. The substrate underneath stays the same; the surface on top is chosen to match how the modelled world actually moves. Four spatial primitives are first-class:

·Graph — nodes and edges. The default; network reads of who knows whom, who controls what, what causes what. Suits influence campaigns, conspiracies, supply chains.
·Grid — rectangular cells. Discrete positions with adjacency that respects rows and columns. Suits city blocks, market tiers, org charts, abstracted territory.
·Hex — hexagonal cells. Equidistant neighbours and natural radial movement; the classical wargame board. Suits military campaigns, pursuit dynamics, regional contests.
·Map — continuous geographic space. Coordinates, distances, terrain. Suits real-world scenarios where actual geography is the binding constraint — logistics, geopolitics, expedition modelling.

Switching surface is re-projection, not rebuilding — the same substrate read through whichever geometry the room needs.

Information-asymmetry driven card gameplay. Strategy is about what each player knows, what each player thinks the others know, and which signals are worth sending. The War Room makes this structural. Every operator drives one or more entities; every entity keeps a private log — its hidden state, actual intent, secret information — and emits public summaries the global tracker reads. Cards are the channel between the two: every card is a signal of intent played in the open while actual intent stays private until the player chooses to disclose, leak, or never reveal. Cooperate by tabling a card that matches your private log; defect by playing one that doesn't. The cost of the gap is what reveals charge.

Phased turn-based play. Each turn cycles through ordered phases — intent, negotiation, commit, reveal, resolution — with cooperation, defection, and strategic-objective tracking layered on. The Compass exposes the current decision space: load-bearing variables and the activations they admit. Each operator is dealt a hand from their entity's available moves, commits a card against the space — face-up to signal publicly, face-down to commit privately — and during the commit-to-reveal window negotiates: disclosing cards, trading information, proposing deals, threatening defections, withholding. Empty seats fill with AI agents from configurable profiles (compete / cooperate / extract / spoil) playing the same hand against the same Compass. Reveals resolve in declared order; the decision matrix scores the round; the engine steps state forward one tick. New variables surface, old ones decay, hands redeal, the next turn opens. A deck dealt against a moving world, one step at a time.

Every commitment becomes a thread delta. Every reveal updates priors. Every negotiated agreement is a new system rule the next move either honors or breaks. Operators play the shape of the move; the prose layer renders the texture; the priors track what the move committed. The War Room is not a microsim — one card, one negotiation, one resolution per phase — so the room moves at the pace of strategic conversation, not tactical execution. AI-dealt hands surface the cohort of plays the priors say make sense right now; operators can also author their own cards — bespoke decisions, side-channel proposals, disclosures not in the dealt hand. AI keeps the game honest to the substrate; custom cards keep it honest to real intuition.

Multiple play-throughs. The room plays the future several ways — the high-mass compass scenarios first, then free-form branches operators want to test on instinct. Each play-through is a fork; the substrate keeps them all and grades each against what actually shows up in reality between this session and the next. Operators learn which of their reflexive hypotheses align with the calibrated cohort and which don't — one of the loops by which the room compounds.

Eyes on the board, eyes on the graph. Most operator time lives on one of two surfaces: the map / board state — the spatial rendering of where the world stands — or the raw graph substrate — the typed knowledge graph underneath. Everything else (cards, dialogue logs, settings) is fast plumbing. The two visual surfaces are seeing surfaces: the operator originates the which world, which seat, which move the model's prediction is for. Vision is the human contribution; the room renders it, the substrate records it.

Stakes — fictional, reality-anchored, or real. Light: ELO, leaderboards, the satisfaction of a calibrated call. Middle: forecast questions graded against the record, with prize pools. Heavy: real trades and positions recorded as commitments. Stakes turn rehearsal into skin-in-the-game rehearsal — the cost of a dishonest signal becomes structural.

Agency, not price. Prediction markets reduce every participant to a price and a position size.^{Hanson 2003}^{Wolfers & Zitzewitz 2004} A War Room asks you to play the actor that produces the outcome — take a seat, hold a private log, signal with cards, negotiate, commit to a phased move. The market gives you a number; the room gives you a role and a story you helped author — calibrated probabilities plus the causal chains that got there. Private rooms compound one operator's edge; public rooms aggregate played seats from anyone who joins (Operating Model for the breakdown).

The room is also an expert system. The substrate isn't only a stage for play — it's an authored body of knowledge the team can interrogate. The expert system is materialised as a multiple-choice question bank, derived directly from the world view's scenes and arcs (“Given this regime, what move from the regulator?” “Which faction holds leverage when the supply line is cut?”). Each question carries the substrate's calibrated answer plus a set of distractors — the plausible-but-wrong reads the substrate weighs against, which is where the nuance lives. The bank is embedded as vectors so it's searchable by meaning: when an operator asks something new, semantic retrieval over the bank surfaces the closest known questions and their answers as grounding (Expert System RAG). The bank serves three roles at once — acclimation (new operators learn the team's subjective world by working through it; each world has its own rules, the bank is how you learn them), testing (distractors expose where intuition diverges from the calibrated priors; expert disagreement opens new threads), and answering (direct questions get RAG-grounded responses from accumulated decisions). The bigger the bank, the more real decisions covered, the stronger the expert system. The room is therefore three things at once — a role-play simulator for rehearsing the future, a strategy table for deciding on it, and a living expert system the team owns and grows.

The headline, plainly: once the SWF priors are deep enough, a playable War Room is what the engine produces. Prime the substrate across System, World, and Fate. Pick the spatial type the world wants. Deal the cards. Begin.

The Loop

The War Room is the heaviest mode of engagement, not the only one. Five surfaces share one substrate; the value of the product is the loop between them.

·Update priors. Articles, transcripts, observations drop into the queue between sessions. The opening walks it; what shifts a prior enters the substrate. Automated feeds are optional aids; the human curates.
·Play the future. The full War Room — phased turns, cards, multi-seat negotiation, AI agents filling empty seats. Heaviest mode; richest source of new deltas.
·Generate forward. The engine extends the substrate without a session — scenario cohorts, reasoning graphs, branch continuations. Solo prep before a meeting, or an operator exploring a fork on their own.
·Answer the question bank. Every committed play generates calibrated questions with distractors that encode the plausible-but-wrong reads. Onboarding from session memory, not a deck of slides.
·Query the expert system. The substrate is searchable end to end — every committed fact, every entity's recorded knowledge, every thread's trajectory, every prior resolution. Lightest cadence in the loop: any operator, any time.

You don't need to play the war games to benefit. The question bank and the query surface drive value on their own. A solo operator can run scenarios without convening anyone. The team-weekly War Room is the deepest source of new deltas; it isn't the only entry point.

What makes it compound. Each mode feeds the others. A query surfaces a coverage gap that goes into the queue. A queue update reweights the question bank. A well-answered question becomes a prior the next play uses. A play generates new questions, new scenarios, new entries the query surface pulls from tomorrow. Whichever surface the operator engages, the work compounds.

Practice

The cadence that produces the richest substrate is the team that institutes the War Room weekly or monthly. The lighter surfaces (question bank, expert-system query) live in The Loop and need none of the discipline below. What lives here is the heavy ritual: meeting regularly to build reflexes for adversarial moves, calibrated priors on which futures arrive, doctrine that survives the day a surprise lands, and the shared expert system every member contributes to.

The expert system requires training, and the training is the practice. Expert systems are not one-time deliverables — they decay the moment the world stops being updated. Every session is a training pass: the team walks the queue, plays the future forward, scores the round, refines the question bank, disagrees with the distractors. Stop playing for a quarter and the priors go stale. The maintenance loop is what keeps the expert system honest to a world that is itself still moving.

Weekly War Rooms — for what moves fast (markets, current ops, competitive intel, a campaign in flight, a live policy file). One to two hours. Each operator walks in with the week's fresh signal; the opening integrates it; the rest plays the implications forward.

Monthly War Rooms — for what moves slow (doctrine, portfolio composition, organisational design, multi-year political bets). Two to four hours. Players come with accumulated reading; the room rehearses strategic shape — not the next move but the kind of move the operator reaches for under pressure six months from now.

Who it's for. Five honest use cases:

·Personal simulation. A career pivot, relocation, treatment plan. The solo operator takes their own seat plus the seats of consequential others (employer, market, family, regulator) and plays the move forward.
·Investment. A portfolio committee plays positions against management response, competitor hunts, and macro regime. Calibrated priors on capital-structure stress and exit-path optionality become artefacts of the room.
·Politics. A campaign team plays the opposition's, regulator's, and media's next moves. The rehearsed campaign moves faster the morning of the leak than the unrehearsed one meeting it for the first time.
·Strategy. Executives convene with the seats of competitors, regulators, customers, channel partners staffed by operators or AI. Every committed move updates the substrate; the decision becomes a prior the next room inherits.
·Board game dynamics. A tabletop group uses the room as a high-fidelity grand-strategy sandbox. Information asymmetry, faction-internal politics, diplomacy, multi-turn doctrine all native.
·Knowledge transfer & expert systems. The team's expert system is materialised as a multiple-choice question bank generated from the substrate's scenes and arcs — embedded for semantic search so new questions can be answered by retrieving the closest known ones. New hires acclimate to the team's subjective world by working through the bank (each domain has its own rules; the bank is how you learn them); senior experts refine it by disagreeing with the distractors; the substrate keeps the disagreements. The bigger the bank, the stronger the expert system.

What practising buys. Reduced detail on purpose — the substrate is a working model, not an archive. Curation, not capture. The fiftieth weekly War Room is qualitatively different from the tenth: fifty cycles of rehearsal, fifty curated drops, fifty graded forecasts, hundreds of questions answered, and the expert system that emerges from them. The team that practises earns the morning the surprise lands.

Architecture

The specific tools below are the current stack; component names will change as the ecosystem moves. The architectural read is what to take forward: local-first, LLM-as-gateway, host-runs- the-room.

The substrate ships as a single Next.js application with React 19 on top, Tailwind for surface, and D3 for the map / board / graph views that own most of an operator's screen. The backend is essentially a pass-through to an LLM gateway; everything that compounds lives on the operator's machine.

·Next.js 16 + React 19 — app shell, App Router, the few server endpoints that need one (image generation, LLM calls).
·Tailwind v4 + D3.js — visual language and the two seeing surfaces: the spatial board, the typed knowledge graph.
·IndexedDB — local persistence for everything that compounds. No backend database; nothing leaves the browser unless the operator chooses.
·LLM gateway (OpenRouter) — routes to the cheapest model that meets each pipeline stage's bar; current split is DeepSeek for generation, Gemini Flash for planning and analysis.
·OpenAI embeddings + Replicate — 1536-dim vectors over every scene, beat, and proposition for semantic search; Seedream 4.5 for board art and covers.

Local-first is a feature, not an inconvenience. A private War Room is the operator's substrate on the operator's hardware — no vendor with read access to confidential plays, no per-user database on Meridians's side. The operator's machine is the room's server. This is what makes the privacy moat real and what keeps per-private-room infrastructure cost at effectively zero.

One host, many seats. A local host doesn't need to ship the room to the cloud to invite players. The host runs the app; an ngrok tunnel exposes the local Next.js server through a temporary public URL; guests scan a QR code on the main display and join from their phones, no install required. The host owns the substrate; phones become controllers. The main display (a laptop, a TV in the room, a shared screen on a video call) carries the map / board / graph — the seeing surfaces every operator watches together. Each phone is a private hand: cards to play, the private log, side-channel disclosures, the disclose / leak / hold decision. Combined with screen sharing in whatever meeting tool the room already has open, remote players see the same board the local players see and play through their phones the same way. The architectural payoff is that a War Room can be impromptu — a host fires up the app, drops a QR code in the chat, the table is dealt.

Electron for the coherent install. The browser experience is the daily driver, but a War Room instituted as a habit deserves to feel like an app, not a tab. An Electron bundle wraps the same Next.js build into a single binary the operator launches like any desktop application — keyboard shortcuts behave, the IndexedDB substrate persists in a known location, the tunnel-and-QR surface is one menu item away. Same code; the surface around it just stops being a tab among many.

Scrapers for the queue. The queue lives or dies on what fresh signal arrives. Operators curate by hand today; the architectural extension is a layer of scrapers aggregating from sources the room cares about — markets, news, sector trackers, regulatory filings, relevant literature. Scrapers don't bypass the human queue; they pipe candidates into it. Scope is still being worked out (which sources, what cadence, how aggressively to deduplicate); direction is clear, shape isn't committed.

Operating Model

The moat is client-owned compounding judgement no vendor can ship from cold. A team running weekly War Rooms accumulates a working model of its own strategic position — priors, calibrated reads, an embedded question bank with the distractors that encode the plausible-but-wrong moves the team has seen and named. That artefact lives in the substrate's history, on the client's machine, authored by the people who live inside the problem. A foundation-model vendor cannot ship it; a competing tool cannot port it; we cannot rebuild it for a different client. It is the one thing in the product that compounds, and it compounds for the team that built it. Everything else — the engine, the prompts, the math, even the facilitation playbook — is commoditisable on a long enough horizon. This is.

The unit is simulation-as-subscription. A client buys a substrate that grows sharper across sessions, a vocabulary for thinking adversarially, and a facilitation layer that keeps the meeting on the calendar — designed to fade out as the client takes over running its own rooms.

Two surfaces, sequenced. Private rooms ship first — closed tables on the local data model, the surface we have conviction in and revenue against. Public rooms are the phase-two ambition: community games on substrates of broad interest, maintained by admins or trusted curators. Whether the public surface lands at scale is the open question we are not pretending to have answered; the base case in Economics treats it as zero.

Time-to-value is honest. Standing up a defensible substrate on a new domain takes around 2–3 months — corpus ingestion, prior calibration, scenario surfacing, facilitator briefing, and the first half-dozen calibration sessions. The early weeks are a build-out, not a meeting; the pilot motion exists for exactly this reason. The product isn't turnkey, and we don't pretend otherwise.

Versus a foundation-model vendor. A vendor will ship a “Strategy Mode” chat in a quarter. What it can't ship cold is a room with history — priors compounded across months of sessions, a private substrate the operator owns, and force-measured scoring instead of fluent guesses. Foundation models commoditise generation; they don't commoditise compounded practice on client-owned state.

Versus a lighter alternative. The real competitor is not a frontier model; it is a stripped tool that delivers 70% of the value for 10% of the effort and never asks the operator to maintain a substrate. The behavioural moat is fragile and we know it. Our answer is to make the maintenance loop deliver something the lighter tool literally cannot — the team's own expert system, queryable by anyone they hire next year — and to keep proving it on real engagements before the lighter tool ships its v2.

The bet is the practice. The shape: private subscription for the working surface + value-add layers on top + free public distribution where it earns its way. Numbers follow.

Economics

Numbers below are today's figures — LLM costs and pricing intent will move. The shape of the model (private subscription covers a session that costs cents; public is amortised across a cohort; pilot is the wedge) is what the section is about.

LLM cost is cheap; loaded margin is the honest number. A session costs ~$0.30–$0.50 in LLM on the current model split; a weekly cadence is $1.50–$2.00 per private room per month. That is the easy part. Once customer success, facilitation, and amortised CAC are folded in, contribution margin on the facilitated tiers sits in the 35–60% band — respectable, but not the software-margin daydream an LLM-only view produces. The architecture is high-margin; the early go-to-market is services-shaped. We are not pretending otherwise.

Client-led is the structural ceiling. The fastest path to true software margin is for the organisation to graduate to running its own sessions. When the client supplies the facilitator, the heaviest line in COGS disappears and loaded margin steps up sharply — ~75%+ on Team and Pro, comparable on Pilot once the priming work is done. The product pitch to a serious operator is to acclimate inside the first 90 days and run rooms internally thereafter; we keep answering hard questions but stop being a services line item.

Public cost is hypothetical. A cohort of 1,000 would amortise LLM cost to ~$0.01 per player per season, with revenue from pro subs, opt-in betting, sponsorships, or media. The architecture allows it; the cohort does not exist yet. We are not banking on this.

Private — subscription-as-substrate

margin = LLM + CS + facilitation + amortised CAC

Tier	Price	Loaded margin		Role
Tier	Price	w/ our facilitation	client-led	Role
Solo	$19 / mo	30–40%	30–40%	Funnel into Team. Near-zero net revenue, not the bet.
Team	$99 / mo	50–60%	75–85%	Core paid surface. Committees, cells, leadership teams.
Pro	$499 / mo	35–45%	70–80%	Software + optional senior facilitation.
Pilot	$80–120K / 6mo	40–55%	65–75%	Wedge motion. Defence, hedge, political-research.

Public — free-to-play distribution + value-add layers

LLM amortised across cohort · base game free

Stream	Cost / unit	Revenue / unit	Detail
Free Tier (base)	~$0.01 / season	$0 (distribution)	Hosted substrate · LLM amortised across cohort · the funnel
Pro Subscription	~$0.10	$9.99 – $19.99	Analytics, ELO history, private clones, ad-free, priority play
Betting Markets	~$0.05 (payment + KYC)	~$2 – $10 (3–5% rake on book)	Opt-in real-money markets attached to specific plays · jurisdiction-gated
Media / Sponsorship	negligible	variable	Brand partnerships, league licensing, premium content tiers

Loaded margin folds LLM cost, customer success, facilitation hours, and amortised CAC — not the 90%+ figure an LLM-only view produces. The client-led column is the structural ceiling: when the organisation runs its own facilitation, the service line drops out of COGS and margins approach software-shape. The pitch to a serious operator is to graduate there. Solo at $19 is a funnel, not a revenue line; the base case rests on Team subs and pilot ACV. Public economics stay separate — the cohort doesn't exist yet.

Wedge. A 6-month pilot partnership with a single defence contractor, hedge fund, or political-research shop — targeted ACV in the ~$80–120K range, structured as priors-accumulation against a real question. Services-heavy on day one and we treat it that way: the pilot funds itself, produces the case study the next ten sales require, and seeds the substrate the client owns afterwards. In parallel, Team subs ($99/mo) at boutique investment committees, family offices, campaign cells, M&A teams, policy units — the lower-touch surface that builds the recurring line.

Scale math — scenarios. Bear ~$500K — one pilot, no compounding, Team adoption stalls. Base ~$3.5M — ~200 Team subs + two to three running pilots with at least one renewing into a multi-room expansion. Solo is funnel-only in this base, not a revenue line. Base alone is venture-defensible. Upside beyond this — public layer at scale, a betting vertical — we keep separate so it doesn't muddy the near-term plan.

Early evidence we owe. Two milestones collapse the most uncertainty in the model. Week-12 retention on heavy-mode non-founder users — whether weekly War Rooms hold past the novelty window for operators who aren't us. One demonstrable expert-system answer from an accumulated substrate — a real query, on a real client's priors, returning a calibrated read no foundation-model chat could produce from cold. Until both exist, the strongest part of the pitch is theoretical; once they exist, the moat narrative is concrete.

Honest risks. Seven we are watching, in rough order of worry. Adoption friction — weekly rituals are hard to sustain; week-4 / 8 / 12 retention is the leading indicator. Service margin disguised as software margin — facilitation, customer success, and CAC keep loaded margin in the 35–60% band on facilitated tiers until clients self-facilitate. Behavioural moat fragility — the real competitor is not a foundation model; it is a lighter tool that delivers 70% of the value for 10% of the effort and never asks the operator to maintain a substrate. Foundation-model encroachment — priors and client-owned local data are the answer, but they have to keep being earned every quarter. Pilot doesn't generalise — one case study at a hedge fund may not port to defence or political research without a second priming motion. Solo as standalone — weekly-commitment products at therapy prices churn; Solo only works as a funnel, never as a revenue centre. Public-game cold start — the hardest one; a free game with fifty players isn't a community.

Investability fork. The base case justifies a modest round on private rooms alone — a services-led SaaS thesis where the moat is client-owned compounding judgement no vendor can ship from cold. The larger round implies the public layer lands, which is the bet that doesn't need to be made yet. We are happy to take either, and we would rather under-promise the public layer than over-promise it.

Coda

What Meridians does is convene a room to play the future before it arrives, on a measured substrate that learns from every committed move. Cards signal intent in public. Private logs hold actual intent. The substrate keeps the ledger. The next room inherits the priors. Any coherent text can seed the first session — a history, a paper, a novel, a doctrine, a market brief. After that, the room authors its own world.

Three things we believe. Good strategists rehearse the future, and game-like environments are how that rehearsal compounds. The pattern is visible in every serious military, hedge fund, and campaign that already runs the practice unstructured; the War Room is the structured version. Vision is the human edge over AI. Models scale prediction; they do not originate the act of seeing a future and choosing to play toward it. The map and graph surfaces are visual because that is where the rehearsal happens. Private rooms are the product; public rooms are the open question. Private is what we are selling and what we have conviction in. Public is what private credibility earns the right to attempt.

Convene the room. Practise the future. Earn the morning the surprise lands.

Bibliography

Inline citations follow author-year style. References are listed below in plain author-title-venue form.

Berlin, I. (1953). The Hedgehog and the Fox: An Essay on Tolstoy's View of History. Weidenfeld & Nicolson (Princeton reissue 2013).

Tetlock, P. E. (2005). Expert Political Judgment: How Good Is It? How Can We Know?. Princeton University Press.

Tetlock, P. E., & Gardner, D. (2015). Superforecasting: The Art and Science of Prediction. Crown.

Kahneman, D., Lovallo, D., & Sibony, O. (2011). Before you make that big decision. Harvard Business Review, 89(6), 50–60.

Lovallo, D., & Kahneman, D. (2003). Delusions of success: How optimism undermines executives' decisions. Harvard Business Review, 81(7), 56–63.

Hanson, R. (2003). Combinatorial information market design. Information Systems Frontiers, 5(1), 107–119.

Wolfers, J., & Zitzewitz, E. (2004). Prediction markets. Journal of Economic Perspectives, 18(2), 107–126.

Polanyi, M. (1966). The Tacit Dimension. University of Chicago Press (reissue 2009).

Goodhart, C. A. E. (1975). Problems of monetary management: The U.K. experience. In Papers in Monetary Economics, Vol. I. Reserve Bank of Australia.

Strathern, M. (1997). 'Improving ratings': Audit in the British University system. European Review, 5(3), 305–321.

Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78(1), 1–3.

Gneiting, T., & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477), 359–378.

Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22(1), 79–86.

Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory (2nd ed.). Wiley.

Liu, N. F., Lin, K., Hewitt, J., Paranjape, A., Bevilacqua, M., Petroni, F., & Liang, P. (2024). Lost in the middle: How language models use long contexts. Transactions of the Association for Computational Linguistics, 12, 157–173.

OpenAI (2024). New embedding models and API updates. OpenAI Blog, January 25, 2024.

Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using Siamese BERT-networks. Proceedings of EMNLP 2019.

Reagan, A. J., Mitchell, L., Kiley, D., Danforth, C. M., & Dodds, P. S. (2016). The emotional arcs of stories are dominated by six basic shapes. EPJ Data Science, 5(1), 31.

Boyd, R. L., Blackburn, K. G., & Pennebaker, J. W. (2020). The narrative arc: Revealing core narrative structures through text analysis. Science Advances, 6(32), eaba2196.

Peirce, C. S. (1903 / 1998). Pragmatism as the Logic of Abduction. In The Essential Peirce: Selected Philosophical Writings, Vol. 2 (1893–1913), ed. Peirce Edition Project. Indiana University Press.

Guilford, J. P. (1967). The Nature of Human Intelligence. McGraw-Hill.

Norris, J. R. (1998). Markov Chains. Cambridge University Press.

Elo, A. E. (1978). The Rating of Chessplayers, Past and Present. Arco Publishing.

Glickman, M. E. (1999). Parameter estimation in large dynamic paired comparison experiments. Journal of the Royal Statistical Society: Series C (Applied Statistics), 48(3), 377–394.

Perla, P. P. (1990). The Art of Wargaming: A Guide for Professionals and Hobbyists. Naval Institute Press (reissued 2012, History of Wargaming Project).

Schelling, T. C. (1960). The Strategy of Conflict. Harvard University Press.

Hogan, A., Blomqvist, E., Cochez, M., d'Amato, C., de Melo, G., Gutierrez, C., et al. (2021). Knowledge graphs. ACM Computing Surveys, 54(4), 1–37.