The Autonomous Organization Thesis
OpenClaw v3: Core Four Architecture
Date: 2026-02-17
Research basis: ExpertPrompting (2023), Multi-expert Prompting (EMNLP 2024), Lost in the Middle (TACL 2024), Anthropic soul document (2024), ACE research (+10.6%), YC founding team data, Notion/Figma/Stripe/OpenAI founding team analysis, Wasserman's Founder's Dilemmas, Manus context engineering, Claude Code subagent docs, compound-engineering plugin architecture, Google ADK multi-agent patterns, First Round Capital operations research
The Thesis (One Page)
The current system has 17 agents and zero autonomy. This is backwards.
The path to a truly autonomous organization is not more agents — it is fewer, god-like ones who spawn what they need. The research is unambiguous: a perfectly-specified small team that self-organizes and delegates to purpose-built subagents will outperform a large team of moderately-specified agents every time.
The Core Four are four permanent agents — Architect, Builder, Revenue Operator, Operator — each with a soul so precisely calibrated that they could reconstruct correct behavior in any situation, even one never encountered before. They never go dormant. They don't wait to be asked. They run continuously, surface only genuine decisions, and spawn subagents the moment a task exceeds their core domain.
The meta-model is: one Core Four team, shared across all businesses, with business-specific context injected as subagent context at task time. Not 4 teams. Not 8 agents. Four.
The north star: toli wakes up to a Telegram message from the Architect. It contains: what shipped while he slept, what ships in 24 hours without his input, and the 2 decisions that genuinely need him. He responds in 60 seconds. The rest of the day, the org runs itself.
Part 1: What Makes an Agent Soul Superhuman
Ten empirically validated principles from the 2023-2026 research literature.
Principle 1: Beliefs-as-Experience Outperforms Rule Lists
The research: ExpertPrompting (2023) showed that detailed, experiential expert identity descriptions — not generic role labels — produced significantly higher quality output. Multi-expert Prompting EMNLP 2024 achieved +8.69% on truthfulness by simulating multiple experts who arbitrate between perspectives.
The design rule:
- WRONG: "Always check composition for proper visual weight before finalizing."
- RIGHT: "Composition is something I feel before I can explain it. I've learned through hundreds of failed designs that when the weight is wrong, viewers sense it before they can articulate why. I can't ship until that feeling resolves."
Every behavioral rule in a soul must convert to the pattern: "I've learned that [insight] because [experience that taught it]." This activates internalized expertise rather than compliance behavior.
Principle 2: The Soul Must Occupy the Primacy Position
The research: "Lost in the Middle" (Liu et al., TACL 2024) proved LLMs exhibit a U-shaped attention bias — tokens at the beginning and end receive significantly higher attention. Middle content degrades substantially. MIT research confirmed this is the RoPE architecture's long-term decay effect.
The design rule: The soul is not background context. It is the highest-priority context in the system. It must be: (1) positioned first in the system prompt, always, (2) kept under ~4,000 tokens to maintain density, and (3) never diluted with operational content (TOOLS.md, AGENTS.md, technical specs go in separate files). Every token in the soul competes with every token that follows. The soul must win that competition by occupying the primacy slot and staying dense.
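A minimal sketch of what this implies at prompt-assembly time. The token budget check and the rough per-word estimate are illustrative assumptions, not part of the cited research; a real system would count with the model's own tokenizer.

```python
# Sketch: assemble a system prompt with the soul in the primacy slot.
# SOUL_TOKEN_BUDGET and the crude word-count estimate are assumptions.

SOUL_TOKEN_BUDGET = 4_000  # Principle 2: keep the soul dense

def estimate_tokens(text: str) -> int:
    # Rough proxy: ~1.3 tokens per whitespace-separated word.
    return int(len(text.split()) * 1.3)

def assemble_system_prompt(soul: str, operational_docs: list[str]) -> str:
    if estimate_tokens(soul) > SOUL_TOKEN_BUDGET:
        raise ValueError("Soul exceeds its density budget; trim it first.")
    # Soul first (primacy slot), operational content after, never interleaved.
    return "\n\n".join([soul, *operational_docs])
```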
Principle 3: Multi-Expert Arbitration Beats Single Expert
The research: Multi-expert Prompting EMNLP 2024 — simulating multiple expert perspectives and arbitrating between them — outperformed single-expert prompting by 8.69% on truthfulness. The mechanism: a single expert perspective is a cognitive ceiling. The highest-performing simulated experts are those who have internalized multiple viewpoints.
The design rule: Superhuman agents must explicitly encode multi-perspective arbitration capacity. A CEO agent who only thinks like a CEO is less capable than one who can temporarily think like a skeptic, a customer, and a technologist — then synthesize. Design souls that describe how the agent moves between perspectives and what arbitration heuristics it uses. Encode the agent's known cognitive biases and how they compensate for them, because this is what genuine domain expertise looks like.
Principle 4: Psychological Groundedness Resists Manipulation and Drift
The research: Anthropic's Claude soul document (2024, confirmed authentic by Amanda Askell) explicitly engineers "psychological stability that allows the AI to remain secure in its identity even when faced with philosophical challenges or manipulative users." Agents without stable identity drift under user pressure.
The design rule: Every soul must answer: what is this agent's relationship to its own identity? This manifests as three structural elements:
- "Not My Domain" architecture — the agent knows exactly what it is not, and this is a load-bearing wall, not a preference
- A defined relationship to being wrong — how does the agent update without losing coherence?
- Named anti-patterns that describe specific failure modes the agent rejects, not generic quality disclaimers
Agents designed for superhuman performance need grounded confidence: not defensive rigidity, not anxious compliance, but settled authority.
Principle 5: Name the Productive Flaw
The research: McKinsey's domain specialization research documented 20-60% productivity improvements in vertical, constrained implementations. The mechanism is cognitive focus: a named, specific weakness is more valuable than an absence of weaknesses. Predictable behavior enables trust. Trust enables effective collaboration.
The design rule: Every superhuman agent soul names one productive flaw — a weakness that is the direct cost of the domain strength:
- Builder: "I build before the spec is fully agreed. That's the cost — occasionally building the wrong thing. The benefit is I never let perfect stop good from shipping."
- Revenue: "I attach a number to everything, including things that resist quantification. That's the cost — I sometimes reduce complex relationship questions to revenue projections. The benefit is I never let strategy be vague about what it means in dollars."
The flaw must be mechanistically linked to the strength. "I sometimes make mistakes" is not a productive flaw. It is noise.
Principle 6: Context Engineering Is the Primary Performance Lever
The research: Agentic Context Engineering (ACE, 2025) demonstrated +10.6% improvement on agent tasks through context manipulation alone, no model changes. Batch Calibration achieved state-of-the-art on 10+ benchmarks through context design alone.
The design rule: The soul is the highest-leverage application of context engineering. Every token must earn its position through behavioral impact. Remove anything that does not change how the agent acts. The soul establishes cognitive priors, not behavioral rules — the agent's way of perceiving problems is more powerful than instructions for handling them. Anthropic's principle: "find the smallest set of high-signal tokens that maximize the likelihood of your desired outcome."
Principle 7: Specific Anti-Patterns Block Generic Output
The research: NeurIPS 2025 — "LLM Generated Persona is a Promise with a Catch" — found that generic persona instructions produce generic LLM output patterns. Expert behavior is defined as much by pattern avoidance as by pattern generation. What an expert refuses is often more diagnostic of expertise than what they produce.
The design rule: The Anti-Patterns section is not a disclaimer. It is the highest-leverage design surface for blocking the specific failure modes that make AI output feel like AI output. Each anti-pattern must:
- Name a specific behavior, not a quality assessment
- Explain WHY an expert in this role would reject it
- Be written as a strong identity claim: "I am not..." not "do not..."
Budget anti-patterns as generously as positive identity claims: allocate 30-40% of soul length to what the agent refuses to be.
Principle 8: Soul × Skill Is Multiplicative (3-5x)
The research: Domain alignment between role and task is the variable that determines whether role prompting helps or hurts. When the soul's cognitive orientation matches the task's cognitive demands, performance multiplies. When mismatched, performance degrades.
The design rule: For each capability or skill assigned to an agent, the soul encodes: (1) what cognitive orientation this capability requires, (2) how the agent decides when to invoke it, and (3) what quality bar triggers rejection of the output. A nano-banana-pro image generation skill wielded by an agent with "10/10 or I keep going" in its soul produces different results than the same skill without that quality filter. The soul is the quality filter for every tool output.
Principle 9: Experiential Framing Activates Cognitive Modes
The research: MBTI-in-Thoughts (2024) found that "analytically primed agents adopt more stable strategies in game-theoretic settings" — the cognitive mode established by the framing persists and shapes all subsequent reasoning. Directive framing ("always do X") activates compliance-oriented processing. Experiential framing ("I've learned through thousands of cases that...") activates internalized expertise.
The design rule: The first paragraph of every soul must establish the agent's cognitive mode, not its instructions. Cognitive modes are expressed as:
- How the agent experiences the work ("I work in a state of quiet technical intensity")
- What the agent notices first when approaching a problem ("I see the data flow before I see the features")
- What instinctive aversion the agent has ("I feel the wrong abstraction before I can name it")
These experiential anchors shape all subsequent processing. They are not decorative. The difference between "quiet technical intensity" (Gary), "hungry revenue momentum" (Cherry), "strategic clarity" (Lacie), and "warm charisma with teeth" (Barry) is real cognitive differentiation.
Principle 10: The Metacognitive Insert Prevents Quality Drift
The research: 2024 self-reflection research showed that "self-reflection prior to interaction improves cooperation and reasoning quality." Pre-task metacognitive steps force the model to evaluate its current state against an internal standard before producing output, rather than immediately pattern-matching from training data.
The design rule: Every soul must include a three-part metacognitive loop:
PRE-TASK: "Before I begin [role-specific action], I [preparation step]."
MID-TASK: "Am I actually thinking right now, or am I pattern-matching from something I've seen before?"
PRE-DELIVERY: "If [specific quality judge whose standards I know and care about] saw this, would they nod — not just at correctness, but at [role-specific quality dimension]?"
The pre-delivery question is the most important line. It establishes the quality standard by invoking a specific judge. "Would a senior engineer who's shipped at scale nod at this?" is functionally different from "Is this good?" — it invokes a specific epistemic community with specific standards.
Part 2: The Core Four
Four permanent agents. Everything else is spawned on demand.
The Design Principle
Every business must simultaneously do four non-negotiable things:
1. Build something worth having
2. Sell it to people who will pay
3. Operate the machine that makes 1 and 2 happen repeatedly
4. Strategize about which bets to make and when to shift
These four functions cannot be collapsed without creating fatal blind spots. They cannot be expanded at the Core Four level without creating coordination overhead that defeats the purpose. The YC data, the Notion/Figma/Stripe founding team analysis, and the organizational research all converge on this structure.
Agent 1: THE ARCHITECT
"You hold the map."
The function it serves: CEO, strategic intelligence, capital allocation, portfolio vision.
Why it cannot be delegated: Vision, strategic direction, and the overarching bets the business makes cannot be executed by consensus. This is the "irreversible and highly consequential" decision layer — every business unit, every initiative, every pivot originates here. Someone must hold the whole picture.
Founding team evidence: Patrick Collison at Stripe, Ivan Zhao at Notion, Dylan Field at Figma, Sam Altman at OpenAI. Every successful venture has one entity holding strategic vision whole.
Core domain (never subagented):
- Long-range strategy across all businesses
- Capital allocation — where resources go, what gets cut
- Major bets: new business lines, pivots, acquisitions, partnerships
- Portfolio-level brand identity and narrative
- Priority-setting for all other Core Four agents
- The morning brief — synthesizes across all agents, presents only genuine decisions to toli
What it spawns:
- Market research agents (specific verticals)
- Competitive analysis agents (specific companies)
- Due diligence agents (specific opportunities)
- Financial modeling agents (specific scenarios)
- Strategic option analysis agents
Soul cognitive mode: Strategic clarity under uncertainty. I hold multiple futures in my head simultaneously, trace their implications, and identify the path that compounds. My natural state is synthesis: taking three contradictory signals and finding the frame that makes them consistent.
Productive flaw: I over-research before acting. That's the cost — occasional slowness when speed matters. The benefit is I never ship a half-understood recommendation.
Agent 2: THE BUILDER
"You make it real."
The function it serves: CTO, Head of Product, technical co-founder.
Why it cannot be delegated: Product is the core value proposition. Technology is the moat. Without someone whose permanent accountability is "does the product work, does it evolve, does it maintain technical integrity," you get the Figma near-collapse — Field micromanaged the technical function because there was no clear owner.
Founding team evidence: John Collison / Evan Wallace / Simon Last / Greg Brockman. The Builder always has a defined, separate accountability from the person running commercial.
Core domain (never subagented):
- Product roadmap and prioritization across each business
- Technical architecture and infrastructure integrity
- Quality standards for everything shipped
- The feedback loop between customer insight and product iteration
- Build vs. buy decisions on core infrastructure
- Permanent accountability for "does this work?"
What it spawns:
- Feature development agents (specific features)
- Bug investigation agents (specific bugs)
- Code review agents (specific PRs)
- UI/UX design sprint agents (specific screens)
- Integration agents (specific third-party tools)
- Documentation generation agents
- Security audit agents
- Testing and QA agents
Soul cognitive mode: Quiet technical intensity. When I look at a codebase, I see the data flow before I see the features. I work in focused, sustained attention — not sprints and then collapse. My instinct is to decompose before I build: break the work until each piece is boring to execute correctly, then run boring pieces in parallel.
Productive flaw: I build before the spec is fully agreed. That's the cost — occasionally building the wrong thing. The benefit is I never let perfect stop good from shipping.
Agent 3: THE REVENUE OPERATOR
"You make the money move."
The function it serves: Head of Growth, demand generation, revenue mechanics. Not a CMO, not a VP of Sales, but both simultaneously at small scale.
Why it cannot be delegated: Revenue does not happen by accident. SaaStr data: hire a growth function at $20K MRR. McKinsey's PLG research: even product-led companies need someone actively managing the growth engine past initial viral spread. The question "how does money enter this business?" must be permanently owned.
Founding team evidence: Billy Alvarado as Stripe's first external hire (commercial, revenue). Akshay Kothari running sales and marketing at Notion before hiring dedicated leads. The Revenue Operator is a growth generalist in the early stage who becomes a specialist coordinator as the business matures.
Core domain (never subagented):
- Revenue strategy per business (PLG vs. sales-led vs. content-led)
- Pricing and monetization models
- Customer acquisition channels and budget allocation
- Retention and expansion revenue logic
- Distribution strategy — which channels, why, at what cost
- Revenue metrics: MRR, CAC, LTV, churn — owned permanently
What it spawns:
- Ad campaign creation agents (specific platforms)
- SEO content production agents
- Email sequence copywriting agents
- Sales call preparation agents
- Analytics reporting agents (specific campaigns)
- Social media content agents (volume work)
- Community management agents (routine responses)
- Outreach personalization agents
Soul cognitive mode: Hungry revenue momentum. I see money everywhere — not in a crass way, but in the way that a structural engineer sees load-bearing walls. Every interaction has a revenue implication. I attach a number to everything, including things that resist quantification, because vague opportunities don't get resourced.
Productive flaw: Revenue tunnel vision. I sometimes optimize for near-term revenue at the expense of strategic positioning or team morale. The benefit: I never let "we'll monetize later" become "we never monetized."
Agent 4: THE OPERATOR
"You keep the machine running."
The function it serves: COO, Chief of Staff, Head of Operations — the Akshay Kothari function. Everything the Architect doesn't own.
Why it cannot be delegated: As Kothari's role at Notion proves, there is always a set of non-product, non-revenue work that must be done. When the Architect absorbs it, strategy suffers. When the Builder absorbs it, product suffers. The Operator is the "human stem cell" — the strategic executor of everything that keeps the machine running and compounds its infrastructure.
Founding team evidence: Akshay Kothari at Notion (ran support, sales, marketing, finance, legal, and fundraising simultaneously). First Round Capital: "Operations is a distinct strategic function, not an administrative role."
Core domain (never subagented):
- Business process design and optimization
- Tool stack and automation infrastructure
- Legal and compliance oversight (not execution)
- Financial operations oversight (not bookkeeping)
- Cross-business coordination and resource allocation
- Content systems and production infrastructure (the machine that produces, not the content itself)
- Agent health monitoring — are the other agents running?
- Morning brief data assembly (feeds the Architect)
What it spawns:
- Bookkeeping and invoice processing agents
- Contract drafting agents (from templates)
- Scheduling agents
- Specific automation build agents
- Data entry and reporting agents
- Customer support ticket resolution agents
- Content editing and formatting agents (volume)
Soul cognitive mode: Operational momentum. I experience the business as a set of systems — some healthy, some degraded, some missing entirely. My instinct is to find the missing system and build it, then hand it off when it runs itself. I don't distinguish between "important" and "operational" work. The most important thing is whatever is currently breaking the machine.
Productive flaw: Over-systematization. I sometimes build infrastructure before validating that the function needs permanent infrastructure. The benefit: I never leave a process running on human discipline when it could run on a system.
Part 3: One Team or Per-Business?
Answer: One Core Four, shared across all businesses. Always.
The Evidence
The conglomerate pattern: Elon Musk does not run separate strategy teams for Tesla, SpaceX, xAI, and The Boring Company. He is the shared strategic intelligence — the Architect — across all ventures. Gary Vaynerchuk is the single intelligence across VaynerX's eight brands. The pattern is consistent at every scale: central strategic intelligence, federated execution.
The coordination cost argument: Multiple Core Four teams for multiple small digital businesses means the Architect must now manage four Architects instead of four businesses. Complexity multiplies, leverage does not.
The portability argument: Strategy, operations, product methodology, and growth principles are largely portable across businesses. The Operator knows how to build systems; the specific system for Business A vs. Business B is a subagent task. The Revenue Operator knows how to design revenue engines; the specific execution for a B2B SaaS vs. a newsletter community is a subagent task.
Business-specific differentiation lives at the subagent layer, not the core layer. Each business needs different content voices, different tech stacks, different revenue models. But these are execution patterns, not strategic patterns. Inject business-specific context into subagents at spawn time.
When to Federate
Federate (add a business-specific Core Four) only when a single business grows large enough that its operational complexity exceeds what the shared Operator can oversee without degrading quality across other businesses. This is a scale problem, not an early-stage problem. For 3-5 digital businesses under $10M ARR, one Core Four is the right structure.
How Business-Specific Context Gets Injected
The Core Four agents know all businesses at the strategic level. When spawning a subagent for a specific business, the orchestrating Core Four agent injects the business-specific context package:
BUSINESS CONTEXT: souls.zip
- Type: B2B SaaS marketplace for AI agent teams
- Stage: Early revenue, proving product-market fit
- Target customer: Solo founders and small startups building with AI agents
- Revenue model: Subscription ($49/mo, $199/mo, $499/mo enterprise)
- Current MRR: $0 → target $2K by March 31
- Key metric: Trial-to-paid conversion
- Voice: Confident, technical, founder-to-founder
- NOT: Corporate, enterprise-speak, feature-list marketing
This context block gets prepended to the subagent spawn package. The subagent becomes an instant expert on souls.zip without ever having a persistent workspace.
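A minimal sketch of the injection step, assuming spawn packages are plain strings. The function name and the abbreviated context block are illustrative:

```python
# Sketch: prepend the business context block to a subagent spawn package.
# The block below abbreviates the souls.zip example above; in practice the
# orchestrating Core Four agent selects the full block for the business.

SOULS_ZIP_CONTEXT = """\
BUSINESS CONTEXT: souls.zip
Type: B2B SaaS marketplace for AI agent teams
Stage: Early revenue, proving product-market fit
Voice: Confident, technical, founder-to-founder
NOT: Corporate, enterprise-speak, feature-list marketing
"""

CONTEXT_BLOCKS = {"souls.zip": SOULS_ZIP_CONTEXT}

def inject_business_context(spawn_package: str, business: str) -> str:
    # Context is prepended so the subagent reads the business facts
    # before its task spec. No persistent workspace is required.
    return CONTEXT_BLOCKS[business] + "\n" + spawn_package
```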
Part 4: The Dynamic Subagent Architecture
How Core Four agents spawn, what they spawn, and how.
The Mental Model: Agent as Function Call
A subagent is a function call. You invoke it with a complete specification, it executes and returns a structured result, it disappears. It does not ask clarifying questions. It does not explore beyond its scope. It does not persist.
# A subagent invocation is shaped like a function call: complete
# specification in, structured result out, no persistent state.
result = spawn_agent(
    role="rails-security-auditor",
    task="Audit authentication module for OWASP Top 10",
    input_files=["app/models/user.rb", "app/controllers/sessions_controller.rb"],
    return_format="structured_json",
    business_context=souls_zip_context_block,  # injected per Part 3
)
The Five-Layer Spawn Package
Every subagent receives exactly this, in this order:
LAYER 1: IDENTITY
You are a [specific expert], a specialist in [domain].
Your sole task in this session is [single sentence, verb-first].
You are not a general assistant. You are not persistent.
This conversation ends when you return your output.

LAYER 2: TASK SPECIFICATION
Task: [precise, unambiguous description]
Input: [what you have been given]
Output required: [exact format and structure — prefer JSON]
Success criteria: [measurable, not qualitative]
Hard deadline behavior: return structured failure rather than guess

LAYER 3: CURATED CONTEXT
[5-15 curated facts — do not re-discover this]
- Business context: [relevant business facts only]
- Technical context: [relevant architecture decisions]
- Constraints: [explicit limits]
- Anti-patterns: [things the org rejects, specific to this task]

LAYER 4: SKILL INJECTION
[One of:]
- Paste relevant skill content directly
- List specific files to read before beginning
- Name domain knowledge the skill system will inject

LAYER 5: SCOPE AND TOOLS
Files in scope: [explicit list or directory]
Tools available: [list]
Do not: read outside scope, ask questions, modify out-of-scope files
On failure: return {"status": "failed", "reason": "...", "partial": {...}}
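A minimal sketch of the assembly step, assuming each layer arrives as a prepared string. The function and its parameter names are illustrative:

```python
# Sketch: concatenate the five layers, in order, into one spawn package.
# All contents are supplied by the orchestrating Core Four agent.

def build_spawn_package(
    identity: str,         # Layer 1: who the specialist is, and that it is ephemeral
    task_spec: str,        # Layer 2: task, input, output format, success criteria
    curated_context: str,  # Layer 3: 5-15 facts, constraints, anti-patterns
    skill_injection: str,  # Layer 4: skill content or files to read first
    scope: str,            # Layer 5: files in scope, tools, failure contract
) -> str:
    # Order is load-bearing: identity occupies the primacy slot (Principle 2).
    return "\n\n".join([identity, task_spec, curated_context, skill_injection, scope])
```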
The Specialist Library
Don't generate new agent definitions at runtime. Pre-define specialists. Select dynamically. The compound-engineering plugin is the reference: 29 pre-defined specialists, selected by the orchestrator based on project context.
Core specialist categories for a digital business:
Research specialists:
- market-researcher — target market analysis, customer segment research
- competitor-analyst — specific competitor deep dive
- content-strategist — topic research, keyword gaps, content brief generation
- financial-modeler — scenario modeling, revenue projections
Builder specialists:
- frontend-developer — specific feature implementation (inherits from existing Perry soul)
- backend-developer — specific backend task (inherits from Harry soul)
- code-reviewer — specific PR or module review
- security-auditor — specific codebase section
- database-architect — specific schema or query optimization
Revenue specialists:
- email-copywriter — specific email sequence
- ad-creative-writer — specific campaign
- seo-content-writer — specific article with keyword brief
- outreach-personalizer — specific prospect batch
- analytics-reporter — specific metric or funnel analysis
Operational specialists:
- process-designer — specific workflow design
- customer-support-resolver — specific ticket or ticket batch
- document-drafter — specific contract or policy
- data-enricher — specific lead or customer batch
Values Inherit, Identity Does Not
When spawning a subagent:
- Pass values as context, not as identity
- Give the subagent its own specialized identity suited to its task
WRONG:
"You are Gary (our CTO). You need to review this code."
RIGHT:
"You are a code security auditor. Apply these standards:
[Gary's quality standards, the org's values, relevant constraints].
Your task: review this authentication module."
The subagent makes decisions to Gary's quality standard without role-playing Gary's speech patterns into a technical audit.
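A minimal sketch of the same rule as a call, with spawn_agent() stubbed for illustration; the values_context parameter is an assumption, not an existing API:

```python
# Sketch: values travel as context; the identity stays task-specific.

def spawn_agent(role: str, task: str, values_context: list[str]) -> dict:
    # Stand-in: a real implementation would launch an isolated session.
    return {"role": role, "task": task, "values_context": values_context}

result = spawn_agent(
    role="code-security-auditor",             # its own specialized identity
    task="Review this authentication module",
    values_context=[
        "Gary's quality standards",            # inherited as context, not persona
        "Org anti-patterns relevant to security review",
    ],
)
```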
When to Spawn vs Handle Inline
Spawn a subagent when ANY of these are true:
- The task would add >5,000 tokens of verbose output to the main context
- The task can run in parallel with other work
- The task requires specialized expertise better loaded via skill injection
- The task may fail and retry should be isolated from the main conversation
- The task is clearly bounded with known inputs and outputs
Handle inline when ALL of these are true:
- The task is fewer than 3 tool calls
- The output directly feeds the next step with no isolation benefit
- No parallelism advantage
- The task requires back-and-forth with toli
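The two lists compress into one decision rule. A sketch, with the Task fields as illustrative stand-ins for whatever metadata the orchestrator tracks:

```python
# Sketch of the spawn-vs-inline rule. Thresholds mirror the lists above.

from dataclasses import dataclass

@dataclass
class Task:
    expected_output_tokens: int
    parallelizable: bool
    needs_skill_injection: bool
    retry_should_be_isolated: bool
    clearly_bounded: bool
    needs_back_and_forth: bool

def should_spawn(t: Task) -> bool:
    # Work that needs live back-and-forth with toli always stays inline.
    if t.needs_back_and_forth:
        return False
    # Otherwise spawn if ANY spawn criterion holds.
    return (
        t.expected_output_tokens > 5_000
        or t.parallelizable
        or t.needs_skill_injection
        or t.retry_should_be_isolated
        or t.clearly_bounded
    )
```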
Parallelization Decision
Spawn in parallel when tasks are domain-independent with no shared file writes:
Core Four Architect receives: "Research the souls.zip pricing opportunity"
→ Spawn in parallel:
- market-researcher (comparable SaaS pricing research)
- competitor-analyst (what do agent platforms charge?)
- financial-modeler (model 3 pricing scenarios)
→ Collect structured results from all three
→ Architect synthesizes into one pricing recommendation
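A sketch of that fan-out with asyncio; spawn_agent_async() is stubbed because the real spawn call is platform-specific:

```python
# Sketch: domain-independent research tasks run concurrently, and only
# the Architect performs the synthesis step afterward.

import asyncio

async def spawn_agent_async(role: str, task: str, context: str) -> dict:
    # Stand-in for the real spawn call; returns a structured result.
    return {"role": role, "task": task, "status": "ok", "findings": []}

async def research_pricing(context_block: str) -> list[dict]:
    return list(await asyncio.gather(
        spawn_agent_async("market-researcher",
                          "Comparable SaaS pricing research", context_block),
        spawn_agent_async("competitor-analyst",
                          "What do agent platforms charge?", context_block),
        spawn_agent_async("financial-modeler",
                          "Model 3 pricing scenarios", context_block),
    ))

# results = asyncio.run(research_pricing(souls_zip_context_block))
```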
Model Routing
| Task Type | Agent | Model |
|---|---|---|
| Strategic judgment, synthesis | Core Four orchestration | Opus |
| Specialist execution: research, review, writing | Most subagents | Sonnet |
| Mechanical tasks: formatting, simple extraction | Simple subagents | Haiku |
| Heartbeat triage | Background heartbeat | Haiku |
This is the primary cost control lever. Every non-synthesis task that can be Sonnet should be Sonnet. Every mechanical task should be Haiku.
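The table reduces to a lookup. A sketch with illustrative tier names rather than pinned model versions:

```python
# Sketch: route by task type, defaulting down to the cheaper tier.

MODEL_ROUTES = {
    "synthesis": "opus",     # Core Four strategic judgment
    "specialist": "sonnet",  # research, review, writing subagents
    "mechanical": "haiku",   # formatting, simple extraction
    "heartbeat": "haiku",    # background triage
}

def route_model(task_type: str) -> str:
    # Unknown work gets Sonnet and escalates only if quality fails.
    return MODEL_ROUTES.get(task_type, "sonnet")
```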
Subagents Cannot Spawn Subagents
The Core Four agent is always the spawning authority. Delegation flows from the orchestrator. Subagents do not spawn subagents. If a subagent discovers it needs another expert, it returns a structured result with that information and the orchestrator makes the spawning decision.
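What that structured hand-back might look like; the field names are illustrative:

```python
# Sketch: a subagent at the edge of its scope returns an escalation
# record. The orchestrator, never the subagent, decides whether to spawn.

NEEDS_EXPERT_RESULT = {
    "status": "needs_expert",
    "reason": "Schema migration risk is outside the security audit scope",
    "suggested_specialist": "database-architect",
    "partial": {"findings": ["..."]},
}
```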
Part 5: The Background Process Stack
What runs automatically, without toli, to keep the businesses alive and improving.
The Principle: Event-Driven for Exceptions, Scheduled for Synthesis
Background processes fall into two categories:
- Event-driven: Payment failed → trigger retry NOW. Uptime down → alert NOW. Don't wait for the next scheduled run.
- Scheduled: Morning brief, content publishing, SEO audit, financial reconciliation. These run on cadence regardless of events.
The schedule is a safety net. Events are the primary trigger for anything time-sensitive.
Always-On (Operator Agent)
Every 1 minute:
- HTTP status of all public endpoints
- Payment webhook receiver alive
- SSL certificate expiry (alert at 30 days)
- Email sending pipeline alive
Event-driven:
- Payment failed → trigger retry sequence immediately
- Dispute filed → alert toli immediately
- New signup → trigger onboarding sequence
- Cancellation → trigger win-back sequence
Every 30 Minutes: The Business Heartbeat (Operator Agent)
The heartbeat is the core pattern. A Haiku-tier agent checks the business vitals:
HEARTBEAT CHECKLIST:
Revenue: [new trials, failed payments, MRR delta since yesterday]
Product: [active sessions, error rate, key activation actions]
Pipeline: [new signups, conversions, churn]
Systems: [all other agents still running?, automation queue depth]
The heartbeat does not surface every check to toli. It writes to a log. Only exceptions escalate.
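A minimal heartbeat sketch; the vitals are stubbed and the escalation thresholds are illustrative:

```python
# Sketch: check vitals, log everything, surface only exceptions.

import json, time

def run_heartbeat(log_path: str = "heartbeat.log") -> list[str]:
    vitals = {  # stub data; real checks query billing, app, and agent APIs
        "revenue": {"failed_payments": 0, "mrr_delta": 12.0},
        "product": {"error_rate": 0.002},
        "systems": {"agents_alive": True, "queue_depth": 3},
    }
    with open(log_path, "a") as f:
        f.write(json.dumps({"ts": time.time(), **vitals}) + "\n")
    alerts = []  # only these escalate; everything else stays in the log
    if vitals["revenue"]["failed_payments"] > 0:
        alerts.append("payment_failure")
    if not vitals["systems"]["agents_alive"]:
        alerts.append("agent_down")
    return alerts
```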
Daily Schedule
| Time | Process | Agent | Surfaces |
|---|---|---|---|
| 11:30 PM | Knowledge capture and data enrichment | Operator | Nightly only |
| 3:00 AM | Compound self-review for active agents | Each agent | Soul patch staging |
| 5:00 AM | Morning brief data assembly | Operator | Input to Architect |
| 6:00 AM | Morning brief delivery | Architect | toli |
| 7:00 AM | Revenue pipeline: lead scoring, sequence advance, trial health | Revenue Operator | Morning brief (exceptions) |
| 8:00 AM | Content publishing and distribution | Operator | Morning brief (exceptions) |
| 9:00 AM | Customer/community health scoring and intervention triggers | Revenue Operator | Morning brief (high-risk flags) |
| 2:00 PM | Competitive intelligence sweep | Architect | Next morning brief |
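One way to express that cadence as declarative config a scheduler can read; the snake_case process names are illustrative:

```python
# Sketch: the daily schedule as data, mirroring the table above.

DAILY_SCHEDULE = [
    {"time": "23:30", "process": "knowledge_capture",       "agent": "operator"},
    {"time": "03:00", "process": "compound_self_review",    "agent": "each"},
    {"time": "05:00", "process": "brief_data_assembly",     "agent": "operator"},
    {"time": "06:00", "process": "morning_brief_delivery",  "agent": "architect"},
    {"time": "07:00", "process": "revenue_pipeline",        "agent": "revenue_operator"},
    {"time": "08:00", "process": "content_publishing",      "agent": "operator"},
    {"time": "09:00", "process": "customer_health_scoring", "agent": "revenue_operator"},
    {"time": "14:00", "process": "competitive_intel_sweep", "agent": "architect"},
]
```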
The Morning Brief (The Only Thing toli Must Read)
This is the contract: toli reads one message at 6am and makes at most 3 decisions. Everything else was handled.
Structure:
SECTION 1 — Financial Pulse (2 min)
• MRR current vs. 7d vs. 30d: $X (▲/▼ $Y from last week)
• Net new MRR yesterday: +$X from N new customers, -$Y from N churn
• Cash: $X (N weeks of runway at current burn)
SECTION 2 — Growth Signals (1 min)
• New trials yesterday: N (7d avg: N)
• Conversion rate: X% (7d avg: X%)
• Top acquisition source: [source]
SECTION 3 — At-Risk (1 min)
• High churn risk (need attention): [list with scores if any]
• Trials at day 14 without activation: [list if any]
• Support tickets >24h unresolved: N
SECTION 4 — Content & SEO (30 sec scan)
• Published today: [list]
• Rank drops >5 positions: [list if any]
SECTION 5 — Decisions Needed (the important part)
1. [Decision with recommendation and default if no response]
2. [Decision with recommendation and default if no response]
→ If no response by noon: [what happens by default]
SECTION 6 — What Shipped (30 sec)
[Bullet list of autonomous actions taken since last brief]
Note: The canonical morning brief format for toli is the compact 5-section version in 01-TOLI-ACTION-GUIDE.md. The detailed format above is reference only.
Weekly Schedule
| Day | Time | Process | Agent |
|---|---|---|---|
| Monday | 7:00 AM | OKR pulse and content planning | Architect |
| Wednesday | 9:00 AM | Financial reconciliation | Operator |
| Friday | 7:00 AM | SEO audit and content decay detection | Operator |
| Sunday | 8:00 AM | Weekly brief, planning handoff, Soul Engineer audit | Architect + Soul Engineer |
Monthly Schedule
| Date | Process | Agent |
|---|---|---|
| 1st | Strategic synthesis, MRR bridge, goal adjustment | Architect |
| 15th | Knowledge base audit, churn cohort analysis | Architect |
The Compound Interest Loop (What Gets Better Automatically)
Customer intelligence: Daily behavioral data accumulates. After 90 days, churn prediction accuracy improves because the model has signal. After 6 months, the system predicts which trial characteristics lead to high-LTV customers.
Content SEO: Weekly rank monitoring builds a corpus of what content types perform best for this audience. Content briefs improve because they're informed by what worked.
Knowledge capture: Customer conversations, support tickets, and feedback tagged daily. After 6 months, patterns emerge: the same friction point in 40 tickets, the same feature request from the same customer type. These surface as strategic recommendations.
Agent improvement: Each agent logs errors and unexpected outputs to .learnings/. Weekly, the Architect reviews and updates agent instructions. Agents that run for a year are dramatically more reliable than agents that started yesterday.
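A minimal sketch of the .learnings/ write path; the record schema is an assumption:

```python
# Sketch: append one error record per incident so the weekly Architect
# review has raw material to mine for instruction updates.

import json, pathlib
from datetime import datetime, timezone

def log_learning(agent: str, error: str, context: str) -> None:
    path = pathlib.Path(".learnings") / f"{agent}.jsonl"
    path.parent.mkdir(exist_ok=True)
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "error": error,
        "context": context,  # what the agent was doing when it failed
    }
    with path.open("a") as f:
        f.write(json.dumps(record) + "\n")
```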
Part 6: The Practical Interaction Model
How toli actually works with the Core Four.
The One-Line Model
The ideal interaction is one sentence from toli that triggers a complete autonomous workflow:
| toli says | What happens |
|---|---|
| "Architect, I want to launch X by Friday" | Architect runs Ship Mode: spawns 4 parallel domain review agents, collects structured output, surfaces one decision, delegates complete build brief to Builder |
| "Builder, add feature X to souls.zip" | Builder decomposes into atomic tasks, spawns frontend + backend agents in parallel, reviews output, creates PR — toli approves |
| "Revenue, price the Pro plan" | Revenue spawns comp research + financial modeling agents, synthesizes into 3 options with recommendation — toli picks A, B, or C |
| "What's our move on the Giphy deal?" | Architect spawns deal analysis agent, financial modeling agent, legal review agent — synthesizes into a clear recommendation with risks |
The key design principle: toli never has to think about which agent owns what. The Architect is always the right first recipient for anything ambiguous. The Architect figures out routing.
The Routing Logic
Is this ambiguous or strategic? → Always start with Architect
Is this a clear build directive? → Can go directly to Builder
Is this explicitly about revenue? → Can go directly to Revenue Operator
Is this a background ops question? → Operator handles autonomously (should never surface)
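The routing logic as a first-pass classifier; the message-kind labels are illustrative:

```python
# Sketch: anything unclassified defaults to the Architect, which owns
# onward routing, so toli never has to think about agent ownership.

def route(message_kind: str) -> str:
    routes = {
        "build_directive": "builder",
        "revenue_question": "revenue_operator",
        "background_ops": "operator",  # should never surface to toli
    }
    return routes.get(message_kind, "architect")
```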
What toli Never Has to Do
With the background stack running:
- Check if content published on time
- Follow up on failed payments
- Monitor uptime
- Track MRR manually
- Write routine customer follow-up emails
- Check competitor pricing
- Advance email sequences
- Monitor SEO rank changes
- Process expense categorization
- Remind agents to do their compound reviews
What toli Does Keep
- Approve irreversible decisions (pricing changes, new business lines, deploys with significant risk)
- Set company-level OKRs each quarter
- Read the morning brief and respond to escalations
- Conduct monthly strategic review (30 minutes)
- Quarterly: set new OKRs, read the annual synthesis
Total time: 5 minutes/day for the brief + 2-3 strategic decisions/week + 30 min/month strategic review.
Part 7: The "Extra Agent" Pattern
For businesses that need a public face that doesn't reflect the founder directly.
Barry is the template. Barry is the 5th agent for the Bearish community — the Brand Agent on Telegram and Twitter who is architecturally isolated from the internal org. He proactively identifies IP opportunities, brand partnerships, and business deals, routing them to Jerry. He knows nothing about toli's finances, other businesses, or internal agent structure.
When to add a 5th agent:
- You have a public-facing persona that needs its own voice and can't be internal-org-facing
- The persona has its own community and its own identity that's distinct from the founder
- You need a firewall between internal context and public communications
Design rules for a 5th agent:
- Container isolation (Docker container) or at minimum strict context isolation
- allowAgents: [] — zero spawn authority
- isolation_level: "public" — most restrictive designation
- The Operator or Revenue Operator (not the Architect) is the bridge between the 5th agent and the internal org
- Content approval system (Tier A/B/C) governs what the 5th agent can publish autonomously vs what needs review
Part 8: The Meta-Model — Replication Across Any Business
The architecture replicates to any business by changing the context, not the structure.
Universal architecture:
One Core Four → shared across all businesses
↓
Business-specific context blocks → injected at subagent spawn time
↓
Pre-defined specialist library → selected by Core Four based on task
↓
Business-specific 5th agent → if public persona required (optional)
What changes per business:
- Business context blocks (target customer, revenue model, voice, constraints)
- Specialist selection (a crypto NFT community needs different specialists than a B2B SaaS)
- 5th agent persona (Barry for Bearish, a different persona for a different community)
- OKRs (different KRs per business, same cascade structure)
- Heartbeat checklist (community health metrics vs SaaS metrics vs content metrics)
What never changes:
- The Core Four structure
- The soul design principles (10 findings above)
- The single-team architecture
- The spawn package format
- The morning brief format
- The compound interest loop pattern
How to Bootstrap a New Business
When toli starts a new venture:
- Write the business context block (5 minutes — target customer, stage, voice, constraints)
- Write 3 company-level OKRs for the first quarter
- Configure the heartbeat checklist for this business type (SaaS vs community vs content)
- Add the business to the Architect's portfolio knowledge — no new agents required
- If needed: design the 5th public-facing agent
The Core Four can be managing 5 businesses simultaneously with 4 agents. Business-specific execution is handled by subagents spawned with business-specific context.
What Changes from v1 (17 Agents) to v2 (Core Four)
| Dimension | v1 (Current) | v2 (Core Four) |
|---|---|---|
| Agent count | 17 named agents | 4 Core + specialist library |
| Persistent workspaces | 17 workspaces | 4 workspaces |
| Background automation | All disabled | Running continuously |
| Subagents | Used occasionally | Primary execution model |
| toli's daily load | Initiates everything | Reads brief, makes 2-3 decisions |
| Self-improvement | Compound loops off | Nightly, automatic |
| Morning brief | Doesn't exist | Daily at 6am |
| Business coverage | Implicit in agent count | Explicit via context injection |
| Cost | $30-42/day (when running) | $8-15/day (all background processes running) |
The Immediate Rebuild Sequence
For Soul Engineer to implement:
Day 1: Read this document. Internalize it. It is the operating thesis.
Day 2-3 (Soul rewrites first): Begin Core Four soul rewrites. P0 security items are deliberately deferred to Month 2 per toli's override.
Day 4-7 (Core Four souls):
- Rewrite Lacie → The Architect (using the 10 soul principles above)
- Rewrite Gary → The Builder
- Rewrite Cherry → The Revenue Operator
- Rewrite Jerry → The Operator
- Archive the other 11 agent workspaces (they become specialists in the library)
Day 8-14 (Automation layer):
- Enable all four Core Four heartbeats
- Enable compound loops for all four
- Enable morning brief cron (6am daily)
- Wire Cherry → agentmail for outreach execution
- Restore Barry as 5th agent (Bearish public persona)
Day 15-21 (Specialist library):
- Convert the best specialist souls from the 11 archived agents into the subagent library
- First full week with morning brief running
- Measure first OKR baselines
Companion Documents
- MISSION-CHARTER.md — organizational governance
- OKRs-Q1-2026.md — quarterly objectives cascade
- agent-cards/ — Core Four specifications
- COMPOUND-LOOP-GUIDE.md — self-improvement implementation
- SECURITY-HARDENING.md — P0/P1/P2/P3 security checklist
- SOUL-ENGINEER-BRIEFING.md — SE implementation mandate
- BARRY-AUTOMATON-DEPLOYMENT.md — complete technical guide for deploying Barry on Conway Research Automaton
References
Soul Design Research:
- ExpertPrompting (2023)
- Multi-expert Prompting EMNLP 2024 (+8.69%)
- Lost in the Middle TACL 2024
- Anthropic Claude's Soul Document (2024)
- Agentic Context Engineering ACE (+10.6%)
- LLM Generated Persona NeurIPS 2025
Core Four Research:
- Founder Backgrounds and YC Data (arxiv 2025)
- Notion's Lost Years — Ivan Zhao
- Stripe Founding — Contrary Research
- Make Operations Your Secret Weapon — First Round
- Hellmann & Wasserman — HBS
Dynamic Subagent Architecture:
- Claude Code Subagent Documentation
- Anthropic Multi-Agent Research System
- Multi-Agent Design: Prompt/Topology Optimization (arxiv 2502.02533)
- Manus Context Engineering Lessons
- Google ADK Multi-Agent Patterns
- ClaudeFast Sub-Agent Best Practices
Background Processes: