Platform Architecture
A layered, platform-portable architecture. Any MCP-capable orchestrator connects to the same tool server — same tools, same workflows, different model or UI. Bold Penguin's reference orchestrator is the Deep Agent, a production Python service built on deepagents and LangChain.
Design Principles
1. Platform Portable
Claude, ChatGPT, Gemini, or custom agentic app — all connect to the same MCP tool server. Change the orchestrator; tool flows stay the same. The Deep Agent itself is portable across providers — primary and fallback models are configured per-turn via LLM_PROVIDER_ORDER (default anthropic,openai,google_genai).
2. Per-Task Model Selection
Each pipeline stage uses the best model for the job. Structured subagents (MQS, CQS) run on a fast model (CLAUDE_FAST_MODEL, default claude-haiku-4-5). Reasoning-heavy subagents (planner, critique, mkt_intel, quote) run on the primary model (CLAUDE_MODEL, default claude-sonnet-4-6). Model choices are tool-internal and swappable, invisible to the orchestrator.
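As a sketch, this per-task routing reduces to a small lookup. The helper name and stage lists below are illustrative; in the real service the model IDs come from the `CLAUDE_MODEL` / `CLAUDE_FAST_MODEL` environment variables rather than hardcoded defaults.

```python
# Illustrative routing table: structured subagents get the fast model,
# reasoning-heavy subagents get the primary model. Stage names follow the
# text above; the helper itself is an assumption, not the service's code.
FAST_STAGES = {"mqs", "cqs"}

def model_for_stage(stage: str,
                    fast: str = "claude-haiku-4-5",
                    primary: str = "claude-sonnet-4-6") -> str:
    return fast if stage in FAST_STAGES else primary
```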
3. Deterministic Harness
LLMs extract and analyze. Deterministic code validates, scores, routes, and decides. The LLM never self-scores its own output. HITL gates are raised by explicit tool calls (request_* tools), not by prompt heuristics — the graph pauses when the tool fires, regardless of what the model wants to do next.
4. Human-in-the-Loop Gates
The Deep Agent exposes explicit pipeline stages where execution can pause and wait for broker input. The external SSE contract documents eight gate stages today; an additional two (awaiting_generate_quotes, awaiting_post_quote_actions) are defined in the PipelineStage enum and wired up in the HITL tool layer but are surfaced through the internal /chat/stream endpoint rather than the external SSE contract. The system never auto-proceeds past a gate; every resume carries an action_id or resume_options envelope back through the same streaming endpoint.
Layered Architecture
| Layer | Role | Components |
|---|---|---|
| Orchestrator | Platform portable | Any MCP-capable orchestrator — Claude, ChatGPT, Gemini, or custom agentic app. Bold Penguin's reference implementation is the Deep Agent (create_deep_agent) — a Python FastAPI service that coordinates seven specialized subagents. |
| Protocol | Standard interface | MCP (Model Context Protocol) over streamable HTTP, stdio, or AWS AgentCore IAM/SigV4. Tool definitions, argument schemas, and async task patterns are protocol-level. Any MCP client discovers and calls tools without custom integration. The Deep Agent's default client is MCP_CLIENT_TYPE=iam (AWS AgentCore SigV4); alternatives are a static MCP_BEARER_TOKEN or OAuth via BP_AUTH_URL + BP_API_KEY + CLIENT_ID + CLIENT_SECRET. A separate X-API-Key (API_KEY env var) secures the FastAPI endpoints. |
| Tool Layer | One or more MCP servers | Insurance Intelligence MCP Server (universal_mcp_server via AWS AgentCore) — primary server, multiple domains: enrichment, applications, market intelligence, document ingestion. Salesforce MCP (stdio via @tsmztech/mcp-server-salesforce) — optional CRM server configured in app/config/mcp_servers.json. A separate Policy Document Analyzer MCP exists in the broader platform (handles policy extraction / comparison) but is not currently bound to the Deep Agent. |
| Data & AI | Per-task models | Primary LLM (CLAUDE_MODEL), fast LLM (CLAUDE_FAST_MODEL), fallback providers (LLM_PROVIDER_ORDER), NLP / Critic / Analysis (cross-model adversarial review), Embeddings (local semantic search), Persistence (submissions, audit trail — Partner Engine is source of truth). |
Change the orchestrator — tool flows stay the same. Change the model per task — protocol stays the same. Change the UI — audit trail stays the same.
End-to-End Pipeline
The full pipeline runs through these stages, with red gates representing human-in-the-loop stop points where the system never auto-proceeds:
User Intent → 🔴 Plan Confirmation → Intake / Enrichment → 🔴 Critique Review → 🔴 Discrepancy Corrections → Application Create (MQS) → 🔴 Application Consent → Market Intelligence → 🔴 Carrier Selection → 🔴 Carrier Confirmation → Carrier Questions (CQS) → 🔴 Application Answers → 🔴 Quote Selection → 🔴 Generate Quotes → Quoting → 🔴 Post-Quote Actions → Quote Summary
Not every turn visits every gate — gates fire only when the agent needs a human decision. For example, awaiting_user_input fires only when the critique agent surfaces discrepancies; awaiting_application_answers fires only when MQS/CQS exhaust automated answer strategies; and awaiting_quote_selection only fires when CQS identifies carriers with open questions (if all carriers are auto-ready the pipeline goes straight to awaiting_generate_quotes). awaiting_post_quote_actions pauses after quotes come back so the broker can compare, request more carriers, or edit enriched data.
Agent Details
Full pipeline reference — agents, descriptions, MCP tools, products, and services. The Deep Agent itself is the coordinator; the table below lists the seven production subagents it delegates to.
| Step | Agent | Description | MCP Tools | Products | Services / APIs |
|---|---|---|---|---|---|
| 1 — Planning | planner_agent | Runs first on every new request. Analyzes the user's intent and produces a structured step-by-step execution plan. Not re-run on HITL resume — plan persists in state.plan. | write_plan_to_state_tool | — | — |
| 2 — Intake | intake_agent | Business data enrichment and document extraction. Given company name + address, fetches NAICS codes, financials, and contacts. Supports direct-upload and S3 ingestion. Generates enriched_data and an enriched_data_summary (~500-token compact summary used by downstream agents). | enrich_company_data, do_document_ingestion_from_s3, initiate_document_upload, do_document_submission_by_tx_id, do_data_inquiry | SubmissionLink | Universal Submit API, Submission Status Inquiry API, Data Inquiry API, Company Submit API, Location Submit API |
| 3 — Critique | critique_agent | Two modes. Mode A: Reviews enriched and extracted data and surfaces issues for user review before the pipeline proceeds. Mode B: Processes user corrections or approval and writes an outcome back to state. Always writes to artifacts.data_quality_checks. | read_intake_enriched_data_tool, read_intake_extracted_data_tool, write_critique_to_state_tool | SubmissionLink, JackIQ | JackIQ API |
| 4 — Master Questions | mqs_agent | Creates the application and fills master (common) questions. Uses a 3-tier answer strategy: (1) find_answers from submission data, (2) intelligent guesses, (3) request_user_answers as last resort. Runs on the fast model (Haiku) when primary provider is Anthropic; falls back to CLAUDE_MODEL / OPENAI_MODEL / GOOGLE_MODEL when not. | create_application, update_application, find_incomplete_master_question, find_answers, find_business_type, find_carrier_class, get_application_summary | PartnerEngine, Terminal | Partner Engine API |
| 5 — Market Intel | mkt_intel_agent | Predicts which carriers are likely to bind using market intelligence API. Writes carriers_likely_to_bind and mkt_intel_data to state. | get_market_intelligence (plus get_market_recommendation, currently disabled) | Market Intelligence | Market Intelligence API, Market Recommendation API |
| 6 — Carrier Questions | cqs_agent | Fills carrier-specific questions for carriers confirmed by the user. Only processes questions scoped to confirmed carriers. Uses same 3-tier answer strategy as mqs_agent. Produces cqs_summary (carriers_with_questions / carriers_auto) so the main agent knows whether to raise awaiting_quote_selection. Also runs on the fast model (same non-Anthropic fallback as MQS). | find_incomplete_questions_by_carrier, find_answers, update_application, find_business_type, find_carrier_class | CarrierEngine, PartnerEngine, Terminal | Partner Engine API |
| 7 — Quote Summary | quote_agent | Reads quote_status and quote_responses from state, fetches live application summary if needed, and produces a broker-friendly markdown quote table (Carrier / Status / Premium / Limits / Deductible). | get_application_summary | ClauseLink, PartnerEngine | Clause Link API |
Subagents are defined two ways: code factories in app/agents/ and JSON definitions in app/config/subagents.json. JSON definitions, when present, override the code factory for the same subagent key — this lets prompts and tool lists be tuned without redeploying.
Orchestrator: the Deep Agent
The Deep Agent is the reference orchestrator that drives the pipeline above.
Middleware Stack
create_deep_agent is invoked once for the main agent and supplies a default middleware stack (filesystem, summarization, prompt caching, patched tool calls, todo tracking, subagent dispatch). The Deep Agent extends that default with one additional middleware:
- `ModelFallbackMiddleware` — on primary-model error, re-runs the call against the next provider in `LLM_PROVIDER_ORDER`. Only added when at least one fallback model is configured.
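The fallback behavior can be sketched as a simple provider loop. The helper name and shape are assumptions; the actual middleware wraps model calls inside the LangChain stack rather than exposing a function like this.

```python
import os

# Illustrative sketch of the fallback loop described above.
DEFAULT_ORDER = os.getenv(
    "LLM_PROVIDER_ORDER", "anthropic,openai,google_genai"
).split(",")

def call_with_fallback(call, providers=None):
    """Try each provider in order; return the first successful result."""
    last_err = None
    for provider in providers or DEFAULT_ORDER:
        try:
            return call(provider)
        except Exception as err:  # the real middleware filters on model errors
            last_err = err
    raise last_err
```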
Each subagent built by _build_subagent gets its own explicit stack, applied in this order:
- `TodoListMiddleware` — tracks plan progress as structured todos.
- `FilesystemMiddleware` — exposes a scratchpad filesystem tool.
- `SummarizationMiddleware` — compresses older messages when the window exceeds 50K tokens, keeping the most recent 6 messages verbatim. Defaults are patched at module load so the trigger fires at this budget rather than the library default.
- `AnthropicPromptCachingMiddleware` — attaches Anthropic prompt-caching headers so the static system prompt + tool definitions are cached across turns. Configured with `unsupported_model_behavior="ignore"` so non-Anthropic models pass through unchanged.
- `PatchToolCallsMiddleware` — normalizes tool-call IDs across providers so LangGraph can restart after a HITL pause without loss of trace context.
Subagents are compiled with `recursion_limit=100`.
Session Storage
Run state is persisted between turns so a HITL-paused run can resume on the next request. The backend is pluggable via PERSISTENCE_LAYER:
- `filesystem` — JSON files under `AGENT_STATE_DIR` (`.agent_state/{run_id}.json`).
- `mongodb` — MongoDB collections via `SMARTDATA_MONGO_CLUSTER_URL` (production default).
- `mongodb_and_filesystem` — both, with Mongo as primary and filesystem as a local cache.
Side-file data (HITL requests, tool outputs staged for the next turn) is written via get_persistence_provider().write_side_data(...) into a tool_data collection keyed by run_id + side_key.
Run State Schema
Each run is a Pydantic AgentRunState model serialized to a nested dict via to_dict():
```
run_id, timestamp, thread_id, user_request
chat_history, enriched_data, enriched_data_summary, extracted_data
artifacts:
  risk_profile, firmographics, data_quality_report, data_quality_checks
  carriers_likely_to_bind, mkt_intel
  application: { state, data, mqs_data, cqs_data, eligible_carriers }
  quotes: { status, responses }
  plan: { steps, summary, created_at, todos, current_step, status }
execution_log
pipeline:
  submission_state, carrier_selection, carriers_for_quoting, user_provided_answers
field_audit, message_history, event_log, user_info, errors, cost_tracking
```
`application_state` is one of `none | mqs_complete | ready_for_quoting`. `submission_state.pipeline_stage` is the current HITL stage (see Pipeline Stages).
Streaming Contract
All production traffic flows through the Deep Agent's external streaming SSE endpoint. External integrators reach this via the A2A protocol rather than calling it directly; the request envelope is:
```json
{
  "message": "string",
  "run_id": "string | null",
  "resume_options": "ResumeOptions | null",
  "action_id": "string | null",
  "payload": {}
}
```
action_id is the preferred resume mechanism — the server maps confirm_plan, approve_critique, reply_critique, provide_corrections, confirm_application, select_all, select_specific, submit_answers, get_quotes, and bind / download / done to the equivalent resume_options shape.
SSE Events
| Event | Purpose |
|---|---|
| `run_id` | Emitted immediately — clients persist this to resume on the next turn |
| `content` | Per-token LLM output from the main agent |
| `subagent_content` | Per-token LLM output from a subagent (tagged with `agent`) |
| `mkt_intel` | Carriers-likely-to-bind list written to state |
| `application` | Application / MQS checkpoint |
| `eligible_carriers` | Eligible-carrier list written to state |
| `quotes` | Quote status + responses |
| `usage` | Per-call token counts |
| `state` | Incremental message snapshot |
| `hitl_pause` | Pipeline paused at a HITL gate — carries `pipeline_stage`, `resume_hint`, and `ui_component` |
| `cost` | Turn cost summary (`by_model` + totals) — emitted before `done` |
| `done` | Final payload with full `run_state` |
| `error` | Terminal failure |
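A client consuming this stream can be sketched with standard SSE parsing (`event:` / `data:` fields, events separated by blank lines). The sample payloads in the test are illustrative.

```python
import json

# Minimal SSE parser for the event contract above.
def parse_sse(stream: str):
    events, event, data = [], None, []
    for line in stream.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and event is not None:
            payload = json.loads("\n".join(data)) if data else None
            events.append((event, payload))
            event, data = None, []
    return events
```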
Pipeline Stages
The full `PipelineStage` enum in `app/models/submission.py` defines ten `awaiting_*` pause states. The first eight are emitted on `hitl_pause` and `done` by the external SSE contract; the last two currently surface only via the internal `/chat/stream` endpoint (and are visible on `AgentRunState.submission_state.pipeline_stage` either way):
| pipeline_stage | When | HITL tool | Required resume |
|---|---|---|---|
| `awaiting_plan_confirmation` | Planner proposes an execution plan | `request_plan_confirmation` | `{ confirm_plan: true }` |
| `awaiting_critique_review` | Critique agent presents findings for review | `request_critique_review` | `{ critique_reply: "<feedback>" }` or `""` to approve |
| `awaiting_user_input` | Critique found discrepancies and requests corrections | `request_discrepancy_review` | `{ corrections: { <field>: <value> } }` or `{ proceed_without_corrections: true }` |
| `awaiting_application_consent` | Broker consent required before submitting application | `request_application_consent` | `{ confirm_application: true }` |
| `awaiting_carrier_selection` | User selects carriers from the MI list | `request_carrier_selection` | `{ carrier_choice: "all" }` or `{ carrier_choice: "specific", carriers: [...] }` |
| `awaiting_carrier_confirmation` | User confirms the eligible-carrier list | `request_carrier_confirmation` | same as carrier_selection |
| `awaiting_application_answers` | MQS/CQS need answers to specific questions | `request_user_answers` | `{ application_answers: { <question_code>: <answer> } }` |
| `awaiting_quote_selection` | CQS identified carriers with open questions; user selects which to quote | `request_quote_selection` | `{ carrier_choice: "specific", carriers: [...] }` |
| `awaiting_generate_quotes` (internal) | CQS complete; broker confirms "generate quotes" | `request_quote_generation` | proceed message / `--get-quotes` action |
| `awaiting_post_quote_actions` (internal) | Quotes are back; broker chooses compare / more carriers / edit enriched data | `request_post_quote_actions` | `POST_QUOTE_ACTION:` directive in follow-up message |
Each hitl_pause event carries a self-describing ui_component spec (component name, gate_type, title, description, props, actions[]). Frontends render the gate from that spec instead of hardcoding component logic.
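A hypothetical `hitl_pause` payload and a generic renderer driven by the spec might look like this. All field values below are invented for the example; only the field names follow the contract described above.

```python
# Invented example payload for an awaiting_carrier_selection gate.
pause = {
    "pipeline_stage": "awaiting_carrier_selection",
    "resume_hint": "carrier_choice",
    "ui_component": {
        "component": "CarrierPicker",
        "gate_type": "selection",
        "title": "Select carriers",
        "description": "Choose carriers from the market-intelligence list.",
        "props": {"carriers": ["Carrier A", "Carrier B"]},
        "actions": [{"action_id": "select_all", "label": "Select all"}],
    },
}

def render_gate(spec: dict) -> str:
    """Render any gate from its spec instead of hardcoding per-gate components."""
    actions = ", ".join(a["label"] for a in spec["actions"])
    return f"[{spec['component']}] {spec['title']}: {actions}"
```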
Use Case Flows
UC1 — Quote a New Risk
Legend: 🟤 Input · 🟣 Planning · 🔵 Enrichment · 🟠 Critique · 🔷 Application · 🟦 Market Intelligence · 🔴 HITL Gates
UC2 — Quote Comparison (Platform-adjacent, not Deep Agent)
The UC2 flow is implemented by the broader Bold Penguin platform via the Policy Document Analyzer MCP server and the submission inventory — it is not currently bound to the Deep Agent. An orchestrator wishing to run this flow today connects directly to the Policy Document Analyzer MCP.
Legend: 🔷 Policy Documents · 🟠 Policy Extraction (async) · 🟢 Comparison & Report · 🟤 UC1 Digital Quotes
Pipeline Deep Dives
The two sections below describe capabilities in the broader Bold Penguin platform. They run on separate MCP servers and pipelines that the Deep Agent does not currently bind to. They are retained here because orchestrators other than the Deep Agent (and, potentially, future Deep Agent releases) compose them with the Deep Agent pipeline.
Policy Document Extraction (Policy Document Analyzer MCP)
The extraction pipeline runs through these stages in sequence:
- PDF Reader — pdfplumber + pypdfium2 — raw text + table structures
- Document Splitter — statistical page scoring + pinning (declarations, endorsements, SOF) — 100K char budget
- LLM Extraction — 30+ rule system prompt — model-per-task (Sonnet, Gemini Pro)
- Deterministic Validation — Pydantic schema + arithmetic + cross-reference scrubbing
- Critic Agent — different foundation model reviews adversarially for missed fields and hallucinations
- Deterministic Scoring — auto-fix (high confidence) / flag for review / escalate
- Output — PolicyExtraction JSON + validation + audit trail
Smart Document Splitting
- 100K character budget with statistical page scoring — 200-page policies trimmed before LLM sees them
- Declarations pinning — 3-tier header detection + continuation by density score
- Endorsement pinning — header + preamble detection, body continuation (5pg/endorsement cap, 30 total)
- Schedule of Forms always included as the authoritative manifest
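The budget-plus-pinning idea above can be sketched as follows. The `(page_no, text, score)` shape, the helper name, and the tie-breaking are assumptions, not the analyzer's actual implementation.

```python
# Sketch of a character-budget page selector: pinned sections always go in,
# remaining pages are taken by score until the budget is exhausted.
def select_pages(pages, pinned, budget=100_000):
    """pages: list of (page_no, text, score); pinned: set of page numbers."""
    chosen, used = [], 0
    # Rank pinned pages first, then the rest by descending score.
    ranked = sorted(pages, key=lambda p: (p[0] not in pinned, -p[2]))
    for page_no, text, _score in ranked:
        if page_no in pinned or used + len(text) <= budget:
            chosen.append(page_no)
            used += len(text)
    return sorted(chosen)
```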
Critic Agent (Cross-Model Adversarial)
- Never self-scores — a separate foundation model critiques the extraction
- Deterministic checks first — SOF completeness (form count vs extracted forms), cross-reference integrity
- Per-field confidence drives auto-fix vs. flag vs. escalate
- Source quote verification — substring match against original text chunks confirms evidence
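The source-quote check reduces to a whitespace-normalized substring match, sketched here under the assumption that evidence quotes are compared against the original text chunks:

```python
# An extracted field's evidence quote must appear verbatim (modulo whitespace)
# in at least one original text chunk.
def verify_quote(quote: str, chunks: list[str]) -> bool:
    norm = " ".join(quote.split())
    return any(norm in " ".join(c.split()) for c in chunks)
```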
Audit Trail
- Source page citations per extracted field
- Validation results — errors, warnings, hints by severity
- Critic feedback — what was found, auto-fixed, or flagged
Submission Processing Pipeline (Submission Link, upstream of the Deep Agent)
The pipeline below feeds data into the Deep Agent's intake_agent via enrich_company_data and do_data_inquiry; the trust/consensus/ontology work itself happens upstream in the Submission Link platform, not inside the Deep Agent.
- Document Extraction — structured data from ACORDs, carrier apps, loss runs, SOVs
- Company Enrichment — 50+ OOTB sources — NAICS, revenue, employees, legal entity
- Trust Scoring — proprietary per-source reliability scoring
- Consensus Detection — cross-source agreement algorithm — 3+ sources align = high confidence
- Critique Gating (HITL) — findings surfaced to broker as an `awaiting_critique_review` or `awaiting_user_input` pause — never auto-submitted
- Ontology Mapping (Deterministic) — 153+ PE answer codes to canonical fields
- Question Set Matching — canonical fields to ApplicationForm question codes
- Answer Provenance Tagging — `answered_by_type` + `answered_by_source` on every answer
- Completeness Loop — find gaps, fill, surface unanswerable to broker (HITL via `awaiting_application_answers`)
- Output — complete application — quotes auto-fire to selected carriers
Trust and Consensus Algorithms
- Trust scoring — proprietary per-source reliability model assigns weights based on historical accuracy per field type
- Consensus detection — cross-source agreement: when extracted, enriched, and third-party values converge, confidence increases
- Triangulation — submission doc extraction + 50+ enrichment sources + third-party data compared and reconciled
- Per-field confidence output — every data point tagged with trust score + consensus level
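The consensus rule can be sketched as a vote count across sources. The "3+ sources align = high confidence" threshold comes from the description above; the medium/low tiers and function shape are illustrative.

```python
from collections import Counter

def consensus(values_by_source: dict[str, str]) -> tuple[str, str]:
    """Return (winning value, confidence tier) from per-source values."""
    counts = Counter(values_by_source.values())
    value, n = counts.most_common(1)[0]
    if n >= 3:
        return value, "high"
    if n == 2:
        return value, "medium"
    return value, "low"
```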
Deterministic Ontology
- 153+ PE answer codes mapped to canonical SD Dictionary V0.86 field names
- Priority hierarchy — `ia_*_v3` > `mqs_*` > `bold_penguin_*` when multiple sources match
- Entity groups — locations, vehicles, drivers, owners, WC classes with composite key logic
- Mapping is a lookup table, not LLM inference
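As a sketch, the priority hierarchy is an ordered prefix list, not model inference. The field codes in the test are invented; only the prefix ordering follows the hierarchy above.

```python
# Deterministic source selection: pick the matching code with the
# highest-priority prefix. Unknown prefixes rank last.
PRIORITY = ["ia_", "mqs_", "bold_penguin_"]  # ia_*_v3 > mqs_* > bold_penguin_*

def pick_source(codes: list[str]) -> str:
    def rank(code: str) -> int:
        for i, prefix in enumerate(PRIORITY):
            if code.startswith(prefix):
                return i
        return len(PRIORITY)
    return min(codes, key=rank)
```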
Answer Provenance
Every answer is traceable to its source:
| Provenance Type | Source |
|---|---|
| `submission_link_enriched` | Third party data — from enrichment APIs |
| `submission_link_extracted` | Document upload — from uploaded documents |
| `submission_link_defaulted` | Defaulted answers — configured defaults |
MCP Tool Registry
The Deep Agent connects to one or more MCP servers at startup via get_mcp_tools() (configured in app/config/mcp_servers.json). Each subagent declares a filtered toolset; not every subagent sees every tool.
Insurance Intelligence MCP (bound to Deep Agent)
This is the primary tool server — the universal_mcp_server reached over AWS AgentCore IAM/SigV4 by default. Every tool below is used by at least one Deep Agent subagent.
| Tool | Domain | Used by | Description |
|---|---|---|---|
| `enrich_company_data` | Enrichment | intake | Smart company profile — NAICS, financials, descriptions |
| `do_document_ingestion_from_s3` | Intake | intake | Async ingestion of a submission doc from S3 |
| `initiate_document_upload` | Intake | intake | Obtain pre-signed upload URL for a submission doc |
| `do_document_submission_by_tx_id` | Intake | intake | Finalize a document submission by transaction id |
| `do_data_inquiry` | Intake | intake | Query insurance intelligence data by insured |
| `find_business_type` | Enrichment | mqs, cqs | Search for business type information |
| `find_carrier_class` | Enrichment | mqs, cqs | Search for carrier class codes |
| `create_application` | Application | mqs | Create application form + trigger quote submission |
| `update_application` | Application | mqs, cqs | Update answers on application form |
| `find_incomplete_master_question` | Application | mqs | Find missing required master questions |
| `find_incomplete_questions_by_carrier` | Application | cqs | Find missing carrier-specific questions |
| `find_answers` | Application | mqs, cqs | Find answers for specific question codes |
| `get_application_summary` | Application | mqs, quote | Application summary + quote request status |
| `get_market_intelligence` | Market Intel | mkt_intel | Carrier-specific MI predictions (product/NAICS/state) |
| `get_market_recommendation` | Market Intel | (disabled) | Carrier eligibility check — currently disabled in the pipeline |
| `validate_token` | Auth | (auth layer) | Validate authentication token |
Salesforce MCP (optional, bound via stdio)
Configured in mcp_servers.json, pulled in when SALESFORCE_INSTANCE_URL + SALESFORCE_CLIENT_ID + SALESFORCE_CLIENT_SECRET are set. Exposes the standard Salesforce tool set (record CRUD, SOQL queries) provided by @tsmztech/mcp-server-salesforce. No Deep Agent subagent uses these tools in the default pipeline — they're available to any subagent that opts in.
Adjacent platform capabilities (separate MCP servers, not bound to the Deep Agent)
The following tools exist on the broader platform but are exposed by separate MCP servers. Orchestrators other than the Deep Agent (or a future Deep Agent release) can bind to them using the same MCP pattern; today they drive the UC2 policy-comparison flow and legacy submission processing, not the Deep Agent pipeline.
| Tool | Server | Purpose |
|---|---|---|
extract_policy_document, extract_quote_object, check_extraction_status, get_extraction_schema | Policy Document Analyzer | Async extraction of policy docs, carrier API quote objects, and schema lookup |
compare_extractions, generate_comparison_pdf, generate_comparison_html, save_comparison_pdf | Policy Document Analyzer | Normalize + compare extractions; render side-by-side HTML/PDF |
search_similar_submission | Submission Link | Search inventory by insured name (text or Mongo query) |
crm_upsert_opportunity | Custom CRM MCP (optional) | Upsert opportunity with recommendation metadata, report URL, carrier selection |
HITL Request Tools (orchestration-only, local to the Deep Agent)
LangChain-native tools that never call external APIs. Each stages a gate payload to a side file and causes the stream to emit hitl_pause (or, for the last two, returns a state transition visible on the internal /chat/stream endpoint):
request_plan_confirmation, request_critique_review, request_discrepancy_review, request_application_consent, request_carrier_selection, request_carrier_confirmation, request_user_answers, request_quote_generation, request_post_quote_actions.
request_user_answers is also bound to the mqs_agent and cqs_agent subagents so they can surface unanswerable questions without round-tripping through the main agent. request_quote_selection is defined in app/tools/hitl_state_tools.py and referenced in the v4/v5 system prompts but is not currently bound to the main agent or any subagent in code; the awaiting_quote_selection stage is set programmatically via build_submission_state_awaiting_quote_selection in app/sessions/hitl.py.
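The shape of a `request_*` gate tool can be sketched as follows. The function signature, `side_store` shape, and field names are assumptions based on the contract described above, not the actual tool implementation.

```python
# Illustrative gate tool: stage the gate payload as side data, then return a
# sentinel that the streaming layer turns into a hitl_pause event.
def request_plan_confirmation(run_id: str, plan: dict, side_store: dict) -> dict:
    payload = {
        "pipeline_stage": "awaiting_plan_confirmation",
        "resume_hint": {"confirm_plan": True},
        "plan": plan,
    }
    # Side data is keyed by run_id + side_key, mirroring write_side_data(...).
    side_store[(run_id, "hitl_request")] = payload
    return {"hitl_pause": True, "pipeline_stage": payload["pipeline_stage"]}
```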
Context and State Architecture
The platform uses a three-tier state architecture to manage context efficiently.
Tier 1 — Conversation Context
- Tool chain = transaction log — each response carries IDs forward. `application_id`, `task_id`, and `order_id` are all durable and linked to `submission_reference_id` for full traceability.
- Summaries only — conversation sees carrier + premium + error count, not 10K-token JSONs. Intake writes a compact `enriched_data_summary` (~500 tokens) that downstream agents read instead of the full payload.
- Cache-safe prefix — static system prompt + tool defs (~15K tokens) cached via `AnthropicPromptCachingMiddleware` at up to 90% savings.
- Per-UC system prompts — git-tracked `.md` files, never mutated mid-session. Prompt version pinned by `PROMPT_VERSION` (default `v4`).
Tier 2 — Server-Side State
- Full artifacts in `AgentRunState` — extraction JSONs, critic results, MQS/CQS answers, quote responses keyed by `run_id` behind the MCP tools and the Deep Agent persistence layer.
- Heavy data never transits the conversation — `get_application_summary` and similar tools pull from server state.
- Ephemeral working memory plus durable run state — `AgentRunState.message_history` replays a prior session on resume; `submission_state.pipeline_stage` tells the pipeline where to pick up.
- Per-task model calls — separate cache contexts behind the MCP boundary.
- Summarization middleware — messages older than the keep-window are compressed when the conversation exceeds 50K tokens, preserving the last 6 turns verbatim.
Tier 3 — Durable Audit Trail
- Pluggable persistence — `PERSISTENCE_LAYER=mongodb | filesystem | mongodb_and_filesystem`. Mongo is the production default; filesystem is the local-dev default.
- MongoDB collections — runs, `tool_data` side files, extraction audits, critic decisions, CoT logs, token usage per step.
- Cross-transaction memory — previous extractions and comparisons queryable by insured name across sessions.
- Async writes — fire-and-forget, never block the workflow.
- `submission_reference_id` — cross-transaction durable key linking all activity for the same insured.
- Token cost reporting — `cost_tracking` on `AgentRunState` accumulates token counts and USD cost per model across all turns of a run. The `cost` SSE event emits the turn total just before `done`.
- LangSmith tracing — optional, enabled via `LANGSMITH_API_KEY`; defaults to project `arctic-agents`.
Conversation carries references. Tools hold artifacts. The persistence layer holds the audit trail.
System Prompt Architecture
Orchestrator-Level Prompts
- Per-UC workflow prompts defining tool sequences and stop gates
- Static + versioned — git-tracked `.md` files, never mutated mid-session; version pinned by `PROMPT_VERSION`
- Model-agnostic — same prompt drives Claude, ChatGPT, Gemini, or BP Agent
- JSON override — `app/config/subagents.json` entries replace the code-factory prompt and tool list for a subagent without code changes
Tool-Internal Prompts
- Per-task prompts — structured extraction rules, critic verification, NLP parsing, triangulation logic
- Hidden behind MCP — orchestrator never sees them, separate cache context
- Dynamic injection — text chunks + deterministic hints into user prompts
- Model-specific — each prompt tuned for its assigned model — swap without changing workflow