Platform Architecture
A layered, platform-portable architecture. Any MCP-capable orchestrator connects to the same tool server — same tools, same workflows, different model or UI.
Design Principles
1. Platform Portable
Claude, ChatGPT, Gemini, or custom agentic app — all connect to the same MCP tool server. Change the orchestrator; tool flows stay the same.
2. Per-Task Model Selection
Each pipeline stage uses the best model for the job. Model choices are tool-internal and swappable, invisible to the orchestrator.
3. Deterministic Harness
LLMs extract and analyze. Deterministic code validates, scores, routes, and decides. The LLM never self-scores its own output.
4. Human-in-the-Loop Gates
Four explicit stop gates: Confidence Review, Carrier Selection, Missing Questions, and Completeness. The system never auto-proceeds past a gate.
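The deterministic-harness and gate principles can be sketched together. This is an illustrative sketch only (function and field names are hypothetical): plain code, not the LLM, scores per-field confidence and decides whether the pipeline may pass a gate.

```python
# Hypothetical sketch: deterministic routing at a human-in-the-loop gate.
# The harness, not the LLM, decides whether the pipeline may proceed --
# any low-confidence field stops at the gate for broker review.

CONFIDENCE_THRESHOLD = 0.85  # assumed cutoff for auto-proceed

def confidence_gate(fields: dict[str, float]) -> dict:
    """Return a gate decision from per-field confidence scores."""
    flagged = sorted(k for k, score in fields.items() if score < CONFIDENCE_THRESHOLD)
    return {
        "gate": "confidence_review",
        "proceed": not flagged,      # never auto-proceeds with open flags
        "flagged_fields": flagged,   # surfaced to the broker for review
    }

decision = confidence_gate({"naics_code": 0.97, "annual_revenue": 0.62})
# decision["proceed"] stays False until a human resolves "annual_revenue"
```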
Layered Architecture
| Layer | Role | Components |
|---|---|---|
| Orchestrator | Platform portable | Any MCP-capable orchestrator — Claude, ChatGPT, Gemini, or custom agentic app. All connect to the same MCP tool server. |
| Protocol | Standard interface | MCP (Model Context Protocol) — open standard. Tool definitions, argument schemas, and async task patterns are protocol-level. Any MCP client discovers and calls tools without custom integration. |
| Tool Layer | Single MCP server | Insurance Intelligence MCP Server — single server, multiple domains: extraction, enrichment, applications, market intelligence, and comparison. 30+ tools share one deployment. Optional CRM server (Salesforce, HubSpot, or any MCP-capable CRM). |
| Data & AI | Per-task models | Extraction LLM (Sonnet, Gemini Pro, GPT-4o — swappable behind MCP boundary), NLP / Critic / Analysis (cross-model adversarial review), Embeddings (local semantic search), Persistence (submissions, audit trail — Partner Engine is source of truth). |
Change the orchestrator — tool flows stay the same. Change the model per task — protocol stays the same. Change the UI — audit trail stays the same.
End-to-End Pipeline
The full pipeline runs through these stages, with red gates representing human-in-the-loop stop points where the system never auto-proceeds:
Submission Extract → Enrich → 🔴 Confidence Gate → Application → Market Intel → 🔴 Carrier Selection → Quoting → 🔴 Completeness → Policy Extract → Critic → Compare → Report
Agent Details
Full pipeline reference — agent roles, descriptions, MCP tools, products, and services.
| Step | Agent | Description | MCP Tools | Products | Services / APIs |
|---|---|---|---|---|---|
| 1 — Orchestration | orchestrator | Top-level coordinator. Delegates to all subagents in sequence, manages HITL gates, and drives the full insurance quoting pipeline. Never calls external APIs directly. | load_state | — | — |
| 2 — Planning | planner | Runs first on every new request. Analyzes the user's intent and produces a structured step-by-step execution plan. | — | — | — |
| 3 — Intake | intake_agent | Business data enrichment and document extraction. Given company name + address, fetches NAICS codes, financials, and contacts. Ingests S3 documents and polls for completion. Generates RiskProfile JSON. | enrichment_tool, extraction_tool | SubmissionLink | Universal Submit API, Submission Status Inquiry API, Data Inquiry API, Company Submit API, Location Submit API |
| 4 — Data Quality | critique_agent | Reviews enriched and extracted data and surfaces issues for user review before the pipeline proceeds. | company_profile_validation_tool | SubmissionLink, JackIQ | JackIQ API |
| 5 — Market Intel | market_intel_agent | Predicts which carriers are likely to bind using market intelligence API. Mode A: uses enriched_data from state. Mode B: uses standalone input. Outputs a markdown carrier likelihood table. | market_intelligence_tool, market_recommendations_tool | Market Intelligence | Market Intelligence API, Market Recommendation API |
| 6 — Create Application | mqs_agent | Creates the application and fills master (common) questions. Uses a 3-tier answer strategy: (1) find_answers from submission data, (2) intelligent guesses, (3) request_user_answers as last resort. Triggers request_carrier_confirmation gate. | create_application_tool, update_application_tool, find_incomplete_master_questions_tool, find_answers_tool, get_eligible_carriers_tool, find_business_type_tool, find_carrier_class_tool, get_application_summary_tool | PartnerEngine, Terminal | Partner Engine API |
| 7 — Get Quotes | cqs_agent | Fills carrier-specific questions for carriers confirmed by the user. Only processes questions scoped to confirmed carriers. Uses same 3-tier answer strategy as mqs_agent. | update_application_tool, find_incomplete_carrier_questions_tool, find_answers_tool, get_application_summary_tool | CarrierEngine, PartnerEngine, Terminal | Partner Engine API |
| 8 — Quote Summary | quote_summary_agent | Fetches live quote status from the application summary and produces a broker-friendly markdown quote table (Carrier / Status / Premium / Limits / Deductible). Also handles direct quote requests when app_state == ready_for_quoting. | get_application_tool, compare_quotes_tools, generate_quote_proposal_tool | ClauseLink | Clause Link API |
Use Case Flows
UC1 — Quote a New Risk
Legend: 🟤 Submission Extraction · 🔵 Enrichment (50+ sources) · 🟣 Triangulation (Trust & Consensus) · 🔷 Application · 🟦 Market Intelligence · 🔴 Stop Gates (HITL)
UC2 — Quote Comparison
Legend: 🔷 Policy Documents · 🟠 Policy Extraction (async) · 🟢 Comparison & Report · 🟤 UC1 Digital Quotes
Pipeline Deep Dives
Policy Document Extraction
The extraction pipeline runs through these stages in sequence:
- PDF Reader — pdfplumber + pypdfium2 — raw text + table structures
- Document Splitter — statistical page scoring + pinning (declarations, endorsements, SOF) — 100K char budget
- LLM Extraction — 30+ rule system prompt — model-per-task (Sonnet, Gemini Pro)
- Deterministic Validation — Pydantic schema + arithmetic + cross-reference scrubbing
- Critic Agent — different foundation model reviews adversarially for missed fields and hallucinations
- Deterministic Scoring — auto-fix (high confidence) / flag for review / escalate
- Output — PolicyExtraction JSON + validation + audit trail
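The deterministic validation stage above can be illustrated with a minimal arithmetic cross-check (field names and tolerance are assumptions, not the real schema): checks run in plain code, so a bad extraction fails loudly instead of being judged by the model that produced it.

```python
# Hypothetical sketch of one deterministic validation rule: line-item
# coverage premiums must sum to the stated total premium.

def validate_premium_arithmetic(extraction: dict, tolerance: float = 0.01) -> list[str]:
    """Return a list of arithmetic errors (empty list means the check passed)."""
    errors = []
    line_sum = sum(cov.get("premium", 0.0) for cov in extraction.get("coverages", []))
    total = extraction.get("total_premium", 0.0)
    if abs(line_sum - total) > tolerance:
        errors.append(
            f"total_premium {total:.2f} != sum of coverage premiums {line_sum:.2f}"
        )
    return errors

errs = validate_premium_arithmetic(
    {"total_premium": 5000.0,
     "coverages": [{"premium": 3000.0}, {"premium": 1500.0}]}
)
# errs contains one error: 5000.00 != 4500.00
```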
Smart Document Splitting
- 100K character budget with statistical page scoring — 200-page policies trimmed before the LLM sees them
- Declarations pinning — 3-tier header detection + continuation by density score
- Endorsement pinning — header + preamble detection, body continuation (5pg/endorsement cap, 30 total)
- Schedule of Forms always included as the authoritative manifest
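The budget logic above can be sketched as follows. The scoring heuristic and field names are invented for illustration: pinned sections (declarations, endorsements, SOF) are kept unconditionally, and remaining pages are added in descending score order until the character budget is spent.

```python
# Hypothetical sketch of the character-budget page selector.

CHAR_BUDGET = 100_000

def select_pages(pages: list[dict], budget: int = CHAR_BUDGET) -> list[int]:
    """pages: [{"num": int, "text": str, "score": float, "pinned": bool}]"""
    chosen, used = [], 0
    # 1. Pinned pages (declarations, endorsements, SOF) always ride along.
    for p in pages:
        if p["pinned"]:
            chosen.append(p["num"])
            used += len(p["text"])
    # 2. Fill what remains with the highest-scoring unpinned pages.
    for p in sorted(pages, key=lambda p: p["score"], reverse=True):
        if p["pinned"]:
            continue
        if used + len(p["text"]) <= budget:
            chosen.append(p["num"])
            used += len(p["text"])
    return sorted(chosen)
```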
Critic Agent (Cross-Model Adversarial)
- Never self-scores — a separate foundation model critiques the extraction
- Deterministic checks first — SOF completeness (form count vs extracted forms), cross-reference integrity
- Per-field confidence drives auto-fix vs. flag vs. escalate
- Source quote verification — substring match against original text chunks confirms evidence
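The source-quote check in the last bullet can be sketched as a whitespace-insensitive substring match (this helper is illustrative, not the production implementation): an extracted field's evidence quote must appear in the original text chunk it cites, or the field is flagged.

```python
import re

# Hypothetical sketch of the critic's source-quote verification.
def quote_supported(quote: str, source_chunk: str) -> bool:
    """Whitespace-insensitive, case-insensitive substring match of evidence."""
    norm = lambda s: re.sub(r"\s+", " ", s).strip().lower()
    return norm(quote) in norm(source_chunk)

chunk = "Each Occurrence Limit:   $1,000,000\nGeneral Aggregate: $2,000,000"
quote_supported("each occurrence limit: $1,000,000", chunk)   # supported
quote_supported("each occurrence limit: $2,000,000", chunk)   # not in source: flag
```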
Audit Trail
- Source page citations per extracted field
- Validation results — errors, warnings, hints by severity
- Critic feedback — what was found, auto-fixed, or flagged
Submission Processing Pipeline
- Document Extraction — structured data from ACORDs, carrier apps, loss runs, SOVs
- Company Enrichment — 50+ OOTB sources — NAICS, revenue, employees, legal entity
- Trust Scoring — proprietary per-source reliability scoring
- Consensus Detection — cross-source agreement algorithm — 3+ sources align = high confidence
- Confidence Gating (HITL) — low-confidence data surfaced to broker — never auto-submitted
- Ontology Mapping (Deterministic) — 153+ PE answer codes to canonical fields
- Question Set Matching — canonical fields to ApplicationForm question codes
- Answer Provenance Tagging — `answered_by_type` + `answered_by_source` on every answer
- Completeness Loop — find gaps, fill, surface unanswerable to broker (HITL)
- Output — complete application — quotes auto-fire to selected carriers
Trust and Consensus Algorithms
- Trust scoring — proprietary per-source reliability model assigns weights based on historical accuracy per field type
- Consensus detection — cross-source agreement: when extracted, enriched, and third-party values converge, confidence increases
- Triangulation — submission doc extraction + 50+ enrichment sources + third-party data compared and reconciled
- Per-field confidence output — every data point tagged with trust score + consensus level
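The consensus rule above can be sketched in code. The weights and the "3+ sources align" rule come from the text; the actual scoring model is proprietary, so this shows only the shape, not the real algorithm.

```python
from collections import defaultdict

# Hypothetical sketch of trust-weighted consensus across sources.
def consensus(values: list[tuple[str, str, float]]) -> dict:
    """values: (source_name, field_value, trust_weight) per source."""
    weight_by_value = defaultdict(float)
    count_by_value = defaultdict(int)
    for _source, value, trust in values:
        weight_by_value[value] += trust
        count_by_value[value] += 1
    best = max(weight_by_value, key=weight_by_value.get)
    return {
        "value": best,
        "trust_weight": weight_by_value[best],
        "high_confidence": count_by_value[best] >= 3,  # 3+ sources align
    }

consensus([("extraction", "238220", 0.9),
           ("enrichment_a", "238220", 0.7),
           ("enrichment_b", "238220", 0.8),
           ("enrichment_c", "238990", 0.6)])
# NAICS "238220" wins with three agreeing sources: high confidence
```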
Deterministic Ontology
- 153+ PE answer codes mapped to canonical SD Dictionary V0.86 field names
- Priority hierarchy — `ia_*_v3` > `mqs_*` > `bold_penguin_*` when multiple sources match
- Entity groups — locations, vehicles, drivers, owners, WC classes with composite key logic
- Mapping is a lookup table, not LLM inference
Answer Provenance
Every answer is traceable to its source:
| Provenance Type | Source |
|---|---|
| `submission_link_enriched` | Third-party data — from enrichment APIs |
| `submission_link_extracted` | Document upload — from uploaded documents |
| `submission_link_defaulted` | Defaulted answers — configured defaults |
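A minimal tagging sketch, assuming hypothetical field and source names: every answer written to the application carries its type and source, so any value can be traced back later.

```python
# Hypothetical sketch of provenance tagging on a single answer.
PROVENANCE_TYPES = {
    "submission_link_enriched",
    "submission_link_extracted",
    "submission_link_defaulted",
}

def tag_answer(code: str, value, by_type: str, by_source: str) -> dict:
    """Attach provenance metadata to an application answer."""
    assert by_type in PROVENANCE_TYPES, f"unknown provenance type: {by_type}"
    return {"question_code": code, "value": value,
            "answered_by_type": by_type, "answered_by_source": by_source}

tag_answer("annual_revenue", 4_200_000,
           "submission_link_enriched", "enrichment_api")
```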
MCP Tool Registry
| Tool | Domain | Category | Description |
|---|---|---|---|
| `extract_and_triangulate_submission_documents` | Extraction | Extraction | Extract from submission docs (ACORDs, apps, loss runs, SOVs) + triangulate against 50+ enrichment sources |
| `extract_policy_document` | Extraction | Extraction | Extract structured data from policy documents — expiring policies, offline quotes, binders (async) |
| `extract_quote_object` | Extraction | Extraction | Extract from carrier API JSON/XML quote responses (async) |
| `check_extraction_status` | Extraction | Extraction | Poll async extraction task until complete |
| `compare_extractions` | Comparison | Comparison | Normalize + compare list of PolicyExtraction dicts |
| `generate_comparison_pdf` | Comparison | Render | Side-by-side PDF comparison grid (base64) |
| `generate_comparison_html` | Comparison | Render | Self-contained HTML comparison report |
| `save_comparison_pdf` | Comparison | Render | Write base64 PDF to disk |
| `get_extraction_schema` | Extraction | Schema | Return PolicyExtraction + ComparisonSummary JSON schemas |
| `get_protocol` | Workflow | Protocol | Tool sequence, common pitfalls, workflow guidance |
| `enrich_company_data` | Enrichment | Enrichment | Smart company profile — NAICS, financials, descriptions |
| `submit_company_profile` | Enrichment | Enrichment | Submit company info for enrichment processing |
| `create_application` | Application | Application | Create application form + trigger quote submission |
| `update_application` | Application | Application | Update answers on application form |
| `find_incomplete_questions` | Application | Application | Find missing required questions in application |
| `find_answers` | Application | Application | Find answers for specific question codes |
| `get_application_summary` | Application | Application | Application summary + quote request status |
| `get_market_intelligence` | Market Intel | Market Intel | Carrier-specific MI predictions (product/NAICS/state) |
| `get_market_recommendation` | Market Intel | Market Intel | Carrier eligibility check for product/NAICS/state |
| `find_business_type` | Enrichment | Enrichment | Search for business type information |
| `find_carrier_class` | Enrichment | Enrichment | Search for carrier class codes |
| `search_similar_submission` | Search | Search | Search inventory (text or MongoDB query) |
| `validate_token` | Auth | Auth | Validate authentication token |
| `crm_upsert_opportunity` | CRM (Optional) | CRM | Upsert opportunity with recommendation metadata, report URL, carrier selection |
Context and State Architecture
The platform uses a three-tier state architecture to manage context efficiently.
Tier 1 — Conversation Context
- Tool chain = transaction log — each response carries IDs forward. `application_id`, `task_id`, and `order_id` are all durable and linked to `submission_reference_id` for full traceability
- Summaries only — conversation sees carrier + premium + error count, not 10K-token JSONs
- Cache-safe prefix — static system prompt + tool defs (~15K tokens) cached at up to 90% savings
- Per-UC system prompts — git-tracked `.md` files, never mutated mid-session
Tier 2 — Server-Side State
- Full artifacts in memory — extraction JSONs, critic results keyed by `task_id` behind MCP tools
- Heavy data never transits conversation — `compare_extractions` pulls from server state
- Ephemeral — working memory for the session, not a persistence layer
- Per-task model calls — separate cache contexts behind MCP boundary
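The Tier 1 / Tier 2 split can be sketched as follows (the store and function names are hypothetical): a tool keeps the full artifact server-side under its `task_id` and returns only a small summary to the conversation.

```python
# Hypothetical sketch: conversation carries references, tools hold artifacts.
SERVER_STATE: dict[str, dict] = {}  # Tier 2 -- ephemeral, in-memory

def finish_extraction(task_id: str, extraction: dict) -> dict:
    SERVER_STATE[task_id] = extraction        # full artifact stays server-side
    return {                                  # Tier 1 sees a reference + summary
        "task_id": task_id,
        "carrier": extraction.get("carrier"),
        "premium": extraction.get("total_premium"),
        "error_count": len(extraction.get("errors", [])),
    }

def compare_extractions(task_ids: list[str]) -> list[dict]:
    # Pulls full artifacts from server state, never from the conversation.
    return [SERVER_STATE[t] for t in task_ids]
```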
Tier 3 — Durable Audit Trail
- MongoDB collections — extractions, critic decisions, CoT logs, token usage per step
- Cross-transaction memory — previous extractions and comparisons queryable by insured name across sessions
- Async writes — fire-and-forget, never blocks the workflow
- `submission_reference_id` — cross-transaction durable key linking all activity for the same insured
- Token cost reporting — per-step token accounting by model, cost-per-UC, cost-per-extraction
Conversation carries references. Tools hold artifacts. MongoDB holds the audit trail.
System Prompt Architecture
Orchestrator-Level Prompts
- Per-UC workflow prompts defining tool sequences and stop gates
- Static + versioned — git-tracked `.md` files, never mutated mid-session
- Model-agnostic — same prompt drives Claude, ChatGPT, Gemini, or BP Agent
- `get_protocol()` returns tool sequence + common pitfalls as just-in-time reinforcement
Tool-Internal Prompts
- Per-task prompts — structured extraction rules, critic verification, NLP parsing, triangulation logic
- Hidden behind MCP — orchestrator never sees them, separate cache context
- Dynamic injection — text chunks + deterministic hints into user prompts
- Model-specific — each prompt tuned for its assigned model — swap without changing workflow
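The dynamic-injection bullet can be sketched as a prompt builder (template text is invented for illustration): the static, model-tuned system prompt stays cacheable, while text chunks and deterministic hints are injected into the per-call user prompt.

```python
# Hypothetical sketch of tool-internal dynamic prompt injection.
def build_user_prompt(chunks: list[str], hints: list[str]) -> str:
    """Combine selected text chunks with deterministic validation hints."""
    hint_block = "\n".join(f"- {h}" for h in hints)
    text_block = "\n\n".join(chunks)
    return (
        "Deterministic hints (from validation code, not the model):\n"
        f"{hint_block}\n\n"
        "Policy text:\n"
        f"{text_block}"
    )

prompt = build_user_prompt(
    ["DECLARATIONS ...", "ENDORSEMENT CG 20 10 ..."],
    ["SOF lists 12 forms; extract all 12"],
)
```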