
Platform Architecture

A layered, platform-portable architecture. Any MCP-capable orchestrator connects to the same tool server — same tools, same workflows, different model or UI.

Design Principles

1. Platform Portable

Claude, ChatGPT, Gemini, or custom agentic app — all connect to the same MCP tool server. Change the orchestrator; tool flows stay the same.

2. Per-Task Model Selection

Each pipeline stage uses the best model for the job. Model choices are tool-internal and swappable, invisible to the orchestrator.

3. Deterministic Harness

LLMs extract and analyze. Deterministic code validates, scores, routes, and decides. The LLM never self-scores its own output.

4. Human-in-the-Loop Gates

Four explicit stop gates: Confidence Review, Carrier Selection, Missing Questions, and Completeness. The system never auto-proceeds past a gate.

Layered Architecture

| Layer | Role | Components |
| --- | --- | --- |
| Orchestrator | Platform portable | Any MCP-capable orchestrator — Claude, ChatGPT, Gemini, or custom agentic app. All connect to the same MCP tool server. |
| Protocol | Standard interface | MCP (Model Context Protocol) — open standard. Tool definitions, argument schemas, and async task patterns are protocol-level. Any MCP client discovers and calls tools without custom integration. |
| Tool Layer | Single MCP server | Insurance Intelligence MCP Server — single server, multiple domains: extraction, enrichment, applications, market intelligence, and comparison. 30+ tools share one deployment. Optional CRM server (Salesforce, HubSpot, or any MCP-capable CRM). |
| Data & AI | Per-task models | Extraction LLM (Sonnet, Gemini Pro, GPT-4o — swappable behind the MCP boundary), NLP / Critic / Analysis (cross-model adversarial review), Embeddings (local semantic search), Persistence (submissions, audit trail — Partner Engine is source of truth). |

Key Insight
Key Insight

Change the orchestrator — tool flows stay the same. Change the model per task — protocol stays the same. Change the UI — audit trail stays the same.
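The portability claim rests on MCP's discovery model: each tool is described by a name plus a JSON-schema argument spec, so any client can enumerate and call tools generically. A minimal Python sketch of that pattern (not the real MCP SDK; the registry entry, function names, and return shapes are illustrative):

```python
# Illustrative sketch of MCP-style tool discovery and dispatch.
# Tools are data: a name, a description, and a JSON-schema for arguments.
TOOL_REGISTRY = {
    "enrich_company_data": {
        "description": "Smart company profile — NAICS, financials, descriptions",
        "input_schema": {
            "type": "object",
            "properties": {
                "company_name": {"type": "string"},
                "address": {"type": "string"},
            },
            "required": ["company_name", "address"],
        },
    },
}

def list_tools():
    """What any MCP client sees at discovery time: names + schemas only."""
    return [{"name": name, **meta} for name, meta in TOOL_REGISTRY.items()]

def call_tool(name, arguments):
    """Generic dispatch: the caller never sees which model runs inside."""
    schema = TOOL_REGISTRY[name]["input_schema"]
    missing = [k for k in schema["required"] if k not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    # ...tool-internal model calls would happen here, invisible to the caller...
    return {"tool": name, "status": "ok"}
```

Because the registry is plain data, swapping the orchestrator changes nothing on this side of the boundary.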

End-to-End Pipeline

The full pipeline runs through these stages; red gates (🔴) mark human-in-the-loop stop points past which the system never auto-proceeds:

Submission Extract → Enrich → 🔴 Confidence Gate → Application → Market Intel → 🔴 Carrier Selection → Quoting → 🔴 Completeness → Policy Extract → Critic → Compare → Report

Agent Details

Full pipeline reference — agent roles, descriptions, MCP tools, products, and services.

| Step | Agent | Description | MCP Tools | Products | Services / APIs |
| --- | --- | --- | --- | --- | --- |
| 1 — Orchestration | orchestrator | Top-level coordinator. Delegates to all subagents in sequence, manages HITL gates, and drives the full insurance quoting pipeline. Never calls external APIs directly. | load_state | | |
| 2 — Planning | planner | Runs first on every new request. Analyzes the user's intent and produces a structured step-by-step execution plan. | | | |
| 3 — Intake | intake_agent | Business data enrichment and document extraction. Given company name + address, fetches NAICS codes, financials, and contacts. Ingests S3 documents and polls for completion. Generates RiskProfile JSON. | enrichment_tool, extraction_tool | SubmissionLink | Universal Submit API, Submission Status Inquiry API, Data Inquiry API, Company Submit API, Location Submit API |
| 4 — Data Quality | critique_agent | Reviews enriched and extracted data and surfaces issues for user review before the pipeline proceeds. | company_profile_validation_tool | SubmissionLink, JackIQ | JackIQ API |
| 5 — Market Intel | market_intel_agent | Predicts which carriers are likely to bind using the market intelligence API. Mode A: uses enriched_data from state. Mode B: uses standalone input. Outputs a markdown carrier-likelihood table. | market_intelligence_tool, market_recommendations_tool | Market Intelligence | Market Intelligence API, Market Recommendation API |
| 6 — Create Application | mqs_agent | Creates the application and fills master (common) questions. Uses a 3-tier answer strategy: (1) find_answers from submission data, (2) intelligent guesses, (3) request_user_answers as a last resort. Triggers the request_carrier_confirmation gate. | create_application_tool, update_application_tool, find_incomplete_master_questions_tool, find_answers_tool, get_eligible_carriers_tool, find_business_type_tool, find_carrier_class_tool, get_application_summary_tool | PartnerEngine, Terminal | Partner Engine API |
| 7 — Get Quotes | cqs_agent | Fills carrier-specific questions for carriers confirmed by the user. Only processes questions scoped to confirmed carriers. Uses the same 3-tier answer strategy as mqs_agent. | update_application_tool, find_incomplete_carrier_questions_tool, find_answers_tool, get_application_summary_tool | CarrierEngine, PartnerEngine, Terminal | Partner Engine API |
| 8 — Quote Summary | quote_summary_agent | Fetches live quote status from the application summary and produces a broker-friendly markdown quote table (Carrier / Status / Premium / Limits / Deductible). Also handles direct quote requests when app_state == ready_for_quoting. | get_application_tool, compare_quotes_tools, generate_quote_proposal_tool | ClauseLink | Clause Link API |
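The 3-tier answer strategy shared by mqs_agent and cqs_agent reduces to a fallback chain: answer from submission data, else guess, else surface to the broker. A minimal sketch under assumed names (answer_question, guesser), not the agents' actual code:

```python
def answer_question(question_code, submission_data, guesser):
    """Illustrative 3-tier fallback for filling one application question.

    Tier 1: find the answer in submission data.
    Tier 2: let a model make an intelligent guess.
    Tier 3: surface the question to the broker (HITL) as a last resort.
    """
    if question_code in submission_data:                       # tier 1
        return {"answer": submission_data[question_code], "tier": "found"}
    guess = guesser(question_code)                             # tier 2
    if guess is not None:
        return {"answer": guess, "tier": "guessed"}
    return {"answer": None, "tier": "ask_user"}                # tier 3
```

The MISSING QUESTIONS gate in UC1 is the accumulation of every question that falls through to tier 3.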

Use Case Flows

UC1 — Quote a New Risk

```mermaid
flowchart TD
  IN(["Company Name + Address, plus or minus Submission Docs"]):::input
  EXT["extract_and_triangulate_submission_documents (ACORDs, apps, loss runs, SOVs)"]:::extract
  ENR["enrich_company_data (50+ OOTB sources)"]:::enrich
  TRI["Triangulate and Score (Trust, Consensus)"]:::triangulate
  GC{{"CONFIDENCE REVIEW: Broker confirms low-confidence data points"}}:::gate
  APP["create_application"]:::app
  MI["get_market_intelligence"]:::mi
  G1{{"CARRIER SELECTION: User picks carriers"}}:::gate
  FIQ["find_incomplete_questions, find_answers, update_application"]:::app
  GMQ{{"MISSING QUESTIONS: Broker answers what agent cannot"}}:::gate
  GAS["get_application_summary (poll)"]:::app
  G2{{"COMPLETENESS: All Quotes Resolved?"}}:::gate
  OUT(["Digital Quote Objects"]):::output
  IN -- "with docs" --> EXT
  IN -- "no docs" --> ENR
  EXT --> TRI
  ENR --> TRI
  TRI --> GC --> APP --> MI --> G1
  G1 --> FIQ --> GMQ --> GAS --> G2
  G2 --> OUT
  FIQ -. "loop until complete" .-> FIQ
  GMQ -. "unanswerable? surface to broker" .-> GMQ
  GAS -. "poll until resolved" .-> GAS
  classDef input fill:#92400e,color:#fef3c7,stroke:#b45309,stroke-width:2px
  classDef enrich fill:#0284c7,color:#fff,stroke:#0369a1,stroke-width:1px
  classDef app fill:#1d4ed8,color:#fff,stroke:#1e40af,stroke-width:1px
  classDef mi fill:#0891b2,color:#fff,stroke:#0e7490,stroke-width:1px
  classDef gate fill:#dc2626,color:#fff,stroke:#b91c1c,stroke-width:2px
  classDef extract fill:#d97706,color:#fff,stroke:#b45309,stroke-width:1px
  classDef output fill:#065f46,color:#d1fae5,stroke:#047857,stroke-width:2px
  classDef triangulate fill:#7c3aed,color:#fff,stroke:#6d28d9,stroke-width:2px
```

Legend: 🟤 Submission Extraction · 🔵 Enrichment (50+ sources) · 🟣 Triangulation (Trust & Consensus) · 🔷 Application · 🟦 Market Intelligence · 🔴 Stop Gates (HITL)

UC2 — Quote Comparison

```mermaid
flowchart TD
  DOCS(["Policy Documents\n(expiring policies, offline quotes,\nbinders, dec pages)"]):::input
  UC1(["UC1 Digital\nQuote Objects"]):::uc1
  EXT["extract_policy_document\n(async, per doc)"]:::extract
  POLL["check_extraction_status\n(poll)"]:::extract
  PE["PolicyExtraction\nJSON(s)"]:::data
  CMP["compare_extractions"]:::compare
  RND["Quote Recommendation\nReport"]:::compare
  OUT(["Recommendation\nReport"]):::output
  DOCS --> EXT --> POLL --> PE
  DOCS -. "insured info seeds\napplication for digital quotes" .-> UC1
  UC1 -. "digital quotes flow\nfrom UC1 quoting" .-> PE
  PE --> CMP --> RND --> OUT
  classDef input fill:#1e3a5f,color:#93c5fd,stroke:#1d4ed8,stroke-width:2px
  classDef extract fill:#d97706,color:#fff,stroke:#b45309,stroke-width:1px
  classDef data fill:#f3f4f6,color:#374151,stroke:#d1d5db,stroke-width:1px
  classDef compare fill:#059669,color:#fff,stroke:#047857,stroke-width:1px
  classDef output fill:#065f46,color:#d1fae5,stroke:#047857,stroke-width:2px
  classDef uc1 fill:#92400e,color:#fef3c7,stroke:#b45309,stroke-width:2px
```

Legend: 🔷 Policy Documents · 🟠 Policy Extraction (async) · 🟢 Comparison & Report · 🟤 UC1 Digital Quotes


Pipeline Deep Dives

Policy Document Extraction

The extraction pipeline runs through these stages in sequence:

  1. PDF Reader — pdfplumber + pypdfium2 — raw text + table structures
  2. Document Splitter — statistical page scoring + pinning (declarations, endorsements, SOF) — 100K char budget
  3. LLM Extraction — 30+ rule system prompt — model-per-task (Sonnet, Gemini Pro)
  4. Deterministic Validation — Pydantic schema + arithmetic + cross-reference scrubbing
  5. Critic Agent — different foundation model reviews adversarially for missed fields and hallucinations
  6. Deterministic Scoring — auto-fix (high confidence) / flag for review / escalate
  7. Output — PolicyExtraction JSON + validation + audit trail
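Step 6's routing is pure deterministic code: per-field confidence decides what happens to each critic finding. A minimal sketch with illustrative thresholds (the actual cutoffs are not stated in this document):

```python
def route_finding(confidence, auto_fix_threshold=0.9, review_threshold=0.6):
    """Deterministically route a critic finding by per-field confidence.

    Thresholds are illustrative placeholders, not the product's values.
    """
    if confidence >= auto_fix_threshold:
        return "auto_fix"          # high confidence: apply the fix silently
    if confidence >= review_threshold:
        return "flag_for_review"   # medium: surface in the audit trail
    return "escalate"              # low: requires human attention
```

Keeping this decision out of the LLM is what the "never self-scores" principle means in practice.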

Smart Document Splitting

  • 100K character budget with statistical page scoring — 200-page policies trimmed before LLM sees them
  • Declarations pinning — 3-tier header detection + continuation by density score
  • Endorsement pinning — header + preamble detection, body continuation (capped at 5 pages per endorsement, 30 endorsements total)
  • Schedule of Forms always included as the authoritative manifest
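The budgeted selection above amounts to: always keep pinned pages (declarations, endorsements, Schedule of Forms), then add remaining pages by descending score until the character budget is spent. A minimal sketch, assuming hypothetical (index, score, text) page tuples rather than the actual scoring model:

```python
def select_pages(pages, pinned, budget=100_000):
    """Illustrative budgeted page selection.

    pages  : list of (index, score, text) tuples
    pinned : set of page indices that must survive (declarations, SOF, ...)
    budget : character budget for everything the LLM will see
    """
    keep = {i for i, _, _ in pages if i in pinned}
    used = sum(len(text) for i, _, text in pages if i in keep)
    # Greedily add the highest-scoring unpinned pages that still fit.
    for i, score, text in sorted(pages, key=lambda p: -p[1]):
        if i in keep:
            continue
        if used + len(text) <= budget:
            keep.add(i)
            used += len(text)
    return sorted(keep)
```

On a 200-page policy this is what "trimmed before the LLM sees it" means: the model receives a curated subset, never the raw document.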

Critic Agent (Cross-Model Adversarial)

  • Never self-scores — a separate foundation model critiques the extraction
  • Deterministic checks first — SOF completeness (form count vs extracted forms), cross-reference integrity
  • Per-field confidence drives auto-fix vs. flag vs. escalate
  • Source quote verification — substring match against original text chunks confirms evidence
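Source-quote verification is a plain substring check after whitespace normalization: cited evidence must appear verbatim in an original chunk or the field is flagged as unsupported. A minimal sketch (function name and normalization choices are assumptions):

```python
import re

def verify_source_quote(quote, chunks):
    """Confirm an extracted field's cited evidence exists in the source.

    Whitespace is collapsed and case folded so layout differences don't
    cause false negatives; the match itself is an exact substring test.
    """
    norm = lambda s: re.sub(r"\s+", " ", s).strip().lower()
    needle = norm(quote)
    return any(needle in norm(chunk) for chunk in chunks)
```

A quote the model invented (a hallucination) fails this check no matter how plausible it reads, which is why the check is deterministic rather than model-judged.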

Audit Trail

  • Source page citations per extracted field
  • Validation results — errors, warnings, hints by severity
  • Critic feedback — what was found, auto-fixed, or flagged

Submission Processing Pipeline

  1. Document Extraction — structured data from ACORDs, carrier apps, loss runs, SOVs
  2. Company Enrichment — 50+ OOTB sources — NAICS, revenue, employees, legal entity
  3. Trust Scoring — proprietary per-source reliability scoring
  4. Consensus Detection — cross-source agreement algorithm — 3+ sources align = high confidence
  5. Confidence Gating (HITL) — low-confidence data surfaced to broker — never auto-submitted
  6. Ontology Mapping (Deterministic) — 153+ PE answer codes to canonical fields
  7. Question Set Matching — canonical fields to ApplicationForm question codes
  8. Answer Provenance Tagging — answered_by_type + answered_by_source on every answer
  9. Completeness Loop — find gaps, fill, surface unanswerable to broker (HITL)
  10. Output — complete application — quotes auto-fire to selected carriers

Trust and Consensus Algorithms

  • Trust scoring — proprietary per-source reliability model assigns weights based on historical accuracy per field type
  • Consensus detection — cross-source agreement: when extracted, enriched, and third-party values converge, confidence increases
  • Triangulation — submission doc extraction + 50+ enrichment sources + third-party data compared and reconciled
  • Per-field confidence output — every data point tagged with trust score + consensus level
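The exact trust model is proprietary, but the general shape of weighted cross-source consensus can be sketched: each source votes for its value with its trust weight, the heaviest value wins, and 3+ agreeing sources yields high confidence (per the rule above). Weights and names here are illustrative only:

```python
from collections import defaultdict

def consensus(values, trust):
    """Illustrative cross-source consensus.

    values : {source_name: extracted_value}
    trust  : {source_name: reliability weight}  (placeholder weights)
    """
    support = defaultdict(lambda: {"weight": 0.0, "sources": []})
    for source, value in values.items():
        support[value]["weight"] += trust.get(source, 0.5)
        support[value]["sources"].append(source)
    # The value backed by the most accumulated trust wins.
    value, info = max(support.items(), key=lambda kv: kv[1]["weight"])
    level = "high" if len(info["sources"]) >= 3 else "low"
    return {"value": value, "confidence": level, "sources": info["sources"]}
```

Anything that comes out "low" here is exactly what the Confidence Gate surfaces to the broker.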

Deterministic Ontology

  • 153+ PE answer codes mapped to canonical SD Dictionary V0.86 field names
  • Priority hierarchy — ia_*_v3 > mqs_* > bold_penguin_* when multiple sources match
  • Entity groups — locations, vehicles, drivers, owners, WC classes with composite key logic
  • Mapping is a lookup table, not LLM inference
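Because mapping is a lookup, the priority hierarchy reduces to ordered pattern matching over candidate answer codes. An illustrative sketch using the three patterns named above (glob syntax is an assumption):

```python
import fnmatch

# Patterns from the priority hierarchy, highest priority first.
PRIORITY = ["ia_*_v3", "mqs_*", "bold_penguin_*"]

def pick_answer_code(candidates):
    """Pick the winning answer code when multiple sources map to the
    same canonical field. Pure lookup — no model inference involved."""
    for pattern in PRIORITY:
        for code in candidates:
            if fnmatch.fnmatch(code, pattern):
                return code
    return None
```

The same deterministic selection applies within entity groups (locations, vehicles, drivers) once their composite keys are resolved.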

Answer Provenance

Every answer is traceable to its source:

| Provenance Type | Source |
| --- | --- |
| submission_link_enriched | Third-party data — from enrichment APIs |
| submission_link_extracted | Document upload — from uploaded documents |
| submission_link_defaulted | Defaulted answers — configured defaults |
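A tagged answer might look like the following. The helper name and dict shape are assumptions; only the answered_by_* field names and the three provenance types come from the table above:

```python
# Provenance types from the table above.
VALID_TYPES = {
    "submission_link_enriched",
    "submission_link_extracted",
    "submission_link_defaulted",
}

def tag_answer(question_code, value, provenance_type, source):
    """Attach provenance to an application answer (illustrative helper)."""
    if provenance_type not in VALID_TYPES:
        raise ValueError(f"unknown provenance type: {provenance_type}")
    return {
        "question_code": question_code,
        "value": value,
        "answered_by_type": provenance_type,   # how the answer was obtained
        "answered_by_source": source,          # which system produced it
    }
```

Tagging at write time is what makes every answer auditable later without re-running the pipeline.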

MCP Tool Registry

| Tool | Domain | Category | Description |
| --- | --- | --- | --- |
| extract_and_triangulate_submission_documents | Extraction | Extraction | Extract from submission docs (ACORDs, apps, loss runs, SOVs) + triangulate against 50+ enrichment sources |
| extract_policy_document | Extraction | Extraction | Extract structured data from policy documents — expiring policies, offline quotes, binders (async) |
| extract_quote_object | Extraction | Extraction | Extract from carrier API JSON/XML quote responses (async) |
| check_extraction_status | Extraction | Extraction | Poll an async extraction task until complete |
| compare_extractions | Comparison | Comparison | Normalize + compare a list of PolicyExtraction dicts |
| generate_comparison_pdf | Comparison | Render | Side-by-side PDF comparison grid (base64) |
| generate_comparison_html | Comparison | Render | Self-contained HTML comparison report |
| save_comparison_pdf | Comparison | Render | Write a base64 PDF to disk |
| get_extraction_schema | Extraction | Schema | Return PolicyExtraction + ComparisonSummary JSON schemas |
| get_protocol | Workflow | Protocol | Tool sequence, common pitfalls, workflow guidance |
| enrich_company_data | Enrichment | Enrichment | Smart company profile — NAICS, financials, descriptions |
| submit_company_profile | Enrichment | Enrichment | Submit company info for enrichment processing |
| create_application | Application | Application | Create an application form + trigger quote submission |
| update_application | Application | Application | Update answers on an application form |
| find_incomplete_questions | Application | Application | Find missing required questions in an application |
| find_answers | Application | Application | Find answers for specific question codes |
| get_application_summary | Application | Application | Application summary + quote request status |
| get_market_intelligence | Market Intel | Market Intel | Carrier-specific MI predictions (product/NAICS/state) |
| get_market_recommendation | Market Intel | Market Intel | Carrier eligibility check for product/NAICS/state |
| find_business_type | Enrichment | Enrichment | Search for business type information |
| find_carrier_class | Enrichment | Enrichment | Search for carrier class codes |
| search_similar_submission | Search | Search | Search inventory (text or MongoDB query) |
| validate_token | Auth | Auth | Validate an authentication token |
| crm_upsert_opportunity | CRM (Optional) | CRM | Upsert an opportunity with recommendation metadata, report URL, carrier selection |

Context and State Architecture

The platform uses a three-tier state architecture to manage context efficiently.

Tier 1 — Conversation Context

  • Tool chain = transaction log — each response carries IDs forward. application_id, task_id, and order_id are all durable and linked to submission_reference_id for full traceability
  • Summaries only — conversation sees carrier + premium + error count, not 10K-token JSONs
  • Cache-safe prefix — static system prompt + tool defs (~15K tokens) cached at up to 90% savings
  • Per-UC system prompts — git-tracked .md files, never mutated mid-session

Tier 2 — Server-Side State

  • Full artifacts in memory — extraction JSONs, critic results keyed by task_id behind MCP tools
  • Heavy data never transits the conversation — compare_extractions pulls from server state
  • Ephemeral — working memory for the session, not a persistence layer
  • Per-task model calls — separate cache contexts behind MCP boundary

Tier 3 — Durable Audit Trail

  • MongoDB collections — extractions, critic decisions, CoT logs, token usage per step
  • Cross-transaction memory — previous extractions and comparisons queryable by insured name across sessions
  • Async writes — fire-and-forget, never blocks the workflow
  • submission_reference_id — cross-transaction durable key linking all activity for same insured
  • Token cost reporting — per-step token accounting by model, cost-per-UC, cost-per-extraction
Note: Conversation carries references. Tools hold artifacts. MongoDB holds the audit trail.
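The reference-passing pattern behind Tiers 1 and 2 can be sketched as a store that returns only a small handle while retaining the full artifact server-side. Class and field names here are illustrative, not the platform's actual state API:

```python
class ServerState:
    """Illustrative Tier-2 store: full artifacts stay in memory behind
    the MCP boundary; only a task_id plus a short summary transits the
    conversation (Tier 1)."""

    def __init__(self):
        self._artifacts = {}   # full extraction JSONs, keyed by task_id
        self._counter = 0

    def store(self, artifact):
        task_id = f"task-{self._counter}"
        self._counter += 1
        self._artifacts[task_id] = artifact
        # The conversation only ever sees this small reference.
        return {"task_id": task_id, "summary": {"fields": len(artifact)}}

    def fetch(self, task_id):
        # Tools like compare_extractions resolve references server-side.
        return self._artifacts[task_id]
```

A durable write to the Tier-3 audit trail would happen asynchronously alongside store(), never on the conversation's critical path.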

System Prompt Architecture

Orchestrator-Level Prompts

  • Per-UC workflow prompts defining tool sequences and stop gates
  • Static + versioned — git-tracked .md files, never mutated mid-session
  • Model-agnostic — same prompt drives Claude, ChatGPT, Gemini, or BP Agent
  • get_protocol() returns tool sequence + common pitfalls as just-in-time reinforcement

Tool-Internal Prompts

  • Per-task prompts — structured extraction rules, critic verification, NLP parsing, triangulation logic
  • Hidden behind MCP — orchestrator never sees them, separate cache context
  • Dynamic injection — text chunks + deterministic hints into user prompts
  • Model-specific — each prompt tuned for its assigned model — swap without changing workflow