    Agent Harness: the architecture that separates AI demos from real planning systems

    NPLAN Team
    2026
    11 min read

    From context analysis to operational execution

    Today any model can answer interesting questions about supply chains. You can ask an AI to analyze geopolitical risks that may affect your supply chain, identify possible logistics disruptions, suggest sourcing alternatives, or summarize the impact of tariffs and international conflicts on a specific industry.

    That is useful. And, in many cases, impressive. But there is a huge difference between analyzing context and executing real operational planning.

    An LLM can discuss stockout risk for a specific segment. It cannot, on its own, reliably recalculate a synchronized plan for demand, supply, inventory, capacity and production of a real multi-level operation. Much less run a corporate MRP respecting finite capacity, BOM explosion, lead times, operational constraints, inventory policies, shelf life, plant synchronization and the financial impact of the plan.

    You cannot run MRP in ChatGPT.

    This is not a chat problem. It is an architecture problem. That is exactly where the Agent Harness concept comes in.

    Definition

    Agent Harness is the layer that turns a probabilistic LLM into a reliable corporate system.

    From junior to senior

    Pure LLM

    Behaves like a junior supply chain analyst. Reasons well in the abstract, but confidently makes rookie mistakes when the question touches finite capacity, actual BOM structures, lead times or financial impact.

    LLM + Agent Harness

    Behaves like a senior planner. Guardrails block the inexperienced mistakes. Reliable, validated tools handle the heavy and critical math. The model interprets, explains and orchestrates, but does not invent numbers.

    Mistaking AI for a chat interface

    A large part of the market still treats AI as a sophisticated chatbot. You ask a question. The model answers. End of story.

    That works for personal productivity. It does not work for Supply Chain Planning.

    Industrial planning involves hard constraints, deterministic computations, complex business rules, measurable financial impact, multiple horizons and dependencies between modules. An AI agent cannot simply "invent" a production, inventory or capacity recommendation. It has to operate inside a reliable system.

    What an Agent Harness is

    The harness is the layer that surrounds the LLM. It is the set of structures, rules, tools, memory, validations and execution mechanisms that turn a probabilistic model into a reliable corporate operating system.

    The LLM alone does not solve the problem. It is one piece inside the flow. Everything around it, allowed tools, accessible data, business rules, validations, orchestration across agents, persistent memory and governance, is defined by the harness.

    The same model produces completely different results depending on the harness built around it.

    Built for purpose

    Supply Chain Planning Agent

    Trained with real data in a safe way, surrounded by guardrails, equipped with powerful and validated tools for heavy compute, and delivering enriched output.

    Real-data training: structural, obfuscated samples, never the raw customer database.
    Guardrails: input validation, compliance, rate limiting, output sanitization.
    Reliable tools: MRP / APS engine handles the critical math.
    Enriched output: interactive charts, planning grids, next steps.

    The harness architecture at NPLAN

    In the context of Supply Chain Planning, the harness is five layers working together. The diagram below shows how a user request flows down through validation, orchestration and execution, then back up as a validated agent response.

    01. Guardrails & Gateway: a corporate firewall around every prompt and every response.
        Input validation: PII / personal & private data
        Compliance filter: configurable rules
        Rate limiting: usage control
        Output sanitization: post-LLM check
        ↓ filtered context

    02. Orchestration: one orchestrator coordinates multiple domain-specialist agents.
        Orchestrator + error recovery: routes intent, sequences agents, recovers from failures
        Specialist agents: Demand, Inventory, Capacity, Supply, Finance, plus custom agents added per operation
        ↓ tool call → sampled data

    03. Tool & Engine Integration: structural, obfuscated samples travel to the LLM; the full database stays local; everything auditable.
        Supply chain engine: MRP / APS local compute
        Query sandbox: local execution, real data
        Rich output: HTML charts, interactive grids
        Next-step suggestions: conversation continuity
        ↓ enriched context

    04. Memory & Context: specialized context engineering for supply chain.
        Short-term: active conversation
        Mid-term: plan parameters
        Long-term: policies, history
        ↓ trace emitted

    05. Observability & Governance: every answer validated before display; feedback feeds learning.
        Query audit: anti-hallucination
        Decision logs: full traceability
        Response check: pre-display validation
        Feedback loop: becomes improvement
        ↓ continuous learning loop

    1. Guardrails & Gateway

    The first layer protects the organization before any prompt ever reaches the model.

    Input validation. Every request goes through a filter the company configures. The customer defines what cannot be sent to the LLM: tax IDs, personal identifiers, sensitive commercial data, intellectual property, anything covered by compliance. The filter is mostly deterministic, with explicit auditable rules, and can layer in lightweight semantic validation for ambiguous cases.

    Compliance filter. Business rules that go beyond the technical layer. A pharmaceutical company can block formula data. A multi-country operation can restrict cross-plant data flows. The system honors those boundaries automatically.

    Rate limiting. Usage control by user, team or module. Governs token consumption and prevents misuse.

    Output sanitization. Validation does not stop at the input. Before any response reaches the user, it goes through a sanitization layer that checks consistency, removes unwanted artifacts and applies the same compliance rules in the opposite direction.
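    The four mechanisms above can be sketched in a few dozen lines. This is a minimal, hypothetical illustration, not NPLAN's implementation: the blocked patterns, the sliding-window limiter and the `gateway` function are all assumptions chosen to show the shape of a deterministic, auditable guardrail layer that applies the same compliance rules in both directions.

    ```python
    import re
    import time
    from collections import defaultdict, deque

    # Hypothetical guardrail rules: each pattern is an explicit, auditable,
    # deterministic regex the customer configures (illustrative examples only).
    BLOCKED_PATTERNS = {
        "tax_id": re.compile(r"\b\d{2}\.\d{3}\.\d{3}/\d{4}-\d{2}\b"),
        "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    }

    class RateLimiter:
        """Allow at most `limit` requests per user within `window` seconds."""
        def __init__(self, limit=10, window=60.0):
            self.limit, self.window = limit, window
            self.calls = defaultdict(deque)

        def allow(self, user):
            now = time.monotonic()
            q = self.calls[user]
            while q and now - q[0] > self.window:
                q.popleft()          # drop calls that fell out of the window
            if len(q) >= self.limit:
                return False
            q.append(now)
            return True

    def sanitize(text):
        """Redact anything matching a blocked pattern before it crosses
        the boundary; used on input and, symmetrically, on output."""
        for name, pattern in BLOCKED_PATTERNS.items():
            text = pattern.sub(f"[REDACTED:{name}]", text)
        return text

    def gateway(user, prompt, call_llm, limiter):
        if not limiter.allow(user):
            return "Rate limit exceeded; try again later."
        response = call_llm(sanitize(prompt))   # input validation before the model
        return sanitize(response)               # output sanitization after it
    ```

    The key design choice is that the rules run before any token reaches the model and again on the way back, so the same configuration protects both directions.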

    2. Orchestration

    The orchestrator is the heart of the harness. It receives the user intent and decides which specialist agent to trigger, in which sequence, with which context.

    At NPLAN there is no single generic agent. There is a coordinated network of agents specialized by domain: a demand agent, an inventory agent, a capacity agent, a supply agent, a finance agent. Each one operates with context limited to its domain, specific tools, clear objectives and its own rules.

    The customer can configure the personality of each agent, adjust its behavior parameters or create fully custom agents for specific operational needs.

    The orchestrator also handles error recovery. When an agent returns an inconsistent result, the system does not propagate the error: it tries to correct it, escalates to human review or informs the user clearly.
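    The routing and recovery flow can be sketched as follows. Everything here is an assumption for illustration: the keyword-based `route_intent`, the agent registry, the retry count and the escalation payload are placeholders standing in for the real intent classifier and recovery policy.

    ```python
    # Hypothetical registry of domain-specialist agents.
    AGENTS = {}

    def register(domain):
        def wrap(fn):
            AGENTS[domain] = fn
            return fn
        return wrap

    @register("demand")
    def demand_agent(request, context):
        # Placeholder result; a real agent would call its own tools.
        return {"forecast_units": 1200}

    @register("inventory")
    def inventory_agent(request, context):
        return {"safety_stock": 300}

    def route_intent(request):
        """Naive keyword routing; a real harness classifies intent."""
        for domain in AGENTS:
            if domain in request.lower():
                return domain
        raise ValueError(f"no agent for request: {request!r}")

    def orchestrate(request, context, max_retries=2, validate=lambda r: bool(r)):
        domain = route_intent(request)
        for attempt in range(max_retries + 1):
            result = AGENTS[domain](request, context)
            if validate(result):      # inconsistent results are never propagated
                return result
        # last resort: escalate instead of passing a bad answer through
        return {"escalate": "human_review", "domain": domain}
    ```

    The point of the sketch is the control flow: validation sits between the agent and the caller, and failure ends in explicit escalation, never in a silently wrong answer.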

    3. Tool & Engine integration

    This layer solves a problem most AI solutions ignore: how to combine probabilistic reasoning with deterministic computation that has to be right the first time.

    LLMs should not execute planning.

    They should interpret, explain, orchestrate and operate on top of deterministic engines.

    LLM: interprets, explains, orchestrates.

    SCP Engine: computes, validates, synchronizes, optimizes.

    The full database never leaves the environment. In most interactions the agent receives only structural, obfuscated samples of the tables: schemas, representative examples and distributions, enough for the model to reason. When an answer requires a specific slice, part of that slice may travel to the LLM, always within the customer compliance boundaries. Even then, the full database stays local and every call is logged and auditable. It is a radically safer model than sending the whole database to the AI.

    The supply chain engine. MRP, BOM explosion, finite capacity and inventory optimization run in a specialized math engine, deterministic and reproducible. The LLM does not compute: it triggers the engine and interprets the result.

    Rich output. Agent answers are not just text. The harness turns results into HTML charts configured dynamically per context, planning grid widgets for structured data, and analysis-specific formats.

    Next-step suggestions. At the end of every interaction the system proactively suggests the next relevant steps, keeping the conversation productive and contextualized.
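    The split between what the model sees and what the engine computes can be made concrete with a toy sketch. Both functions are hypothetical illustrations: `structural_sample` shows the shape of an obfuscated sample (schema, a few masked rows), and `gross_to_net` stands in for the deterministic engine with a single-level MRP netting step.

    ```python
    import hashlib
    import random

    def structural_sample(rows, k=3, sensitive=("customer",)):
        """What travels to the LLM: the schema and a few representative
        rows, with sensitive fields replaced by stable hashes."""
        sample = random.sample(rows, min(k, len(rows)))
        def mask(key, value):
            if key in sensitive:
                return hashlib.sha256(str(value).encode()).hexdigest()[:8]
            return value
        return {
            "schema": sorted(rows[0].keys()),
            "rows": [{col: mask(col, v) for col, v in r.items()} for r in sample],
        }

    def gross_to_net(gross_requirements, on_hand):
        """What stays local: a deterministic, reproducible computation.
        Toy single-level netting of gross requirements against stock."""
        net = []
        for demand in gross_requirements:
            covered = min(on_hand, demand)
            on_hand -= covered
            net.append(demand - covered)
        return net
    ```

    The LLM reasons over the output of `structural_sample` and triggers `gross_to_net`; at no point does it receive the full table or produce the numbers itself.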

    4. Memory & Context

    LLMs have no memory across calls. The harness has to provide it intelligently.

    Memory operates on three horizons: short term (the active conversation), mid term (parameters of the active planning scenario, horizon, policies, shop floor constraints, operational calendar, frozen periods) and long term (consolidated inventory policies, historical behavior, configured preferences).

    The agent does not just receive conversation history. It receives live operational context: active scenario, frozen periods, capacity constraints, inventory policies, planning horizon, financial targets and operational priorities.

    Specialized context engineering is what separates a generic agent from a supply chain agent. Sending a few lines in a prompt is not enough. The harness injects scenario parameters, strategic priorities, service targets, per-plant rules and financial indicators. That changes the quality of the decisions completely.
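    The three-horizon assembly described above might be sketched like this. The field names and the crude four-characters-per-token budget heuristic are assumptions, not a real NPLAN schema; the sketch only shows the priority order in which horizons are merged and trimmed.

    ```python
    def build_context(short_term, mid_term, long_term, token_budget=2000):
        """Merge the three memory horizons into one prompt context.
        When the budget is exceeded, drop the oldest conversation turns
        first: scenario parameters and policies outrank old chat."""
        context = {
            "policies": long_term.get("inventory_policies", {}),   # long-term
            "scenario": mid_term.get("active_scenario"),           # mid-term
            "frozen_periods": mid_term.get("frozen_periods", []),
            "history": list(short_term),                           # short-term
        }
        # crude token estimate: roughly 4 characters per token
        while len(str(context)) // 4 > token_budget and context["history"]:
            context["history"].pop(0)
        return context
    ```

    The design choice worth noting is the eviction order: operational context (scenario, frozen periods, policies) is never dropped, because losing it degrades decisions far more than losing an old conversational turn.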

    5. Observability & Governance

    The more autonomy the agents gain, the more critical governance becomes.

    Response validation. Every answer is validated before it reaches the user. The system checks internal consistency, verifies that the numbers make sense within plan constraints and applies deterministic verification loops. The agent does not just answer, it self-corrects continuously.

    Query audit. Any reasoning that produced an answer can be inspected. If a hallucination is suspected, the analyst can audit the exact query that ran, the data used and the logic applied. Real traceability, not just a chat log.

    Decision logs. Everything sent to the LLM is recorded. Decision history, versioning, full explainability.

    Learning feedback loop. Every answer prompts the user to rate it as positive or negative. When negative, the feedback does not vanish into a forgotten dashboard, it feeds the harness improvement process. The system learns from mistakes in a structured way.
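    The governance mechanics can be sketched with two small pieces: a deterministic pre-display check and an append-only decision log that also carries user feedback. The constraint names and record fields are hypothetical placeholders for whatever a real deployment would configure.

    ```python
    import json
    import time

    def validate_response(answer, plan_constraints):
        """Deterministic checks run before an answer reaches the user;
        returns a list of issues (empty means the answer may be shown)."""
        issues = []
        cap = plan_constraints.get("max_order", float("inf"))
        for item, qty in answer.get("planned_orders", {}).items():
            if qty < 0:
                issues.append(f"{item}: negative quantity")
            if qty > cap:
                issues.append(f"{item}: exceeds capacity {cap}")
        return issues

    DECISION_LOG = []

    def log_decision(query, data_used, answer, feedback=None):
        """Append-only record: the exact query, the data used, the answer
        and the user's rating, so any response can be audited later."""
        DECISION_LOG.append(json.dumps({
            "ts": time.time(),
            "query": query,
            "data_used": data_used,
            "answer": answer,
            "feedback": feedback,   # "positive" / "negative" feeds improvement
        }))
    ```

    Because every record is serialized and appended rather than overwritten, a suspected hallucination can be traced back to the precise query and data that produced it.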

    Why generic workflows do not solve it

    Horizontal platforms are commoditizing the basics fast: OAuth connectors, SaaS integrations, generic skills, agents that run administrative tasks.

    But Supply Chain Planning is not a generic problem.

    A pre-built workflow does not understand finite capacity, segment-based inventory policy, multi-level planning, BOM explosion, shop floor constraints, shelf life, variable lead time, plant synchronization or the financial impact of the plan.

    It is impossible to solve this only with prompts.

    The future is harness native

    Many people are still debating which model is best. That debate is losing relevance quickly, because models are converging.

    The real competitive edge is shifting to architecture: harness, context, governance, orchestration, integration with real math engines and vertical specialization.

    That is exactly what separates an AI demo from a corporate operating system based on AI.

    The future of planning will not be a chatbot screen replacing planners. It will be an intelligent layer working alongside specialized math engines, where planners still exist, but with much faster simulation, contextualized recommendations, automated exception analysis and domain-specific copilots.

    AI stops being an interface. It becomes operational decision infrastructure.

    Continue the series

    AI foundations at NPLAN

    Understand the technical foundation behind this architecture: AI Agents, Supply Chain Engine and Supply Chain Data working together.

    Read the next article