From context analysis to operational execution
Today any model can answer interesting questions about supply chains. You can ask an AI to analyze geopolitical risks that may affect your supply chain, identify possible logistics disruptions, suggest sourcing alternatives, or summarize the impact of tariffs and international conflicts on a specific industry.
That is useful. And, in many cases, impressive. But there is a huge difference between analyzing context and executing real operational planning.
An LLM can discuss stockout risk for a specific segment. It cannot, on its own, reliably recalculate a synchronized plan for demand, supply, inventory, capacity and production of a real multi-level operation. Much less run a corporate MRP respecting finite capacity, BOM explosion, lead times, operational constraints, inventory policies, shelf life, plant synchronization and the financial impact of the plan.
You cannot run MRP in ChatGPT.
This is not a chat problem. It is an architecture problem. That is exactly where the Agent Harness concept comes in.
Agent Harness is the layer that turns a probabilistic LLM into a reliable corporate system.
Pure LLM
Behaves like a junior supply chain analyst. Reasons well in the abstract, but confidently makes rookie mistakes when the question touches finite capacity, actual BOM structures, lead times or financial impact.
LLM + Agent Harness
Behaves like a senior planner. Guardrails block the inexperienced mistakes. Reliable, validated tools handle the heavy and critical math. The model interprets, explains and orchestrates, but does not invent numbers.
Mistaking AI for a chat interface
A large part of the market still treats AI as a sophisticated chatbot. You ask a question. The model answers. End of story.
That works for personal productivity. It does not work for Supply Chain Planning.
Industrial planning involves hard constraints, deterministic computations, complex business rules, measurable financial impact, multiple horizons and dependencies between modules. An AI agent cannot simply "invent" a production, inventory or capacity recommendation. It has to operate inside a reliable system.
What an Agent Harness is
The harness is the layer that surrounds the LLM. It is the set of structures, rules, tools, memory, validations and execution mechanisms that turn a probabilistic model into a reliable corporate operating system.
The LLM alone does not solve the problem. It is one piece inside the flow. Everything around it (allowed tools, accessible data, business rules, validations, orchestration across agents, persistent memory and governance) is defined by the harness.
The same model produces completely different results depending on the harness built around it.
Supply Chain Planning Agent
The agent is trained safely on real data, surrounded by guardrails, equipped with powerful, validated tools for heavy compute, and delivers enriched output:

Real-data training: structural, obfuscated samples, never the raw customer database
Guardrails: input validation, compliance filters, rate limiting, output sanitization
Reliable tools: the MRP / APS engine handles the critical math
Enriched output: interactive charts, planning grids, next-step suggestions
The harness architecture at NPLAN
In the context of Supply Chain Planning, the harness is five layers working together. A user request flows down through validation, orchestration and execution, then back up as a validated agent response:

Guardrails & Gateway: a corporate firewall around every prompt and every response. Input validation (PII and private data), configurable compliance filters, rate limiting for usage control, and post-LLM output sanitization.

Orchestration: one orchestrator coordinates multiple domain-specialist agents. It routes intent, sequences agents and recovers from failures; multiple custom agents can be added per operation.

Tool & Engine Integration: structural, obfuscated samples travel to the LLM while the full database stays local and everything remains auditable. Supply chain engine (MRP / APS local compute), query sandbox (local execution on real data), rich output (HTML charts, interactive grids) and next-step suggestions for conversation continuity.

Memory & Context: specialized context engineering for supply chain across three horizons: short term (active conversation), mid term (plan parameters) and long term (policies, history).

Observability & Governance: every answer validated before display, and feedback feeds learning. Query audit (anti-hallucination), decision logs (full traceability), pre-display response checks and a feedback loop that drives continuous improvement.
1. Guardrails & Gateway
The first layer protects the organization before any prompt ever reaches the model.
Input validation. Every request goes through a filter the company configures. The customer defines what cannot be sent to the LLM: tax IDs, personal identifiers, sensitive commercial data, intellectual property, anything covered by compliance. The filter is mostly deterministic, with explicit auditable rules, and can layer in lightweight semantic validation for ambiguous cases.
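The deterministic part of such a filter can be sketched as configurable, named rules that are checked before anything reaches the model. This is a minimal illustration, not NPLAN's actual configuration; the rule names and patterns are assumptions.

```python
import re

# Hypothetical, customer-configurable block rules. Each rule has a name so
# that every rejection is auditable: the filter reports which rules fired.
BLOCKED_PATTERNS = {
    "tax_id": re.compile(r"\b\d{2}\.\d{3}\.\d{3}/\d{4}-\d{2}\b"),  # e.g. a Brazilian CNPJ
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def validate_input(prompt: str) -> tuple[bool, list[str]]:
    """Return (allowed, violations). Deterministic and explainable:
    a prompt is blocked if and only if a named rule matched it."""
    violations = [name for name, rx in BLOCKED_PATTERNS.items() if rx.search(prompt)]
    return (not violations, violations)

ok, hits = validate_input("Reschedule orders for supplier 12.345.678/0001-90")
# ok is False and hits == ["tax_id"]: the prompt never reaches the LLM
```

A semantic classifier for ambiguous cases would sit behind this deterministic pass, never in front of it, so the auditable rules always get the first word.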
Compliance filter. Business rules that go beyond the technical layer. A pharmaceutical company can block formula data. A multi-country operation can restrict cross-plant data flows. The system honors those boundaries automatically.
Rate limiting. Usage control by user, team or module. Governs token consumption and prevents misuse.
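One common way to implement this kind of usage control is a token bucket per user, team or module. The sketch below assumes that policy; the capacity and refill rate are illustrative.

```python
import time

class TokenBucket:
    """Minimal per-user rate limiter: requests spend tokens, tokens refill
    over time, and a request is throttled when the bucket runs dry."""
    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# One bucket per (user, module); an expensive request costs more tokens,
# which is how token consumption maps onto LLM usage control.
bucket = TokenBucket(capacity=10, refill_per_sec=0.5)
```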
Output sanitization. Validation does not stop at the input. Before any response reaches the user, it goes through a sanitization layer that checks consistency, removes unwanted artifacts and applies the same compliance rules in the opposite direction.
2. Orchestration
The orchestrator is the heart of the harness. It receives the user intent and decides which specialist agent to trigger, in which sequence, with which context.
At NPLAN there is no single generic agent. There is a coordinated network of agents specialized by domain: a demand agent, an inventory agent, a capacity agent, a supply agent, a finance agent. Each one operates with context limited to its domain, specific tools, clear objectives and its own rules.
The customer can configure the personality of each agent, adjust its behavior parameters or create fully custom agents for specific operational needs.
The orchestrator also handles error recovery. When an agent returns an inconsistent result, the system does not propagate the error: it tries to correct it, escalates to human review or informs the user clearly.
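The routing-plus-recovery pattern described above can be sketched as follows. The agent names, their stubbed results and the consistency check are hypothetical stand-ins for the real domain agents and validators.

```python
from typing import Callable

# Stubbed domain-specialist agents keyed by intent (illustrative only).
AGENTS: dict[str, Callable[[str], dict]] = {
    "demand":    lambda q: {"forecast_units": 1200},
    "inventory": lambda q: {"reorder_point": 350},
}

def is_consistent(result: dict) -> bool:
    # Placeholder for deterministic plausibility checks
    # (non-negative quantities, values within plan constraints, etc.).
    return all(v >= 0 for v in result.values())

def orchestrate(intent: str, query: str, max_retries: int = 2) -> dict:
    """Route an intent to its specialist agent; retry on inconsistent
    output, and escalate instead of propagating an error."""
    agent = AGENTS.get(intent)
    if agent is None:
        return {"status": "escalate", "reason": f"no agent for intent '{intent}'"}
    for _ in range(max_retries):
        result = agent(query)
        if is_consistent(result):
            return {"status": "ok", "result": result}
    return {"status": "escalate", "reason": "inconsistent result after retries"}
```

The key design point is that failure paths return an explicit status rather than raising into the conversation: the user either gets a validated answer or a clear escalation.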
3. Tool & Engine integration
This layer solves a problem most AI solutions ignore: how to combine probabilistic reasoning with deterministic computation that has to be right the first time.
LLMs should not execute planning.
They should interpret, explain, orchestrate and operate on top of deterministic engines.
The full database never leaves the environment. In most interactions the agent receives only structural, obfuscated samples of the tables: schemas, representative examples and distributions, enough for the model to reason. When an answer requires a specific slice, part of that slice may travel to the LLM, always within the customer compliance boundaries. Even then, the full database stays local and every call is logged and auditable. It is a radically safer model than sending the whole database to the AI.
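What a "structural, obfuscated sample" might look like in practice: the model receives schema, row counts and per-column distributions, never the raw values. This is an illustrative sketch; the field names and summary format are assumptions.

```python
import statistics

def structural_sample(rows: list[dict]) -> dict:
    """Summarize a table as schema plus per-column statistics.
    Numeric columns expose ranges and means; text columns expose only
    cardinality, so no raw value ever leaves the environment."""
    if not rows:
        return {"schema": {}, "row_count": 0}
    schema = {col: type(val).__name__ for col, val in rows[0].items()}
    stats = {}
    for col in schema:
        values = [r[col] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            stats[col] = {"min": min(values), "max": max(values),
                          "mean": round(statistics.mean(values), 2)}
        else:
            stats[col] = {"distinct": len(set(values))}  # cardinality only
    return {"schema": schema, "row_count": len(rows), "stats": stats}

orders = [{"sku": "A-100", "qty": 40}, {"sku": "B-200", "qty": 10}]
# structural_sample(orders) exposes types, counts and ranges, not the SKUs
```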
The supply chain engine. MRP, BOM explosion, finite capacity and inventory optimization run in a specialized math engine, deterministic and reproducible. The LLM does not compute: it triggers the engine and interprets the result.
Rich output. Agent answers are not just text. The harness turns results into HTML charts configured dynamically per context, planning grid widgets for structured data, and analysis-specific formats.
Next-step suggestions. At the end of every interaction the system proactively suggests the next relevant steps, keeping the conversation productive and contextualized.
4. Memory & Context
LLMs have no memory across calls. The harness has to provide it intelligently.
Memory operates on three horizons: short term (the active conversation), mid term (parameters of the active planning scenario, horizon, policies, shop floor constraints, operational calendar, frozen periods) and long term (consolidated inventory policies, historical behavior, configured preferences).
The agent does not just receive conversation history. It receives live operational context: active scenario, frozen periods, capacity constraints, inventory policies, planning horizon, financial targets and operational priorities.
Specialized context engineering is what separates a generic agent from a supply chain agent. Sending a few lines in a prompt is not enough. The harness injects scenario parameters, strategic priorities, service targets, per-plant rules and financial indicators. That changes the quality of the decisions completely.
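The three horizons above can be sketched as a simple context assembler: persistent policies first, then the active scenario, then the tail of the conversation. The keys and section layout are illustrative, not NPLAN's actual context schema.

```python
def build_context(short: list[str], mid: dict, long: dict) -> str:
    """Merge conversation, scenario parameters and persistent policies
    into the operational context injected ahead of the user prompt."""
    lines = ["## Long-term policies"]
    lines += [f"- {k}: {v}" for k, v in long.items()]
    lines.append("## Active scenario")
    lines += [f"- {k}: {v}" for k, v in mid.items()]
    lines.append("## Recent conversation")
    lines += short[-5:]  # only the tail of the dialogue, to bound token usage
    return "\n".join(lines)

ctx = build_context(
    short=["user: why is line 2 overloaded next week?"],
    mid={"horizon_weeks": 13, "frozen_periods": 2},
    long={"service_target": "98% fill rate"},
)
```

Ordering is a deliberate choice here: stable policies come first so they survive truncation, and the volatile conversation tail comes last.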
5. Observability & Governance
The more autonomy the agents gain, the more critical governance becomes.
Response validation. Every answer is validated before it reaches the user. The system checks internal consistency, verifies that the numbers make sense within plan constraints and applies deterministic verification loops. The agent does not just answer; it self-corrects continuously.
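A pre-display check of this kind can be sketched as deterministic assertions on the numbers in a drafted answer. The field and constraint names below are hypothetical.

```python
def validate_response(answer: dict, constraints: dict) -> list[str]:
    """Return a list of violations; an empty list means the answer
    may be displayed, otherwise it goes back for correction."""
    issues = []
    qty = answer.get("planned_qty", 0)
    if qty > constraints.get("max_capacity", float("inf")):
        issues.append("planned quantity exceeds finite capacity")
    if qty < 0:
        issues.append("negative quantity")
    return issues

# A draft that plans 1,500 units against 1,200 units of capacity is caught
# here and sent back to the agent instead of reaching the screen.
issues = validate_response({"planned_qty": 1500}, {"max_capacity": 1200})
```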
Query audit. Any reasoning that produced an answer can be inspected. If a hallucination is suspected, the analyst can audit the exact query that ran, the data used and the logic applied. Real traceability, not just a chat log.
Decision logs. Everything sent to the LLM is recorded. Decision history, versioning, full explainability.
Learning feedback loop. Every answer prompts the user to rate it as positive or negative. When negative, the feedback does not vanish into a forgotten dashboard; it feeds the harness improvement process. The system learns from mistakes in a structured way.
Why generic workflows do not solve it
Horizontal platforms are commoditizing the basics fast: OAuth connectors, SaaS integrations, generic skills, agents that run administrative tasks.
But Supply Chain Planning is not a generic problem.
A pre-built workflow does not understand finite capacity, segment-based inventory policy, multi-level planning, BOM explosion, shop floor constraints, shelf life, variable lead time, plant synchronization or the financial impact of the plan.
It is impossible to solve this only with prompts.
The future is harness native
Many people are still debating which model is best. That debate is losing relevance quickly because models are converging.
The real competitive edge is shifting to architecture: harness, context, governance, orchestration, integration with real math engines and vertical specialization.
That is exactly what separates an AI demo from a corporate operating system based on AI.
The future of planning will not be a chatbot screen replacing planners. It will be an intelligent layer working alongside specialized math engines, where planners still exist, but with much faster simulation, contextualized recommendations, automated exception analysis and domain-specific copilots.
AI stops being an interface. It becomes operational decision infrastructure.
AI foundations at NPLAN
Understand the technical foundation behind this architecture: AI Agents, Supply Chain Engine and Supply Chain Data working together.
Read the next article