Supplier AI Agent Platform

Runtime & Security Requirements

Audience: AWS Solutions Architecture team

Objective: Provide enough technical context to propose an architecture pattern for our agent runtime before our working session.

What We're Building

We are building a multi-tenant AI agent that acts on behalf of suppliers on the Coupa Supplier Portal (CSP). The agent receives natural language input, loads procedural skill definitions based on intent, calls internal API endpoints via wrapped CLI tools, and returns results to the supplier. The design is one agent architecture with many capabilities, not a separate agentic workflow per use case.
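To make "one agent architecture, many capabilities" concrete, the following is a minimal sketch of how skills might be represented and selected by intent. The skill names, tool names, and the `load_skill` helper are hypothetical and chosen only for illustration; they do not describe an existing implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """A plain-text instruction set plus the tools it is allowed to use."""
    name: str
    instructions: str
    allowed_tools: frozenset


# Hypothetical skills and tool names, for illustration only.
SKILLS = {
    "invoice_status": Skill(
        name="invoice_status",
        instructions="Look up the invoice, then summarize its payment status.",
        allowed_tools=frozenset({"invoices.get"}),
    ),
    "catalog_enrichment": Skill(
        name="catalog_enrichment",
        instructions="For each catalog item, fill in missing attributes.",
        allowed_tools=frozenset({"catalog.list", "catalog.update"}),
    ),
}


def load_skill(intent: str) -> Skill:
    """Return the skill for a classified intent; unknown intents are rejected."""
    try:
        return SKILLS[intent]
    except KeyError:
        raise LookupError(f"no skill registered for intent {intent!r}") from None
```

Adding a capability in this shape means registering another skill entry; the agent loop and runtime stay the same.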

Execution Model

A supplier initiates a session through an arbitrary channel (web app, email, etc.). The agent:

Authenticates within the supplier's tenant context (identity, permissions, data scope)

Interprets intent and loads the relevant skill — a plain-text instruction set that governs which tools the agent can call and how

Executes by calling one or more wrapped API endpoints (orders, invoices, profile, catalog) in a tool-use loop, reasoning over results between calls

Returns output to the supplier, after which the session either ends or continues

Sessions range from short (single question, 2–3 tool calls, seconds) to long-running (bulk catalog enrichment, dozens of tool calls, minutes). The agent must support both without architectural divergence.
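The execution model above can be sketched as a single loop that serves both short and long-running sessions. This is an illustrative sketch under stated assumptions, not the actual runtime: `Session`, `ToolCall`, `run_session`, the `decide` callback (a stand-in for the model), and the `max_steps` budget are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Session:
    session_id: str
    tenant_id: str                 # set at authentication, never by the model
    interrupted: bool = False      # flipped by the supplier or by system policy
    transcript: list = field(default_factory=list)


@dataclass
class ToolCall:
    tool: str
    args: dict


# decide() stands in for the model: given the transcript so far, it returns
# the next tool call, or None once it has a final answer.
Decide = Callable[[list], Optional[ToolCall]]
RunTool = Callable[[Session, ToolCall], dict]


def run_session(session: Session, decide: Decide, run_tool: RunTool,
                max_steps: int = 50) -> str:
    """Drive the tool-use loop until the model finishes, the session is
    interrupted, or the step budget runs out."""
    for _ in range(max_steps):
        if session.interrupted:    # checked before every step
            return "interrupted"
        call = decide(session.transcript)
        if call is None:
            return "completed"
        result = run_tool(session, call)
        session.transcript.append((call, result))
    return "step_budget_exhausted"
```

A single-question session exits this loop after two or three iterations, while a bulk catalog job runs the same loop for dozens; only the step count differs.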

Technical Requirements

Requirement	Detail
Tenant isolation	Strict per-supplier data boundaries. A session must only access data belonging to the authenticated supplier's tenant. Isolation must be enforced at the infrastructure level, not the prompt level. No cross-tenant data leakage under any execution path.
Concurrency	The platform must support many concurrent agent sessions across different supplier tenants. PoC scale: tens of concurrent sessions. Production scale: thousands. Sessions are independent and share no state.
Auditability	Every tool call, API request, model input, and model output must be logged with full fidelity. Logs must support replay — given a session ID, we can reconstruct exactly what the agent did and why. Sessions must be interruptible by the supplier or by system policy at any point in execution.
Tool execution & network access	The agent calls wrapped internal API endpoints as its tools. The runtime must allow outbound network access to these internal services while restricting access to anything outside the authorized API surface. The tool set is bounded and enumerated per skill — the agent cannot call arbitrary endpoints.
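The tenant isolation, auditability, and bounded tool set requirements can be read together as a contract on every tool call. The sketch below illustrates that contract at the application level only; the requirement remains that equivalent controls hold at the infrastructure level. `ToolGateway`, `ToolNotAllowed`, `replay`, the event field names, and the in-memory audit list are hypothetical, chosen for illustration.

```python
import json
import time


class ToolNotAllowed(Exception):
    """Raised when the agent requests a tool outside the skill's tool set."""


class ToolGateway:
    """Sits between the agent and the wrapped API tools for one session."""

    def __init__(self, tenant_id, session_id, allowed_tools, tools, audit_sink):
        self._tenant_id = tenant_id
        self._session_id = session_id
        self._allowed = allowed_tools   # enumerated per skill
        self._tools = tools             # tool name -> callable wrapper
        self._audit = audit_sink        # append-only list standing in for a log store

    def _log(self, kind, payload):
        self._audit.append(json.dumps({
            "ts": time.time(),
            "session_id": self._session_id,
            "tenant_id": self._tenant_id,
            "kind": kind,
            "payload": payload,
        }))

    def call(self, tool, args):
        self._log("tool_request", {"tool": tool, "args": args})
        if tool not in self._allowed or tool not in self._tools:
            self._log("tool_denied", {"tool": tool})
            raise ToolNotAllowed(tool)
        # Tenant scope comes from the session; any tenant_id in args is dropped.
        safe_args = {k: v for k, v in args.items() if k != "tenant_id"}
        result = self._tools[tool](tenant_id=self._tenant_id, **safe_args)
        self._log("tool_result", {"tool": tool, "result": result})
        return result


def replay(audit_sink, session_id):
    """Return the ordered audit events recorded for one session."""
    events = [json.loads(line) for line in audit_sink]
    return [e for e in events if e["session_id"] == session_id]
```

The design choice illustrated here is that the model never supplies the tenant scope and that denied calls are logged as well as successful ones, so a replay shows what the agent attempted, not only what succeeded. Model inputs and outputs would need to be logged through the same sink to meet the full requirement.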

What We Need From You

Proposed architecture pattern for the agent runtime — how sessions are provisioned, isolated, and torn down

Runtime recommendation — what AWS services or patterns best fit this execution model

What you need from us to scope and stand up a PoC