Private deployment
For organizations that require data sovereignty, regulatory compliance, or operation in restricted or air-gapped environments.
Overview
aiXplain AgenticOS can be deployed entirely within your own infrastructure, with no dependency on external services at any point in the operating lifecycle. This includes environments with zero outbound connectivity — installation, updates, license validation, and all runtime operations are fully self-contained within your network.
Private deployment is available in two modes:
| Mode | Description |
|---|---|
| On-prem | Full air-gapped deployment within your own infrastructure. Zero outbound connectivity. All data, models, logs, and telemetry remain on your premises. aiXplain staff have no access to your environment. |
| Edge | Customer-controlled cloud deployment. Only token counts and latency metrics are transmitted — no prompts, responses, or customer content leaves your environment. |
For the managed cloud option, see the SaaS deployment documentation.
AIServices availability: Not all AIServices are included in private deployments. AssetOnboarding is available in all deployment modes. LLM Benchmarking and LLM Fine-tuning are available in SaaS and Edge deployments only.
Architecture
The platform deploys as containerized services using Docker and Docker Compose on Linux, organized into two modular units:
aiXPU (Processing Unit) hosts:
- AgentEngine
- AssetServing
- AssetOnboarding
- aiXplain Studio
- aiXplain SDK
- Authentication
aiXMU (Memory Unit) hosts:
- RetrievalEngine (vector, graph, and SQL stores)
- Caches
- Logs, traces, and telemetry storage
Both units scale independently based on workload.
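To make the two-unit layout concrete, here is a minimal Docker Compose sketch. All service names, image names, ports, and paths are illustrative assumptions, not the shipped artifact names:

```yaml
# Hypothetical layout sketch -- names and ports are illustrative only.
services:
  aixpu:
    image: registry.local:5000/aixplain/aixpu:latest  # AgentEngine, AssetServing, Studio, SDK, auth
    depends_on:
      - aixmu
    ports:
      - "8080:8080"
  aixmu:
    image: registry.local:5000/aixplain/aixmu:latest  # RetrievalEngine, caches, logs/telemetry
    volumes:
      - aixmu-data:/var/lib/aixmu  # persistent stores stay inside your network boundary
volumes:
  aixmu-data:
```

Keeping the units as separate services is what allows each to be scaled on its own, as described below.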
Scaling
| Component | Scaling approach |
|---|---|
| aiXPU | Scale out multiple nodes behind a load balancer for concurrency and high availability |
| aiXMU | Expand storage and memory capacity; add optional data nodes |
| Load balancer | Optional; recommended when multiple aiXPU nodes are deployed |
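When multiple aiXPU nodes are deployed, a conventional reverse-proxy upstream is one way to spread load. The sketch below uses nginx; the host names and certificate paths are placeholders:

```nginx
# Illustrative only -- host names and cert paths are placeholders.
upstream aixpu {
    least_conn;                          # route each request to the least-busy node
    server aixpu-1.internal:8080;
    server aixpu-2.internal:8080;
    server aixpu-3.internal:8080 backup; # standby node for high availability
}

server {
    listen 443 ssl;
    ssl_certificate     /etc/nginx/certs/aixpu.crt;
    ssl_certificate_key /etc/nginx/certs/aixpu.key;

    location / {
        proxy_pass http://aixpu;
    }
}
```

`least_conn` suits long-lived inference requests better than round-robin, since request durations vary widely.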
Air-gapped installation
The platform is designed for environments with zero outbound connectivity:
- Installation is performed from offline bundles delivered as signed package archives.
- Updates are delivered as signed patch bundles; integrity is verified against a SHA-256 checksum manifest.
- Perpetual licenses are validated locally — no call-home required.
- All observability data is retained on-prem. Export is permitted only through customer-approved offline processes.
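The checksum-manifest step can be sketched with standard coreutils. The bundle and manifest file names below are illustrative, not the actual artifact names:

```shell
# Simulate a delivered patch bundle so the sketch is self-contained
# (in practice the bundle and its manifest arrive together, offline).
echo "patch payload" > agenticos-patch.tar.gz
sha256sum agenticos-patch.tar.gz > agenticos-patch.sha256

# On the air-gapped host: verify every manifest entry before importing.
# --strict turns malformed manifest lines into a hard failure.
sha256sum --check --strict agenticos-patch.sha256
# prints "agenticos-patch.tar.gz: OK"
```

Since the bundles are signed, a detached-signature check (e.g. `gpg --verify`) would typically accompany the checksum pass; the exact signing scheme is not specified here.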
Infrastructure prerequisites
The platform runs on Linux with Docker and Docker Compose. A local container registry is recommended to support offline bundle import and controlled upgrades. GPU nodes for model serving are customer-provisioned and registered into the platform via AssetOnboarding and the Compute Management Service.
Data handling in private deployments
- All data, models, logs, traces, and telemetry remain within your network boundary.
- Inference runs entirely in memory. Nothing is written to disk by default.
- Prompts and responses are never used for model training.
- The only exceptions are opt-in: embeddings stored when RAG is enabled, and agent session memory when explicitly configured. Both are governed by RBAC and model-scoped API key controls.
- In on-prem environments, aiXplain staff have no access to your production environment under any circumstances.
Compliance
| Standard | Coverage |
|---|---|
| SDAIA / NCA / GCC Data Protection | On-prem — data residency, access governance, content moderation, zero vendor data access |
| SOC 2 Type II | SaaS deployments only |
All compliance-relevant controls are enforced entirely within your own infrastructure: access governance, content moderation, PII redaction, and audit logging.
Compute Management Service
The Compute Management Service (CMS) is an optional add-on for MLOps teams that provides full node-level management of self-hosted LLM endpoints within your private deployment. It is separate from the core platform and intended for organizations managing their own model infrastructure at scale.
Capabilities:
- Register and deregister model endpoints
- Monitor node health and availability
- Route inference requests across endpoints
- Manage model lifecycle — versioning, replacement, and rollback
No external model API calls are required at any point. You retain full freedom to swap, upgrade, or replace models without dependency on any external provider.
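The registration step can be pictured as declarative endpoint metadata. The schema below is purely hypothetical, offered only to show the kind of information CMS manages; the actual CMS format is not documented here:

```yaml
# Hypothetical schema for illustration only -- not the actual CMS format.
endpoint:
  name: llama-3-70b-prod
  url: http://gpu-node-04.internal:8000/v1  # customer-provisioned GPU node
  runtime: vllm                             # customer's own serving stack
  version: "2025-06-01"
  rollback_to: "2025-05-10"                 # prior version retained for rollback
  health_check:
    path: /health
    interval_seconds: 30
```

Because endpoints are described rather than hard-coded, swapping or upgrading a model reduces to registering a new endpoint and retiring the old one.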