Private deployment
For organizations that require data sovereignty, regulatory compliance, or operation in restricted or air-gapped environments.
Overview
aiXplain AgenticOS can be deployed entirely within your own infrastructure, with no dependency on external services at any point in the operating lifecycle. This includes environments with zero outbound connectivity — installation, updates, license validation, and all runtime operations are fully self-contained within your network.
Private deployment is available in two modes:
| Mode | Description |
|---|---|
| On-prem | Full air-gapped deployment within your own infrastructure. Zero outbound connectivity. All data, models, logs, and telemetry remain on your premises. aiXplain staff have no access to your environment. |
| Edge | Customer-controlled cloud deployment. Only token counts and latency metrics are transmitted — no prompts, responses, or customer content leaves your environment. |
For the managed cloud option, see the SaaS deployment documentation.
AIServices availability: Not all AIServices are included in private deployments. AssetOnboarding is available in all deployment modes. LLM Benchmarking and LLM Fine-tuning are available in SaaS and Edge deployments only.
Architecture
The platform deploys as containerized services using Docker and Docker Compose on Linux, organized into two modular units:
aiXPU (Processing Unit) hosts:
- AgentEngine
- AssetServing
- AssetOnboarding
- aiXplain Studio
- aiXplain SDK
- Authentication
aiXMU (Memory Unit) hosts:
- RetrievalEngine (vector, graph, and SQL stores)
- Caches
- Logs, traces, and telemetry storage
Both units scale independently based on workload.
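To make the two-unit layout concrete, here is a minimal Docker Compose sketch. All service names, image names, ports, and paths are illustrative assumptions, not the shipped artifact names:

```yaml
# Hypothetical layout sketch -- names and ports are illustrative only.
services:
  aixpu:
    image: registry.local:5000/aixplain/aixpu:latest  # AgentEngine, AssetServing, Studio, SDK, auth
    depends_on:
      - aixmu
    ports:
      - "8080:8080"
  aixmu:
    image: registry.local:5000/aixplain/aixmu:latest  # RetrievalEngine, caches, logs/telemetry
    volumes:
      - aixmu-data:/var/lib/aixmu  # persistent stores stay inside your network boundary
volumes:
  aixmu-data:
```

Keeping the units as separate services is what allows each to be scaled on its own, as described below.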
Scaling
| Component | Scaling approach |
|---|---|
| aiXPU | Scale out multiple nodes behind a load balancer for concurrency and high availability |
| aiXMU | Expand storage and memory capacity; add optional data nodes |
| Load balancer | Optional; recommended when multiple aiXPU nodes are deployed |
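When multiple aiXPU nodes are deployed, a conventional reverse-proxy upstream is one way to spread load. The sketch below uses nginx; the host names and certificate paths are placeholders:

```nginx
# Illustrative only -- host names and cert paths are placeholders.
upstream aixpu {
    least_conn;                          # route each request to the least-busy node
    server aixpu-1.internal:8080;
    server aixpu-2.internal:8080;
    server aixpu-3.internal:8080 backup; # standby node for high availability
}

server {
    listen 443 ssl;
    ssl_certificate     /etc/nginx/certs/aixpu.crt;
    ssl_certificate_key /etc/nginx/certs/aixpu.key;

    location / {
        proxy_pass http://aixpu;
    }
}
```

`least_conn` suits long-lived inference requests better than round-robin, since request durations vary widely.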
Air-gapped installation
The platform is designed for environments with zero outbound connectivity:
- Installation is performed from offline bundles delivered as signed package archives.
- Updates are delivered as signed patch bundles; integrity is verified against a SHA-256 checksum manifest.
- Perpetual licenses are validated locally — no call-home required.
- All observability data is retained on-prem. Export is permitted only through customer-approved offline processes.
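The checksum-manifest step can be sketched with standard coreutils. The bundle and manifest file names below are illustrative, not the actual artifact names:

```shell
# Simulate a delivered patch bundle so the sketch is self-contained
# (in practice the bundle and its manifest arrive together, offline).
echo "patch payload" > agenticos-patch.tar.gz
sha256sum agenticos-patch.tar.gz > agenticos-patch.sha256

# On the air-gapped host: verify every manifest entry before importing.
# --strict turns malformed manifest lines into a hard failure.
sha256sum --check --strict agenticos-patch.sha256
# prints "agenticos-patch.tar.gz: OK"
```

Since the bundles are signed, a detached-signature check (e.g. `gpg --verify`) would typically accompany the checksum pass; the exact signing scheme is not specified here.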
Infrastructure prerequisites
The platform runs on Linux with Docker and Docker Compose. A local container registry is recommended to support offline bundle import and controlled upgrades. GPU nodes for model serving are customer-provisioned and registered into the platform via AssetOnboarding and the Compute Management Service.
Data handling in private deployments
- All data, models, logs, traces, and telemetry remain within your network boundary.
- Inference runs entirely in memory. Nothing is written to disk by default.
- Prompts and responses are never used for model training.
- The only exceptions are opt-in: embeddings stored when RAG is enabled, and agent session memory when explicitly configured. Both are governed by RBAC and model-scoped API key controls.
- In on-prem environments, aiXplain staff have no access to your production environment under any circumstances.
Compliance
| Standard | Coverage |
|---|---|
| SDAIA / NCA / GCC Data Protection | On-prem — data residency, access governance, content moderation, zero vendor data access |
| SOC 2 Type II | SaaS deployments only |
All compliance-relevant controls are enforced entirely within your own infrastructure: access governance, content moderation, PII redaction, and audit logging.
Compute Management Service
The Compute Management Service (CMS) is an optional add-on for MLOps teams that provides full node-level management of self-hosted LLM endpoints within your private deployment. It is separate from the core platform and intended for organizations managing their own model infrastructure at scale.
Capabilities:
- Register and deregister model endpoints
- Monitor node health and availability
- Route inference requests across endpoints
- Manage model lifecycle — versioning, replacement, and rollback
No external model API calls are required at any point. You retain full freedom to swap, upgrade, or replace models without dependency on any external provider.
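The registration step can be pictured as declarative endpoint metadata. The schema below is purely hypothetical, offered only to show the kind of information CMS manages; the actual CMS format is not documented here:

```yaml
# Hypothetical schema for illustration only -- not the actual CMS format.
endpoint:
  name: llama-3-70b-prod
  url: http://gpu-node-04.internal:8000/v1  # customer-provisioned GPU node
  runtime: vllm                             # customer's own serving stack
  version: "2025-06-01"
  rollback_to: "2025-05-10"                 # prior version retained for rollback
  health_check:
    path: /health
    interval_seconds: 30
```

Because endpoints are described rather than hard-coded, swapping or upgrading a model reduces to registering a new endpoint and retiring the old one.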