April 2026
Air API General Availability
Air API is now generally available. Access AirCloud’s AI models through an OpenAI-compatible interface with transparent per-token pricing.
- OpenAI-compatible API: Use existing OpenAI SDKs and code to access AirCloud models with minimal changes.
- Public pricing: Per-token pricing is now published for all models with pay-as-you-go billing.
- Air API Playground: Test models directly in the browser without writing code.
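Because the interface is OpenAI-compatible, existing SDKs work by pointing their base URL at Air API. The sketch below uses only the Python standard library to show the raw request shape; the base URL and environment variable name are placeholders, not confirmed values, so check your console for the real ones.

```python
import json
import os
import urllib.request

# Placeholder base URL; substitute the real Air API endpoint from your console.
BASE_URL = "https://api.aircloud.example/v1"

payload = {
    "model": "Qwen3.5-9B",  # any model from the pricing page
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

request = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('AIR_API_KEY', '')}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(request)  # send once BASE_URL and the key are real
```

An OpenAI SDK client works the same way: construct it with `base_url` set to the Air API address and `api_key` set to your key, and the rest of your code stays unchanged.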
Get started with Air API
Learn how to use Air API.
Browse Models
Explore all available models and pricing.
External API expansion and new model endpoints
The External API now covers full endpoint lifecycle management, and three new AI models are available on the platform.
External API: New endpoints
Programmatic control over your deployments is now complete with five new endpoints:
- List Endpoints: Retrieve a paginated list of all accessible endpoints with status and configuration details.
- List Replicas: View live replica status for any active endpoint.
- List/Get Log Files: Access endpoint log files and download their contents for debugging.
- Patch Endpoint: Update runtime settings (replica count, scaling config) for inactive endpoints.
- Get API Key Context: Verify your API key’s authentication scope and permissions.
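As a rough illustration of the lifecycle calls above, the sketch below assembles (but does not send) requests with the Python standard library. The host, route paths, and JSON field names here are assumptions for illustration; the API Reference has the authoritative routes and schemas.

```python
import json
import urllib.request

# Hypothetical host and routes; consult the API Reference for the real ones.
API_BASE = "https://api.aircloud.example/external/v1"
HEADERS = {"Authorization": "Bearer <YOUR_API_KEY>", "Content-Type": "application/json"}

def build_request(method, path, body=None):
    """Assemble an External API request object (nothing is sent here)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(f"{API_BASE}{path}", data=data, headers=HEADERS, method=method)

list_endpoints = build_request("GET", "/endpoints?page=1")           # List Endpoints
patch_endpoint = build_request("PATCH", "/endpoints/my-endpoint",    # Patch Endpoint
                               {"replicas": 2})
# urllib.request.urlopen(list_endpoints)  # send once the host and key are real
```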
API Reference
View the complete API documentation.
New models
| Model | Description | Input price | Output price |
|---|---|---|---|
| Qwen3-TTS | Multilingual TTS with custom voice cloning | $0/1M tokens | $0/1M tokens |
| Qwen3.5-9B | Efficient 9B with vision support | $0.05/1M tokens | $0.15/1M tokens |
| Qwen3.5-35B-A3B | High-performance MoE (35B) | $0.1625/1M tokens | $1.30/1M tokens |
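Per-token pricing composes linearly: a request costs (tokens ÷ 1M) × price for each direction, summed. A quick sketch using the prices from the table above:

```python
# USD per 1M tokens, (input, output), taken from the pricing table above.
PRICING = {
    "Qwen3-TTS": (0.0, 0.0),
    "Qwen3.5-9B": (0.05, 0.15),
    "Qwen3.5-35B-A3B": (0.1625, 1.30),
}

def estimate_cost_usd(model, input_tokens, output_tokens):
    in_price, out_price = PRICING[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# 2M input tokens plus 500K output tokens on Qwen3.5-9B:
estimate_cost_usd("Qwen3.5-9B", 2_000_000, 500_000)  # ≈ $0.175
```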
Browse Models
Explore all available models and pricing.
February 2026
Jupyter Notebook, Web IDE, and Persistent Volume support
New development environments and storage capabilities for GPU workloads.
Jupyter Notebook environment
GPU-powered Jupyter Notebook environments are now available as ready-to-use templates. Start developing immediately without any setup.
- Pre-configured with TensorFlow, PyTorch, and other major libraries
- Full GPU resource access for experimentation and model development
- Browser-based access — no local installation required
- Built-in Jupyter AI integration
Web IDE
- Write and run code in the browser without a local development environment
- Container-based isolated workspace for each session
- Built-in development and debugging tools
- Code assistant integration included
Persistent Volume
- Store model checkpoints, logs, and data files across container restarts
- Data management independent of container lifecycle
- Ideal for long-running jobs and iterative experiments
January 2026
External API initial release
Launched the AirCloud External API for programmatic endpoint management. Control your inference endpoints without leaving your terminal or CI/CD pipeline:
- Get Endpoint Status: Check current status, configuration, and health of any endpoint.
- Start/Stop Endpoint: Start or stop endpoints on demand via API.
- Scale Replicas: Adjust replica count for active endpoints to match traffic requirements.
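A typical CI/CD step using these three calls might check status, then start or scale an endpoint. The sketch below only constructs the requests; the host, route paths, and query parameter are hypothetical, so take the real routes from the API Reference.

```python
import urllib.request

# Hypothetical host and routes for illustration only.
EXTERNAL_API = "https://api.aircloud.example/external/v1"

def external_request(method, path):
    """Build a request against the External API (send with urlopen when ready)."""
    return urllib.request.Request(
        f"{EXTERNAL_API}{path}",
        method=method,
        headers={"Authorization": "Bearer <YOUR_API_KEY>"},
    )

status = external_request("GET", "/endpoints/my-endpoint/status")    # Get Endpoint Status
start = external_request("POST", "/endpoints/my-endpoint/start")     # Start Endpoint
scale = external_request("POST", "/endpoints/my-endpoint/scale?replicas=3")  # Scale Replicas
```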
API Reference
Get started with the External API.
October 2025
Air API beta launch, observability, dashboards, and UX upgrades
Air API launches in beta alongside major improvements to observability, project dashboards, and security.
Air API (beta)
An OpenAI-compatible inference API is now available in beta. Access AirCloud’s AI models with a single API key.
- OpenAI SDK compatible — swap models without changing your code
- API key authentication with per-key endpoint access control
- Test models interactively in the Playground
Observability
- Log download and bulk export: Download logs for offline analysis with single or batch export.
- Advanced log filtering: Regex-based search and time-range filtering for faster troubleshooting.
- Raw log (JSON) view: Inspect unprocessed log data alongside the structured view.
- Scaling history: Track autoscale events, timeout-triggered changes, and error-driven replica adjustments over time.
Dashboards and usage
- Project dashboards: Monitor status and usage at the project level with dedicated views.
- Organization usage breakdown: View usage and cost distribution by project, endpoint, and instance type.
- Extended request metrics: Cumulative request counts, job error rates, and other operational metrics.
- Transaction history filtering: Filter billing records by type and status.
UX upgrades
- Korean language support: Full Korean UI now available.
- First-time user onboarding: Guided tutorial flow for new users to deploy their first endpoint.
- In-product feedback: Submit feedback directly from within the platform.
Security
- OpenAI-compatible API keys: Issue API keys with OpenAI-compatible format. Control which endpoints each key can access.
- HTTPS endpoints: Enforced HTTPS for all inference endpoints.
- Path-based endpoint addressing: Moved from port-based to path-based routing with support for custom endpoint URLs.
September 2025
AirCloud General Availability
AirCloud is now generally available. With this release, Air Container graduates from beta to GA. Deploy and operate your own container images on AirCloud’s GPU infrastructure.
Air Container GA
- Deploy custom container images directly on GPU clusters
- Autoscaling and scheduled scaling to match traffic patterns
- Full control over runtimes, dependencies, and service configuration
- Persistent storage integration for models and data
Get started with Air Container
Learn how to deploy containers.
Platform improvements
- Time-based scaling: Schedule minimum replica counts or enable/disable autoscaling by time of day.
- Custom endpoint URLs: Use your own URL identifiers instead of system-generated IDs for cleaner integration.
- API Playground: Test and integrate APIs interactively through the built-in Playground without writing code.
- One-click cluster deployment: Reduce repetitive setup with templated, automated cluster deployments.
- Actionable error messages: Detailed error guidance on failure screens to cut mean-time-to-resolution.
- Secure inference traffic (HTTPS): End-to-end HTTPS for all inference requests.
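Time-based scaling amounts to mapping the clock onto a minimum replica count. The actual configuration lives in the console; the sketch below just illustrates the idea with a made-up schedule.

```python
from datetime import time

# Made-up schedule: keep 3 replicas warm during business hours, 1 otherwise.
SCHEDULE = [(time(9, 0), time(18, 0), 3)]
DEFAULT_MIN_REPLICAS = 1

def min_replicas_at(now: time) -> int:
    """Return the minimum replica count in force at the given time of day."""
    for start, end, replicas in SCHEDULE:
        if start <= now < end:
            return replicas
    return DEFAULT_MIN_REPLICAS

min_replicas_at(time(12, 0))  # 3 (inside business hours)
min_replicas_at(time(2, 0))   # 1 (overnight minimum)
```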
May 2025
Usage insights, autoscaling controls, and performance tuning
New visibility into costs, smarter autoscaling, and infrastructure-level performance hardening.
- Usage & billing dashboards: Monthly and daily usage visualization to track credit consumption and spending trends at a glance.
- Autoscaling sensitivity controls: Configure autoscale responsiveness (Heavy / Normal / Light) to match your workload’s characteristics—from bursty inference to steady throughput.
- Deployment template reuse: Reuse existing deployment configurations to quickly spin up new endpoints with the same settings.
- Predictable scaling behavior: Adjusted scale-in/out policies to reduce variability under varying load patterns.
- Concurrent request handling improvements: Improved stability for large-file transfers and concurrent request handling under peak traffic.
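One way to picture the Heavy / Normal / Light sensitivity levels is as different utilization thresholds at which a scale-out triggers. The thresholds below, and the assumption that Heavy reacts earliest, are illustrative only; the platform's actual scaling logic is internal.

```python
# Illustrative thresholds only; not the platform's real values.
SCALE_OUT_THRESHOLD = {
    "Heavy": 0.50,   # assumed most responsive: scale out at 50% utilization
    "Normal": 0.70,
    "Light": 0.90,   # assumed least responsive: tolerate bursts before scaling
}

def should_scale_out(utilization: float, sensitivity: str) -> bool:
    """Decide whether current utilization warrants adding a replica."""
    return utilization >= SCALE_OUT_THRESHOLD[sensitivity]

should_scale_out(0.75, "Heavy")  # True: bursty workloads get replicas early
should_scale_out(0.75, "Light")  # False: steady workloads absorb the spike
```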
April 2025
Zero-downtime operations
Stability improvements so platform updates and scaling events don’t disrupt running workloads.
- Zero-downtime updates: Minimized service disruption during platform updates.
- Request drop prevention: Improved handling during scale-in and updates to prevent in-flight request loss.
- Restart stability: Event processing pipelines now handle component restarts without data loss or ordering issues.
- HTTPS adoption: Began HTTPS rollout as the platform security baseline.
February 2025
Billing and team collaboration
Introduced credit-based billing and multi-user collaboration so teams can manage costs and share resources.
- Credit-based billing: Manage usage costs through a prepaid credit system with top-up and balance tracking.
- Team invites: Invite team members to your organization via email for collaborative access to shared resources.
- Reserved capacity: Distinguish between reserved (term-based) and on-demand resources for predictable cost planning.
- Billing automation: Automated usage collection and settlement processing on scheduled intervals.
January 2025
Air Container beta launch and platform foundation upgrades
Air Container beta is now available. Deploy custom container images on GPU infrastructure for the first time on AirCloud.
Air Container (beta)
- Deploy custom container images on GPU clusters
- Basic autoscaling and monitoring support
- Container health checks with automatic recovery
Platform foundation upgrades
- Custom metrics for user workloads: Collect and display per-container metrics (e.g., vLLM) with configurable metric targets.
- Improved health checks: Better readiness detection for containers with long boot times to prevent premature termination.
- Faster runtime environments: Lighter, faster packaging and deployment structures for reduced startup times.
- Offline-friendly deployment: Support for restricted-network environments with offline operation scenarios.
- Better model caching: More efficient HuggingFace model cache sharing and management for faster cold starts.
- Reliable event capture: Hardened autoscale and operational event capture for more stable scaling decisions.

