Model ID: qwen/qwen3-tts-customvoice | Parameters: 1.7B | Released: 2026-01-22

Overview

Qwen3-TTS-12Hz-1.7B-CustomVoice is a multilingual text-to-speech model from the Qwen3 family. With 1.7 billion parameters, it synthesizes speech in English, Chinese, Japanese, and Korean. It ships with 9 preset voices and supports custom voice cloning. Its 12Hz token rate keeps audio generation efficient while maintaining natural-sounding output.

Pricing

Input: $0 / 1M tokens
Output: $0 / 1M tokens

Key Features

  • 1.7B parameter model with high-quality multilingual speech synthesis
  • Supports English, Chinese, Japanese, and Korean
  • 9 diverse preset voices with custom voice capability
  • 12Hz token rate for efficient audio generation
  • Built on Qwen3 architecture with strong language understanding

Use Cases

Narration Generation

Generate natural voice narration for video content and audiobooks.

Input Text: Life is like a box of chocolates. You never know what you’re gonna get.

Voice Announcements

Create voice announcements and notifications with various voice styles.

Input Text: Your order has been confirmed and will be delivered within 3 business days.

Conversational AI Voice

Generate natural voice responses for chatbots and virtual assistants.

Input Text: I’d be happy to help you with that! Let me check your account details.
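
The three sample texts above could be synthesized in one pass. The following is a minimal sketch rather than an official client: it uses only the Python standard library, reuses the endpoint, model ID, and "serena" voice from the Quick Start on this page, and the output file names and the helper names build_request and synthesize_all are illustrative assumptions.

```python
import json
import urllib.request

# Endpoint and model ID as given in this page's Quick Start.
API_URL = "https://external.aieev.cloud:5007/ai/api/v1/audio/speech"
MODEL_ID = "qwen/qwen3-tts-customvoice"

# Sample texts from the use cases; the file names are illustrative assumptions.
USE_CASES = {
    "narration.mp3": "Life is like a box of chocolates. You never know what you’re gonna get.",
    "announcement.mp3": "Your order has been confirmed and will be delivered within 3 business days.",
    "assistant.mp3": "I’d be happy to help you with that! Let me check your account details.",
}


def build_request(text, api_key, voice="serena"):
    """Build (but do not send) a POST request for one piece of text."""
    body = json.dumps(
        {
            "model": MODEL_ID,
            "input": text,
            "voice": voice,
            "response_format": "mp3",
        }
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_all(api_key):
    """Send one request per use case and save each audio response to disk."""
    for filename, text in USE_CASES.items():
        request = build_request(text, api_key)
        # The 60-second timeout is an arbitrary choice, not a documented limit.
        with urllib.request.urlopen(request, timeout=60) as response:
            audio = response.read()
        with open(filename, "wb") as f:
            f.write(audio)
```

Calling synthesize_all("YOUR_API_KEY") would then write three MP3 files to the current directory, assuming the endpoint returns the audio bytes directly as in the Quick Start.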

Parameters

Parameter        Type    Required  Default   Description
input            string  Required  -         Text to convert to speech
voice            enum    Optional  "serena"  Voice preset
response_format  enum    Optional  "mp3"     Output audio format
speed            number  Optional  1         Speech speed multiplier (0.25-4.0)
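
To illustrate how these parameters fit together, here is a minimal sketch of a helper that assembles a request body and checks the documented speed range before anything is sent. The function name build_speech_payload is hypothetical, the rejection of blank input is an assumption (the page only says input is required), and the voice and response_format values are passed through unchecked because this page lists only their defaults.

```python
MODEL_ID = "qwen/qwen3-tts-customvoice"  # model ID from this page


def build_speech_payload(text, voice="serena", response_format="mp3", speed=1.0):
    """Return a JSON-serializable request body using the documented defaults."""
    # Assumption: treat empty or whitespace-only input as invalid.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("input must be a non-empty string")
    # The documented range for the speed multiplier is 0.25 to 4.0.
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": MODEL_ID,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }


# Example: a slightly faster reading with the default voice and format.
payload = build_speech_payload("Hello, welcome to AirCloud!", speed=1.25)
```

Validating locally is a design choice: it surfaces an out-of-range speed as a clear error in the caller's code instead of relying on whatever the server returns.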

Quick Start

1. Get your API key

Generate an API key from your AirCloud account.

2. Run the code

Replace YOUR_API_KEY with your actual key. The example below uses Python.

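import requests

# Send the text to the speech endpoint; the response body is the audio file.
response = requests.post(
    "https://external.aieev.cloud:5007/ai/api/v1/audio/speech",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "model": "qwen/qwen3-tts-customvoice",
        "input": "Hello, welcome to AirCloud!",
        "voice": "serena",
        "response_format": "mp3"
    },
    timeout=60,  # arbitrary client-side limit so the call cannot hang forever
)

# Stop on an HTTP error so an error message is not saved as audio.
response.raise_for_status()

# Write the returned audio bytes to disk.
with open("output.mp3", "wb") as f:
    f.write(response.content)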

Tags

open-source tts 1.7B custom-voice multilingual multi-voice