ResembleAI Chatterbox
Deploy ResembleAI Chatterbox behind a dedicated API endpoint on Koyeb GPU for high-performance, low-latency, and efficient inference.
Deploy the Chatterbox text-to-speech (TTS) model on Koyeb’s high-performance cloud infrastructure.
With one click, get a dedicated GPU-powered inference endpoint ready to handle requests with built-in autoscaling and scale-to-zero.
Get up to $200 in credit to get started!
Overview of Chatterbox
Chatterbox is a production-grade, open-source text-to-speech (TTS) model that has been evaluated alongside top proprietary systems like ElevenLabs, consistently earning higher preference in direct comparisons.
The default GPU for running this model is the Nvidia L40S instance type. You are free to adjust the GPU instance type to fit your workload requirements.
Quickstart
After you deploy the Chatterbox model, copy the Koyeb App public URL (similar to https://<YOUR_DOMAIN_PREFIX>.koyeb.app)
and create a simple Python file with the following content to start interacting with the model.
import base64

import httpx

KOYEB_URL = "https://<YOUR_DOMAIN_PREFIX>.koyeb.app"


def wav_to_base64(file_path):
    """
    Convert a WAV file to a Base64 string.

    :param file_path: Path to the WAV file
    :return: Base64 encoded audio string
    """
    with open(file_path, "rb") as wav_file:
        binary_data = wav_file.read()
    base64_data = base64.b64encode(binary_data)
    return base64_data.decode("utf-8")


def b64_to_wav(base64_string, output_file_path):
    """
    Convert a Base64 string to a WAV file on disk.

    :param base64_string: Base64 encoded audio string
    :param output_file_path: Path where the WAV file is written
    """
    # Remove the data URI header if present
    if base64_string.startswith("data:audio"):
        base64_string = base64_string.split(",")[1]

    # Decode the Base64 string and write the binary audio to disk
    binary_data = base64.b64decode(base64_string)
    with open(output_file_path, "wb") as wav_file:
        wav_file.write(binary_data)


# Encode the reference voice sample used as the audio prompt
voice_base64 = wav_to_base64("./voice.wav")

payload = {
    "audio_prompt_b64": voice_base64,
    "cfgw_input": 0.2,
    "exaggeration_input": 0.75,
    "temperature_input": 0.8,
    "text_input": "Ezreal and Jinx teamed up with Ahri, Yasuo, and Teemo to take down the enemy's Nexus in an epic late-game pentakill.",
}

# Call the model prediction endpoint
res = httpx.post(
    f"{KOYEB_URL}/predict",
    json=payload,
    timeout=60.0,
)
res.raise_for_status()

# Get the output audio from the JSON response
output = res.json().get("audio")

# Convert the base64 model output to a WAV file and save it to disk
b64_to_wav(output, "output.wav")
The snippet above showcases how to interact with the Chatterbox model to generate audio from a text prompt and save it to disk.
Take care to replace the KOYEB_URL
value in the snippet with your Koyeb App public URL.
Executing the Python script generates the audio and saves it to disk:
python main.py
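To sanity-check the result, you can inspect output.wav with Python's standard wave module. This is a quick verification sketch, assuming the model returns standard PCM WAV data:

```python
import wave
from pathlib import Path


def describe_wav(file_path):
    """Print channel count, sample rate, and duration of a WAV file."""
    with wave.open(file_path, "rb") as wav_file:
        rate = wav_file.getframerate()
        duration = wav_file.getnframes() / float(rate)
        print(f"{file_path}: {wav_file.getnchannels()} channel(s), "
              f"{rate} Hz, {duration:.2f} s")
        return duration


if Path("output.wav").exists():
    describe_wav("output.wav")
```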