Documentation Index
Fetch the complete documentation index at: https://mintlify.com/yocxy2/chatterboxyocxy/llms.txt
Use this file to discover all available pages before exploring further.
## Overview

The `SUPPORTED_LANGUAGES` dictionary contains all language codes supported by the `ChatterboxMultilingualTTS` model. Pass these codes to the `language_id` parameter when generating speech.
## Language Codes

`ChatterboxMultilingualTTS` supports 23 languages:

| Language Code | Language Name |
|---|---|
| ar | Arabic |
| da | Danish |
| de | German |
| el | Greek |
| en | English |
| es | Spanish |
| fi | Finnish |
| fr | French |
| he | Hebrew |
| hi | Hindi |
| it | Italian |
| ja | Japanese |
| ko | Korean |
| ms | Malay |
| nl | Dutch |
| no | Norwegian |
| pl | Polish |
| pt | Portuguese |
| ru | Russian |
| sv | Swedish |
| sw | Swahili |
| tr | Turkish |
| zh | Chinese |
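For quick reference in code that never loads the model, the table above can be mirrored as a plain Python dict. This is an illustrative local copy; when the chatterbox package is available, import `SUPPORTED_LANGUAGES` from it instead:

```python
# Illustrative local copy of the language table above; prefer importing
# SUPPORTED_LANGUAGES from the chatterbox package when it is installed.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic",  "da": "Danish",     "de": "German",   "el": "Greek",
    "en": "English", "es": "Spanish",    "fi": "Finnish",  "fr": "French",
    "he": "Hebrew",  "hi": "Hindi",      "it": "Italian",  "ja": "Japanese",
    "ko": "Korean",  "ms": "Malay",      "nl": "Dutch",    "no": "Norwegian",
    "pl": "Polish",  "pt": "Portuguese", "ru": "Russian",  "sv": "Swedish",
    "sw": "Swahili", "tr": "Turkish",    "zh": "Chinese",
}

print(len(SUPPORTED_LANGUAGES))  # 23
```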
## Usage

Import and use the `SUPPORTED_LANGUAGES` dictionary in your code:

```python
from chatterbox import ChatterboxMultilingualTTS, SUPPORTED_LANGUAGES

# Print all supported languages
for code, name in SUPPORTED_LANGUAGES.items():
    print(f"{code}: {name}")

# Check if a language is supported
if "fr" in SUPPORTED_LANGUAGES:
    print(f"French is supported: {SUPPORTED_LANGUAGES['fr']}")

# Get supported languages from the class method
languages = ChatterboxMultilingualTTS.get_supported_languages()
print(languages)
# {'ar': 'Arabic', 'da': 'Danish', ...}
```
## Using Language Codes

Pass the language code to the `language_id` parameter when generating speech:

```python
import torchaudio
from chatterbox import ChatterboxMultilingualTTS

device = "cuda"
model = ChatterboxMultilingualTTS.from_pretrained(device)

# Generate speech in different languages
languages_to_test = {
    "en": "Hello, how are you today?",
    "es": "Hola, ¿cómo estás hoy?",
    "fr": "Bonjour, comment allez-vous aujourd'hui?",
    "de": "Hallo, wie geht es dir heute?",
    "ja": "こんにちは、今日はお元気ですか?",
    "zh": "你好,你今天好吗?",
}

for lang_code, text in languages_to_test.items():
    audio = model.generate(
        text=text,
        language_id=lang_code,
        audio_prompt_path="voice_sample.wav",
    )
    torchaudio.save(f"output_{lang_code}.wav", audio, model.sr)
```
## Cross-Lingual Voice Cloning

You can clone a voice from one language and use it to synthesize speech in any other supported language:

```python
import torchaudio
from chatterbox import ChatterboxMultilingualTTS

device = "cuda"
model = ChatterboxMultilingualTTS.from_pretrained(device)

# Clone an English voice
model.prepare_conditionals("english_speaker.wav")

# Use that voice to speak multiple languages
texts = {
    "en": "This is an English voice.",
    "fr": "C'est une voix anglaise qui parle français.",
    "es": "Esta es una voz inglesa hablando español.",
    "de": "Das ist eine englische Stimme, die Deutsch spricht.",
    "ja": "これは日本語を話す英語の声です。",
}
for lang_code, text in texts.items():
    audio = model.generate(text=text, language_id=lang_code)
    torchaudio.save(f"cross_lingual_{lang_code}.wav", audio, model.sr)
```
## Validation

The model validates language codes automatically: passing an invalid code raises a `ValueError`.

```python
try:
    audio = model.generate(
        text="Hello world",
        language_id="invalid_code",
    )
except ValueError as e:
    print(e)
    # Unsupported language_id 'invalid_code'.
    # Supported languages: ar, da, de, el, en, es, fi, fr, he, hi, it, ja,
    # ko, ms, nl, no, pl, pt, ru, sv, sw, tr, zh
```
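The check itself is simple to reproduce outside the library. The sketch below uses a hypothetical `validate_language_id` helper (not part of the chatterbox API) and a local copy of the language table to produce the same error message:

```python
# Sketch of the validation behavior described above. validate_language_id
# is a hypothetical helper for illustration, not a chatterbox API.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "da": "Danish", "de": "German", "el": "Greek",
    "en": "English", "es": "Spanish", "fi": "Finnish", "fr": "French",
    "he": "Hebrew", "hi": "Hindi", "it": "Italian", "ja": "Japanese",
    "ko": "Korean", "ms": "Malay", "nl": "Dutch", "no": "Norwegian",
    "pl": "Polish", "pt": "Portuguese", "ru": "Russian", "sv": "Swedish",
    "sw": "Swahili", "tr": "Turkish", "zh": "Chinese",
}

def validate_language_id(language_id: str) -> None:
    """Raise ValueError for codes missing from SUPPORTED_LANGUAGES."""
    if language_id not in SUPPORTED_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(
            f"Unsupported language_id '{language_id}'. "
            f"Supported languages: {supported}"
        )

validate_language_id("fr")  # valid code: returns silently
try:
    validate_language_id("invalid_code")
except ValueError as e:
    print(e)
```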
## Notes
- Language codes are case-insensitive (“EN” and “en” both work)
- The model performs automatic text normalization for each language, including language-specific punctuation
- Cross-lingual voice cloning works best when the reference audio is clear and at least 5-10 seconds long
- Some languages may require specific fonts or Unicode support for proper text display
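The case-insensitivity note above can be mirrored on the caller side by lowercasing codes before lookup. A minimal sketch, assuming a hypothetical `normalize_language_id` helper and an abbreviated local copy of the table:

```python
# Abbreviated local copy of the language table, for illustration only.
SUPPORTED_LANGUAGES = {"en": "English", "fr": "French", "de": "German"}

def normalize_language_id(language_id: str) -> str:
    """Lowercase the code before lookup (hypothetical helper)."""
    code = language_id.lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language_id '{language_id}'.")
    return code

print(normalize_language_id("EN"))  # en
```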