Automatically dub any video with natural-sounding AI voices. Speaker detection, voice cloning, and lip sync — all through a single API endpoint.
{
  "video_url": "https://example.com/video.mp4",
  "source_language": "en",
  "target_language": "es",
  "voice_clone": true,
  "lip_sync": true
}
// Response
{
  "success": true,
  "data": {
    "task_id": "dub_abc123",
    "status": "processing",
    "estimated_time": 300
  }
}
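The response above wraps the task details in a "data" object next to a "success" flag. A minimal sketch of reading that envelope might look like the following. The names `DubTask` and `parse_start_response` are hypothetical, the unit of `estimated_time` is assumed to be seconds (the page does not say), and the shape of an error response is not documented here, so the sketch simply raises with the raw payload.

```python
from dataclasses import dataclass


@dataclass
class DubTask:
    """Hypothetical container for the fields shown in the sample response."""
    task_id: str
    status: str
    estimated_time: int  # assumption: seconds; the unit is not stated on the page


def parse_start_response(payload: dict) -> DubTask:
    """Extract the task details from a start-dubbing response."""
    if not payload.get("success"):
        # The error format is undocumented, so surface the whole payload.
        raise RuntimeError(f"Dubbing request was not accepted: {payload!r}")
    data = payload["data"]
    return DubTask(
        task_id=data["task_id"],
        status=data["status"],
        estimated_time=int(data.get("estimated_time", 0)),
    )


# The sample response from above, as a Python dict.
sample = {
    "success": True,
    "data": {
        "task_id": "dub_abc123",
        "status": "processing",
        "estimated_time": 300,
    },
}
task = parse_start_response(sample)
print(task.task_id, task.status, task.estimated_time)
```

Keeping the parsing in one place means a missing "data" object or a false "success" flag fails loudly before any polling starts.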
Fully automated dubbing pipeline — from speech detection to final output.
Provide a video URL along with the source and target languages.
AI detects speakers, transcribes speech, and identifies timing.
AI generates dubbed audio matching each speaker's voice characteristics.
Get the dubbed video with synced audio and optional lip sync.
import time

import requests

API_KEY = "ak_your_key_here"
BASE = "https://mobileapi.aienvoy.dev/api/v1"
headers = {"X-API-Key": API_KEY}

# 1. Start dubbing
resp = requests.post(
    f"{BASE}/dubbing",
    headers=headers,
    json={
        "video_url": "https://example.com/video.mp4",
        "source_language": "en",
        "target_language": "es",
        "voice_clone": True,
        "lip_sync": True,
    },
)
resp.raise_for_status()  # stop early on an HTTP error
task_id = resp.json()["data"]["task_id"]

# 2. Poll until done
while True:
    status = requests.get(
        f"{BASE}/dubbing/{task_id}/status",
        headers=headers,
    ).json()
    if status["data"]["status"] == "completed":
        print(status["data"]["result_url"])
        break
    time.sleep(15)  # wait 15 seconds between checks
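The polling loop above runs until the task reports "completed", so it never returns if a task stalls. One way to bound the wait is sketched below. The helper name `wait_for_dub` and its parameters are hypothetical, and because this page does not list any failure statuses, the sketch only enforces a time limit rather than checking for a specific error state.

```python
import time
from typing import Callable


def wait_for_dub(
    fetch_status: Callable[[], dict],
    timeout: float = 1800.0,
    interval: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the task reports "completed", then return its result_url.

    Raises TimeoutError if the task is still not completed once `timeout`
    seconds have passed.
    """
    deadline = clock() + timeout
    while True:
        data = fetch_status()["data"]
        if data["status"] == "completed":
            return data["result_url"]
        if clock() >= deadline:
            raise TimeoutError(f"Task still {data['status']!r} after {timeout} s")
        sleep(interval)


# Demo with canned responses instead of real HTTP calls.
responses = iter([
    {"data": {"status": "processing"}},
    {"data": {"status": "processing"}},
    {"data": {"status": "completed", "result_url": "https://example.com/dubbed.mp4"}},
])
url = wait_for_dub(lambda: next(responses), sleep=lambda seconds: None)
print(url)
```

With the variables from the example above, `fetch_status` could be `lambda: requests.get(f"{BASE}/dubbing/{task_id}/status", headers=headers).json()`. Passing the fetcher in as a callable keeps the helper independent of any HTTP library and easy to exercise without network access.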
const API_KEY = "ak_your_key_here";
const BASE = "https://mobileapi.aienvoy.dev/api/v1";
const headers = {
  "X-API-Key": API_KEY,
  "Content-Type": "application/json"
};

// 1. Start dubbing
const res = await fetch(`${BASE}/dubbing`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    video_url: "https://example.com/video.mp4",
    source_language: "en",
    target_language: "es",
    voice_clone: true,
    lip_sync: true
  })
});
// Stop early on an HTTP error
if (!res.ok) throw new Error(`Dubbing request failed: ${res.status}`);
const { data } = await res.json();

// 2. Poll every 15 seconds until done
const poll = setInterval(async () => {
  const s = await fetch(
    `${BASE}/dubbing/${data.task_id}/status`,
    { headers }
  ).then(r => r.json());
  if (s.data.status === "completed") {
    clearInterval(poll);
    console.log(s.data.result_url);
  }
}, 15000);
Dub your content for global audiences with native-quality AI voices.
AI replicates each speaker's unique voice characteristics in the target language.
Optional lip sync adjusts mouth movements to match the dubbed audio.
Automatically identifies and separates multiple speakers in the video.
Background music and sound effects are preserved while only speech is replaced.
Dubbed audio is time-aligned with the original, maintaining natural pacing.
Optionally generate subtitles in the target language alongside the dubbed video.
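The request format shown earlier takes a single "target_language", so reaching several audiences presumably means one request per language. That is an assumption, since the page does not describe a multi-language option. The sketch below builds one request body per target language using only the documented fields. The helper name `build_dub_requests` is hypothetical.

```python
def build_dub_requests(video_url, source_language, target_languages,
                       voice_clone=True, lip_sync=False):
    """Return one request body per target language.

    Duplicate languages and the source language itself are skipped.
    """
    bodies = []
    seen = set()
    for lang in target_languages:
        if lang == source_language or lang in seen:
            continue
        seen.add(lang)
        bodies.append({
            "video_url": video_url,
            "source_language": source_language,
            "target_language": lang,
            "voice_clone": voice_clone,
            "lip_sync": lip_sync,  # optional per the feature list, so off by default here
        })
    return bodies


bodies = build_dub_requests(
    "https://example.com/video.mp4", "en", ["es", "fr", "es", "en"]
)
print([body["target_language"] for body in bodies])  # prints ['es', 'fr']
```

Each body can then be sent as the JSON payload of its own POST to the dubbing endpoint, giving one task ID per language to poll.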
Localize YouTube, TikTok, and social media content
Dub training videos and courses for global teams
Dub films, series, and documentaries at scale
Dub internal communications for multinational teams
Get your free API key and dub your first video into any language today.