Instant Cloning

Learn how to add a character and specify a voice sample for instant cloning.

With instant voice cloning, you only need to provide a 5-30 second audio sample. No model training is required, and the clone is ready immediately. During generation, our AI draws on millions of hours of experience to imitate the sample's tone, speed, emotion, pauses, loudness, acoustic environment, breathing sounds, accent, vocalization, and other characteristics. It also tries to understand the context of the target text as fully as possible, then combines the two to produce the most expressive, best-matching speech.

Currently, you can open the character creation panel with the "Add Character" button on the Character Management page, or with the "Quick Create New Character..." button in the bottom left corner of the Speech Generation page.

You need to specify a name for the character you create, and optionally specify a description and an avatar. Currently, names, descriptions and avatars are for display only and do not affect voice cloning behavior.

Next, upload an audio file or record audio to serve as the default style guide sample for this clone. This default style sample defines the character's default voice performance, including timbre, emotion, speed, tone, and prosody. You can add more style samples later on the character details page.

After the audio upload is completed, click the Add button in the bottom right corner and wait for processing to complete.

Sample quality matters more than length. Noisy samples may produce poor results, so please provide high-quality sample audio whenever possible. Currently, sample audio must be longer than 2 seconds, and the file size must not exceed 10 MB. You can use the voice separation, audio noise reduction, voice beautification, and loudness standardization functions of the CapCut PC version to obtain high-quality human voice samples from any audio; we will also soon provide these capabilities directly in our service.
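If you prepare samples in bulk, it can help to check the length and size limits locally before uploading. The following is a minimal sketch, not part of the service: it assumes the sample is an uncompressed WAV file (other formats would need a different reader), it reads "10M" as 10 MiB, and the function names `sample_duration_seconds` and `check_sample` are hypothetical helpers introduced only for illustration.

```python
import os
import wave

MIN_SECONDS = 2.0              # from the limits above: must be longer than 2 seconds
MAX_BYTES = 10 * 1024 * 1024   # assumption: the 10M limit is interpreted as 10 MiB


def sample_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_sample(path):
    """Return a list of limit violations; an empty list means the sample passes."""
    problems = []

    size = os.path.getsize(path)
    if size > MAX_BYTES:
        problems.append(f"file is {size} bytes; the limit is {MAX_BYTES} bytes")

    duration = sample_duration_seconds(path)
    if duration <= MIN_SECONDS:
        problems.append(
            f"audio is {duration:.2f} s; it must be longer than {MIN_SECONDS} s"
        )

    return problems
```

Calling `check_sample("my_voice.wav")` returns an empty list when both limits are met. Passing this check says nothing about audio quality, so noisy recordings still need to be cleaned up separately.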

For detailed considerations and best practices regarding instant cloning sample audio, please refer to this page.
