# Instant Cloning

Instant voice cloning needs only a 5-30 second audio sample and no model training, so a clone is ready immediately.

During generation, our AI draws on millions of hours of experience to mimic the **tone, speed, emotion, pauses, loudness, acoustic environment, breathing sounds, accent, and vocalization** characteristics of the cloned audio sample. It also interprets the context of the target text as far as possible and combines the two to produce the most expressive, best-matching speech.

Currently, you can open the character creation panel in two ways: click the **"Add Character"** button on the [Voice Management](https://app.vocu.ai/voices) page, or select **"Create New Character..."** in the popup that appears when you choose a character on the [Vocu Studio](https://app.vocu.ai/generate) page.

The first step is to select the type of creation. Different character types differ slightly in performance details. Which model versions and types are available depends on our current maintenance plan.

Next, upload an audio file or record a piece of audio to serve as the default style sample for this clone. This default style sample **defines the character's default voice performance, including vocal timbre, emotion, speed, tone, and prosody (you can add more style samples later on the character details page).**
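
Before uploading, it can help to check a sample locally against the limits stated in the note further down this page (longer than 2 seconds, file size not exceeding 20 MB). The following is a minimal sketch, not part of Vocu's tooling: the function names are hypothetical, it handles only PCM WAV files through Python's standard `wave` module, and it assumes the size limit means 20 × 1024 × 1024 bytes.

```python
import os
import wave

# Limits taken from the note on this page. Reading the size limit as
# 20 * 1024 * 1024 bytes is an assumption.
MIN_DURATION_SECONDS = 2.0
MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_sample(path: str) -> list[str]:
    """Return the problems found; an empty list means the file passes."""
    problems = []
    if os.path.getsize(path) > MAX_FILE_SIZE_BYTES:
        problems.append("file is larger than 20 MB")
    if wav_duration_seconds(path) <= MIN_DURATION_SECONDS:
        problems.append("sample is not longer than 2 seconds")
    return problems
```

Calling `check_sample("my_voice.wav")` returns a list of human-readable problems, so an empty list means the file meets both limits. It says nothing about audio quality, which this page treats as more important than length.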

> **We have added a simple audio processing function to the audio uploader, which allows you to quickly edit the audio clips to be uploaded.**

After the audio is uploaded, confirm that the language of your voice sample is supported by the model. The system automatically detects the language in the audio. For more accurate results, you can also select the language manually (Cantonese samples must be selected manually). If the sample contains background sound, turn on the **"Remove Background Sound"** switch, and the system will optimize the sample when creating the character.
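
The language rules on this page (the manual-selection note above and the model-version warning further down) can be summarised as a small lookup. This is an illustrative sketch only: the series labels `"V2"` and `"V3"` and all function names are hypothetical, not Vocu API values. The kana/Hangul check is a rough heuristic for the V2 text restriction; it covers only the main Unicode blocks and cannot tell Japanese kanji apart from Chinese characters.

```python
# Language support per model series, transcribed from the notes on this page.
# The keys "V2" and "V3" are illustrative labels, not API values.
V2_LANGUAGES = {"Chinese", "English"}
V3_LANGUAGES = V2_LANGUAGES | {
    "Cantonese", "Japanese", "Korean", "French", "German", "Spanish", "Portuguese",
}
SUPPORTED_LANGUAGES = {"V2": V2_LANGUAGES, "V3": V3_LANGUAGES}


def is_supported(series: str, language: str) -> bool:
    """Return True if the given model series lists the language as supported."""
    return language in SUPPORTED_LANGUAGES.get(series, set())


def needs_manual_selection(language: str) -> bool:
    """Cantonese samples need their language selected manually."""
    return language == "Cantonese"


def find_kana_or_hangul(text: str) -> list[str]:
    """Return Japanese kana and Korean Hangul characters found in the text."""
    def is_kana_or_hangul(ch: str) -> bool:
        code = ord(ch)
        return (
            0x3040 <= code <= 0x30FF      # Hiragana and Katakana
            or 0x1100 <= code <= 0x11FF   # Hangul Jamo
            or 0xAC00 <= code <= 0xD7A3   # Hangul Syllables
        )
    return [ch for ch in text if is_kana_or_hangul(ch)]
```

For example, `find_kana_or_hangul(text)` returning a non-empty list suggests the text should not be sent to a V2 series model, while `is_supported("V3", "Japanese")` reflects the wider language list of the V3 series.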

Next, give the character a name, and optionally a description and an avatar. Currently, the name, description, and avatar are for display only and do not affect generation results. Finally, confirm the details of this creation on the final page, **click** the submit button in the lower right corner, and wait for processing to complete.

{% hint style="info" %}
Sample quality is more important than length. Noisy samples may produce poor results, so please provide the highest-quality sample speech you can. Currently, the sample must be **longer than 2 seconds** and the **file size must not exceed 20 MB**.

You can also try to extract high-quality vocal samples from any audio using the **vocal separation, audio noise reduction, vocal beautification, loudness normalization**, and similar functions of **audio editing software**.
{% endhint %}

{% hint style="warning" %}
V2 series models **(V2.9) only support Chinese and English**. When using a V2 series model, make sure the input text contains no characters from other languages, such as Japanese or Korean; otherwise it may cause **generation failure** and other issues.

Starting with the V3 series, we have added Cantonese, Japanese, Korean, French, German, Spanish, and Portuguese in addition to Chinese and English, along with more than 30 accent variants of these languages in total. Make sure the model version you use supports the language of your text.
{% endhint %}

For detailed precautions and best practices for instant cloning sample audio, please [refer to this page](/voices/tips.md).