# Quick Start

### Account Registration <a href="#account" id="account"></a>

#### Vocu Account

All you need to register or log in and use every VOCU service is **an email address** and **a password of your choice**. When you open the [VOCU console](https://app.vocu.ai) while logged out, you will see input fields for your email and password. If you have not registered yet, click the **registration button at the bottom of the page** to begin.

#### Third-Party Accounts

In addition to email and password, VOCU supports logging in with accounts from common third-party platforms. When you register this way, VOCU obtains the email address or ID of your third-party account after you log in and uses it as the unique credential for your new VOCU account.

#### Registration Rewards

The first time you log in to VOCU, by any method, you receive **1,500 free points** to try our services. You can earn more free points through daily check-ins, or purchase points in bulk.

### Create Character <a href="#create-character" id="create-character"></a>

[View Detailed Introduction](/voices/create.md)

In addition to using community characters, you can create your own character on the [Voice Management](https://app.vocu.ai/voices) page, upload or record audio samples for it, and then use it for speech generation.

You can also open the character creation panel from the [Vocu Studio](https://app.vocu.ai/generate) page: in the popup that appears when you select a character, click the "Create New Character..." button.

{% hint style="info" %}
Sample quality matters more than length, and noisy samples may produce poor results, so provide the cleanest sample speech you can. Currently, a sample must be **longer than 2 seconds**, and the file must **not exceed 20 MB**. To extract a clean vocal sample from existing audio, you can use the **vocal separation, noise reduction, vocal enhancement, and loudness normalization** features of **audio editing software**.
{% endhint %}
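
If you want to check a sample before uploading it, the two limits above can be tested locally. The following is a minimal sketch, not part of VOCU: it assumes the sample is an uncompressed WAV file (so Python's standard `wave` module can read it) and reads "20 MB" as 20 × 1024 × 1024 bytes. The function name `check_sample` is hypothetical.

```python
import os
import wave

MIN_DURATION_S = 2.0               # from the hint above: longer than 2 seconds
MAX_SIZE_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" read as 20 MiB


def check_sample(path: str) -> list[str]:
    """Return a list of problems found with a WAV sample (empty if none)."""
    problems = []
    size = os.path.getsize(path)
    if size > MAX_SIZE_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_SIZE_BYTES}-byte limit")
    with wave.open(path, "rb") as wav:
        # Duration in seconds = number of frames / frames per second.
        duration = wav.getnframes() / wav.getframerate()
    if duration <= MIN_DURATION_S:
        problems.append(f"sample lasts {duration:.2f} s, must be longer than {MIN_DURATION_S} s")
    return problems
```

An empty list means the file passes both checks. The sketch does not measure noise, which the hint above identifies as the more important factor.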

{% hint style="warning" %}
V2 series models **(V2.9) support only Chinese and English**. When using a V2 series model, make sure the input text contains no characters from other languages, such as Japanese or Korean; otherwise **generation may fail** or other issues may occur.

Starting with the V3 series, Cantonese, Japanese, Korean, French, German, Spanish, and Portuguese are supported in addition to Chinese and English, along with more than 30 accent variants across these languages. Make sure the text you enter is supported by the model version you use.
{% endhint %}
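
To catch the most common case before submitting text to a V2 series model, you could scan it for Japanese kana and Korean Hangul. This is a rough heuristic sketch, not a VOCU feature: the function name `find_unsupported_for_v2` is hypothetical, and the check covers only these two scripts. It cannot tell Japanese kanji from Chinese characters, because both use the same Han code points.

```python
import re

# Hiragana, Katakana, and the Hangul blocks. Han characters are deliberately
# not matched because Chinese and Japanese share them, so this is a heuristic.
_KANA_OR_HANGUL = re.compile(
    r"[\u3040-\u30ff\u1100-\u11ff\u3130-\u318f\uac00-\ud7af]"
)


def find_unsupported_for_v2(text: str) -> list[str]:
    """Return the distinct kana or Hangul characters in text, in first-seen order."""
    seen = []
    for ch in _KANA_OR_HANGUL.findall(text):
        if ch not in seen:
            seen.append(ch)
    return seen
```

An empty result means no kana or Hangul was found; it does not guarantee that the text is purely Chinese and English.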

### Speech Generation <a href="#generate" id="generate"></a>

Once you have your first character, you can generate your first speech on the [Dubbing Studio](https://app.vocu.ai/generate) page.

#### Text Content Editing <a href="#text-edit" id="text-edit"></a>

[View Detailed Introduction](/generate/text-edit.md)

To start generating speech, assign a character and enter text in the text input box on the page. To work with multiple paragraphs, click the **Add Paragraph** button to add them one at a time, or paste your content into **Auto-paragraph Addition**. After you click Add, the system automatically **splits** the text and adds the resulting paragraphs to the list above one by one.

You can then work on each item in the list individually: edit it, assign a character, change its position, delete it, or insert a new item below it.

{% hint style="warning" %}
Currently, each paragraph can contain up to 1,200 characters. Every change to a paragraph requires regenerating that paragraph, so if you expect to make frequent adjustments, we recommend keeping each paragraph to 50 to 100 words, written as one or a few complete, coherent sentences with no extra line breaks or spaces. This makes it easier to fine-tune audio details and edit later.
{% endhint %}
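
If you prepare long scripts outside the editor, you can pre-split them along the lines recommended above. The sketch below is an illustration only and is unrelated to how VOCU's own automatic splitting works: it assumes English text with sentences ending in `.`, `!`, or `?`, and the name `split_into_paragraphs` is hypothetical.

```python
import re

MAX_CHARS = 1200    # per-paragraph limit from the hint above
TARGET_WORDS = 100  # upper end of the recommended 50 to 100 words


def split_into_paragraphs(text: str, target_words: int = TARGET_WORDS) -> list[str]:
    """Group whole sentences into paragraphs of at most target_words words."""
    # Collapse line breaks and repeated spaces, as the hint recommends.
    cleaned = " ".join(text.split())
    # Split after ., !, or ? followed by whitespace (English punctuation only).
    sentences = re.split(r"(?<=[.!?])\s+", cleaned)
    paragraphs, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        too_long = len(candidate.split()) > target_words or len(candidate) > MAX_CHARS
        if current and too_long:
            # Adding this sentence would exceed a limit: close the paragraph.
            paragraphs.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        paragraphs.append(current)
    return paragraphs
```

Sentences are never cut in the middle, so a single sentence that is itself longer than the limits is returned as one oversized paragraph and needs manual editing.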

#### Generation Parameter Configuration <a href="#gen-config" id="gen-config"></a>

[View Detailed Introduction](/generate/config.md)

After editing your content, you can adjust the generation configuration by clicking the gear icon below each paragraph.

**Generation presets** strongly affect the final result. The default is a relatively balanced preset. If you need something different, switch between presets to find the one that suits you best.

{% hint style="info" %}
In many cases, you can generate with the default configuration first and adjust it only if the result falls short of your expectations.
{% endhint %}

<details>

<summary>Manually Adjustable Advanced Generation Settings</summary>

* **Presets:** Control the performance strategy the voice adopts during generation, which determines its basic expressiveness. Each preset emphasizes different parameters and shapes how the output voice interprets and expresses the text. For example, the balanced preset balances pronunciation against text understanding and stays close to the meaning of the content, while the creative preset delivers a more performance-oriented reading based on the context of the text, which can produce distinctive results in different scenarios.
* **Emotion Style:** Determines which details are prioritized when the input is processed. With text-oriented, details follow the semantics of the input text and fit its context more closely; with character-oriented, more attention goes to reproducing the expressiveness of the character's voice sample.
* **Generation Seed:** Controls randomness during generation. The same seed produces similar results. The value can be an integer from 1 to 2147483647; the default, -1, means completely random. Usually no adjustment is needed.
* **Speech Rate:** Controls the speed of the generated speech; larger values are faster. The range is 0.5x to 2x, with 1x being normal speed.

</details>
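
The numeric ranges for Generation Seed and Speech Rate listed above can be expressed as a small validation sketch. This is purely illustrative: the class `GenerationSettings` and its field names are hypothetical and do not correspond to any documented VOCU API.

```python
from dataclasses import dataclass

RANDOM_SEED = -1          # default: completely random
MAX_SEED = 2_147_483_647  # upper bound quoted above (2**31 - 1)


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for the two numeric settings described above."""

    seed: int = RANDOM_SEED
    speech_rate: float = 1.0  # 1.0 is normal speed

    def __post_init__(self) -> None:
        # A seed is either the random marker -1 or an integer in 1..MAX_SEED.
        if self.seed != RANDOM_SEED and not 1 <= self.seed <= MAX_SEED:
            raise ValueError(f"seed must be -1 or between 1 and {MAX_SEED}")
        # Speech rate is limited to the 0.5x to 2x range.
        if not 0.5 <= self.speech_rate <= 2.0:
            raise ValueError("speech_rate must be between 0.5 and 2.0")
```

For example, `GenerationSettings(seed=42, speech_rate=1.25)` is accepted, while `GenerationSettings(seed=0)` raises `ValueError` because 0 is neither -1 nor within the allowed range.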

<details>

<summary>Unique Configuration Added in V3.0 Series Models</summary>

**The V3.0 series models add some unique parameters. Paragraphs assigned a character that uses a V3.0 model can additionally control the following settings:**

* **Language:** Specifies the language of the paragraph. By default, the system detects the language of the input automatically. If detection is inaccurate, or if the content is Cantonese (which cannot currently be detected automatically), select the language manually.
* **Vivid Expression:** Can be enabled for individual paragraphs. When enabled, the model widens its expressive range based on its understanding of the content, making sentences more expressive and engaging (the effect is strongest for Japanese ASMR), but generation stability may decrease.
* **Emotion Control:** Adjusts the emotional tendency of a paragraph's delivery. When enabled, you can manually set emotion ratios for angry, happy, neutral, sad, and matching context, and the model tries to speak with the corresponding emotional mix. The effect can vary greatly between samples, and generation stability may decrease.
* **Consistency Optimization:** **Experimental feature.** When enabled, it optimizes the generation of long content concentrated in a single paragraph, improving consistency and coherence, but it may reduce expressiveness. For more on editing text content, see [Text Content Editing](/generate/text-edit.md).
* **Post-processing Mode:** **Experimental feature.** Controls the output optimization strategy. By default, it optimizes for faithful reproduction of the character's voice; other options change how the final audio sounds. Adjust this setting according to your needs.

</details>
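
To reason about what a set of emotion ratios means, it can help to convert them into shares that sum to 1. The sketch below only illustrates that arithmetic. How VOCU interprets the ratios internally is not documented here, so the normalization, the function name `normalize_emotions`, and the key `match_context` are all assumptions.

```python
# Emotion names from the list above; "match_context" is a hypothetical key
# standing in for the "matching context" option.
EMOTIONS = ("angry", "happy", "neutral", "sad", "match_context")


def normalize_emotions(weights: dict[str, float]) -> dict[str, float]:
    """Scale non-negative emotion weights so that they sum to 1."""
    unknown = set(weights) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotions: {sorted(unknown)}")
    if any(value < 0 for value in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be positive")
    # Emotions that were not specified get a share of 0.
    return {name: weights.get(name, 0.0) / total for name in EMOTIONS}
```

With weights of 3 for happy and 1 for neutral, happy gets a share of 0.75 and neutral 0.25, and every other emotion gets 0.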

With the latest configuration template, you only need to enter your content and assign characters, then click the **Start Generation** button at the bottom to submit the speech generation task. You can then watch the generation progress in real time and quickly preview individual paragraphs as well as the overall result.

#### Task Queue <a href="#task-queue" id="task-queue"></a>

[View Detailed Introduction](/generate/queue.md)

Here you can view and manage your ongoing generation tasks and see the status of all past tasks. Task status updates in real time, with no manual refresh needed. After you start a new generation task, the status of each paragraph appears in the editor. When you open the task queue (clock icon), your latest task is at the top of the list.

You can click a task in the list to view [its details](/generate/task-detail.md), play its final result (if any), or use its dropdown menu to download the audio, copy the task to the editor, delete it from the history, and more.

Each paragraph's task record can be viewed on its own, which makes it easy to adjust individual paragraphs.

{% hint style="info" %}
You can see the generation status of each paragraph in the project editor and play or download each paragraph's audio individually. You can also regenerate a single paragraph without affecting the others.
{% endhint %}

