Quick Start

Learn how to register and generate your first speech with VOCU.

Account Registration

VOCU Account

You only need an email address and a password of your choice to register or log in and use all VOCU services. If you visit the VOCU Console while logged out, you will see the email and password login fields; if you haven't registered yet, click the registration button at the bottom of the page to start the registration process.

Third-party Accounts

In addition to email login and registration, VOCU supports common platform accounts and Yanyu Pass as optional third-party login methods. When you register through a third-party platform account, we obtain that account's email or ID after you log in and use it as the unique credential for your new VOCU account.

With Yanyu Pass, you can log in to Vocu.ai using a mainland China mobile phone number that was previously registered with a Yanyu Technology product, and spend the points already held in that account.

Registration Rewards

The first time you log in to VOCU by any method, you receive 1500 free points to try our services. You can earn more free points through daily check-ins, or purchase points in bulk.

Creating Characters

In addition to using community characters, you can also create a character on the Character Management page, upload or record audio samples for it, and then use this custom character for speech generation.

You can also use the "Quick Create New Character..." button in the bottom left corner of the Speech Generation page to quickly create a character.

Sample quality matters more than length. Noisy samples may produce poor results, so provide high-quality sample audio whenever possible. Currently, sample audio for instant cloning must be longer than 2 seconds, and the file must not exceed 10 MB. You can use the voice separation, audio noise reduction, voice beautification, and loudness standardization features of the CapCut PC version to obtain a high-quality voice sample from almost any audio.
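Before uploading, you may want to check a sample against these limits locally. The following is a minimal sketch, not part of VOCU: it assumes the sample is an uncompressed WAV file, reads the size limit as 10 × 1024 × 1024 bytes, and uses hypothetical helper names (`check_sample`, `make_silent_wav`). It only checks length and size; it cannot judge noise or overall quality.

```python
import io
import wave

MIN_DURATION_SECONDS = 2.0               # from the guide: longer than 2 seconds
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024   # assumption: the 10 MB limit read as 10 MiB


def check_sample(wav_bytes: bytes) -> list[str]:
    """Return a list of problems found in a WAV sample (empty means it passes)."""
    problems = []
    if len(wav_bytes) > MAX_FILE_SIZE_BYTES:
        problems.append("file is larger than 10 MB")
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    if duration <= MIN_DURATION_SECONDS:
        problems.append(f"duration {duration:.2f}s is not longer than 2 seconds")
    return problems


def make_silent_wav(seconds: float, rate: int = 16000) -> bytes:
    """Build a mono 16-bit silent WAV in memory, used only to exercise the check."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()


if __name__ == "__main__":
    print(check_sample(make_silent_wav(3.0)))  # passes both checks
    print(check_sample(make_silent_wav(1.0)))  # reports the short duration
```

For a real recording, read the file with `open(path, "rb").read()` and pass the bytes to `check_sample`.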

Speech Generation

After you have your first character, you can start generating your first speech on the Speech Generation page.

Text Content Editing

Select a character in the text input box at the bottom of the page, enter any text, and click the Add Paragraph button. The system automatically splits the text into sentences and adds them one by one to the list above.

You can then work on each item in the list individually: edit its text, reassign its character, change its order, delete it, or insert a new item below it.
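To get a feel for how a paragraph might end up as separate list items, here is a rough sketch of sentence splitting. The actual rules VOCU uses are not documented here; this example simply assumes that a sentence ends at `.`, `!`, `?`, or their full-width counterparts, and `split_sentences` is a hypothetical name.

```python
import re

# Assumption: a sentence is a run of non-terminator characters followed by
# any number of terminators (ASCII or full-width CJK punctuation).
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*")


def split_sentences(text: str) -> list[str]:
    """Split text into sentences, dropping empty or whitespace-only pieces."""
    pieces = (piece.strip() for piece in _SENTENCE.findall(text))
    return [piece for piece in pieces if piece]


if __name__ == "__main__":
    for sentence in split_sentences("Hello there. How are you? Fine!"):
        print(sentence)
```

A rule this simple also splits at decimal points and abbreviations (for example, "3.14" becomes two pieces), so it is worth reviewing the resulting list and merging or editing items where needed.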

Generation Parameter Configuration

After you finish editing the content, you can adjust the generation configuration in the right sidebar of the page (on mobile, scroll down to find it).

Generation parameters have a significant impact on the final result. If you need to adjust them, we recommend starting from the default parameters of a generation style and changing the values in small steps until you find the effect that suits you best.

In many cases, you can first try using default parameters for generation, and only consider parameter adjustment if the expected effect is not achieved.

The following generation settings are currently adjustable:
  • Diversity: An integer from 0 to 100. The higher the value, the wider the variation between generation results and the more likely they are to be expressive, but also the less stable they become. If results are unstable, try lowering this value gradually, testing as you go, until you find the setting that best suits the character or text at hand.

  • Stability Filtering: Controls the strength of the stability filter applied to generation results. It is an integer from 0 to 100, where 0 disables the filter. When the value is not 0, a smaller value means a stronger filter: results become more stable but less expressive. We do not recommend very small values, as they can noticeably degrade the results. After enabling the filter, try starting at 100 and lowering the value gradually, testing as you go, until you find the setting that best suits the character or text at hand.

  • Probability Preference: An integer from 0 to 100. During generation, candidate paths are ranked from most to least probable, and only the top paths whose cumulative probability reaches the threshold n set by this parameter are considered. The smaller the value, the flatter and more stable the generated speech usually is, but some content or timbres may come out degraded or abnormal.
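The description of Probability Preference resembles what is commonly called nucleus (top-p) sampling. The sketch below illustrates that general technique on a toy probability table; it is an interpretation for intuition only, not VOCU's actual implementation. The function name `nucleus_filter`, the mapping of the 0 to 100 setting onto a 0.0 to 1.0 threshold, and the example candidates are all assumptions.

```python
def nucleus_filter(probs: dict[str, float], preference: int) -> dict[str, float]:
    """Keep the most probable candidates until their cumulative probability
    reaches preference / 100, then renormalize the survivors."""
    if not isinstance(preference, int) or not 0 <= preference <= 100:
        raise ValueError("preference must be an integer from 0 to 100")
    threshold = preference / 100  # assumption: the setting is a percentage

    kept: dict[str, float] = {}
    cumulative = 0.0
    # Walk candidates from most to least probable.
    for name, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[name] = p
        cumulative += p
        if cumulative >= threshold:
            break

    total = sum(kept.values())
    return {name: p / total for name, p in kept.items()}


if __name__ == "__main__":
    candidates = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
    print(sorted(nucleus_filter(candidates, 30)))   # only the top candidate survives
    print(sorted(nucleus_filter(candidates, 70)))   # the top two survive
    print(sorted(nucleus_filter(candidates, 100)))  # every candidate survives
```

Under this reading, a low setting leaves only the few most likely candidates, which matches the flatter, more stable output described above, while a high setting lets rarer and more varied candidates through.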

Once the parameters are set, click the Start Generation button below to submit the speech generation task. After the task is submitted, the system switches to the Task Queue tab, where you can view the status of current and historical tasks.

Task Queue

Here you can view and manage your ongoing generation tasks, as well as the status of all historical tasks. Task status updates in real time, so you don't need to refresh manually. When you start a new generation task, you are usually redirected here, and your latest task appears at the top of the list.

Click any task in the list to view its details or play its final generation result (if any). The dropdown menu of each task offers quick operations such as downloading the audio, copying the content to the editor, and deleting the task history.

On a task's detail page, you can view the generation status of each sentence and play or download the audio of a single sentence. We will also soon support regenerating individual sentences here without affecting the status of the others.
