Quick Start

Learn how to register and generate your first speech in VOCU

Account Registration

Vocu Account

You only need an email address and a password of your choice to log in or register and start using all of VOCU's services. When you visit the VOCU console without being logged in, you will see fields for logging in with your email and password. If you haven't registered yet, click the registration button at the bottom of the page to begin the registration process.

Third-Party Accounts

In addition to email login and registration, VOCU supports logging in with accounts from common third-party platforms. When you register through a third-party account, we obtain that account's email or ID after you log in and use it as your unique credential to create your VOCU account.

Registration Rewards

After you log in to VOCU for the first time through any method, you will receive 1,500 free points to try our various services. You can earn more free points through daily check-ins, or purchase points in bulk.

Create Character

In addition to using community characters, you can also create a character on the Voice Management page, upload or record audio samples for it, and then use this custom character for speech generation.

You can also open the character creation panel from the Vocu Studio page: when selecting a character, choose the "Create New Character..." button in the popup, then create the character there.

Sample quality matters more than length, and noisy samples may produce poor results, so provide the highest-quality sample speech you can. Currently, a sample must be longer than 2 seconds and the file must not exceed 20 MB. You can also extract high-quality vocal samples from any audio by using the vocal separation, noise reduction, vocal enhancement, and loudness normalization functions of audio editing software.
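
The length and size limits above can be checked locally before uploading. The following is a minimal sketch, not part of VOCU: it assumes the sample is a WAV file, reads "20M" as 20 MiB, and uses the hypothetical helper names `check_sample` and `write_silence`.

```python
import os
import tempfile
import wave

MIN_DURATION_S = 2.0               # from the guide: longer than 2 seconds
MAX_SIZE_BYTES = 20 * 1024 * 1024  # assumption: "20M" read as 20 MiB


def check_sample(path):
    """Return a list of problems found; an empty list means the file passes."""
    problems = []
    if os.path.getsize(path) > MAX_SIZE_BYTES:
        problems.append("file is larger than 20 MB")
    with wave.open(path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    if duration <= MIN_DURATION_S:
        problems.append(f"sample is {duration:.2f}s; it must be longer than 2s")
    return problems


def write_silence(path, seconds, rate=16000):
    """Hypothetical demo helper: write a mono 16-bit silent WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo.wav")
        write_silence(demo, 1)
        print(check_sample(demo))  # lists the duration problem
```

A check like this only covers the hard limits; it says nothing about noise or recording quality, which the guide treats as the more important factor.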

Speech Generation

After you have your first character, you can start generating your first speech on the Dubbing Studio page.

Text Content Editing

To start speech generation, you only need to assign a character and enter text in the text input box on the page. If you need to handle multiple paragraphs of text, click the Add Paragraph button to add more, or paste your content into Auto-paragraph Addition. After you click Add, the system automatically splits your text and adds each piece to the list above.
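
How VOCU splits pasted text is not described in this guide. As a rough illustration only, the sketch below assumes one paragraph per non-empty line; `split_paragraphs` is a hypothetical helper, not a VOCU function.

```python
def split_paragraphs(text):
    """Split pasted text into paragraphs, one per non-empty line.

    Assumption for illustration: the real splitting rule may differ.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    pasted = "Good morning.\n\nToday we will cover two topics.\n"
    for index, paragraph in enumerate(split_paragraphs(pasted), start=1):
        print(index, paragraph)
```

Previewing the split this way can help you decide where to put line breaks before pasting long content.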

After that, you can edit each item in the list individually: assign characters, reorder items, delete content, insert a new item below, and so on.

Generation Parameter Configuration

After you finish editing the content, you can adjust the generation settings by clicking the gear icon below each paragraph.

Generation presets have a large impact on the final result. The default is a relatively balanced preset. If you need to adjust it, switch between presets to find the one that suits you best.

In many cases, you can try generating with the default settings first, and adjust them only if the result falls short of your expectations.

Manually Adjustable Advanced Generation Settings
  • Presets: Control the performance strategy the voice adopts during generation, which determines its basic expressiveness. Different presets emphasize different parameters and determine how closely the output voice follows the text. For example, with the balanced preset, the voice balances pronunciation quality against text understanding and stays close to the meaning of the content; with the creative preset, the voice delivers a more performance-oriented reading based on the context of the text, which can produce distinctive results in different scenarios.

  • Emotion Style: Determines which details are emphasized and restored when the input is processed. With text-oriented, details are refined according to the semantics of the input text so that the result fits the text's context; with character-oriented, more attention is paid to reproducing the expressiveness of the character's voice sample.

  • Generation Seed: Controls randomness during generation. Using the same seed produces similar results. To fix the seed, set an integer from 1 to 2147483647; the default value, -1, means completely random. Usually no adjustment is needed.

  • Speech Rate: Controls the speed of the generated speech; larger values are faster. The value can range from 0.5x to 2x, with 1x being normal speed.
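
The seed and speech-rate ranges listed above can be captured in a small validation sketch. This is illustrative only: the class `GenerationSettings` and its field names are hypothetical and do not represent a VOCU API; only the ranges and the preset names "balanced" and "creative" come from this guide.

```python
from dataclasses import dataclass

SEED_RANDOM = -1        # default: completely random
SEED_MAX = 2147483647   # largest fixed seed, equal to 2**31 - 1
RATE_MIN, RATE_MAX = 0.5, 2.0


@dataclass
class GenerationSettings:
    """Hypothetical container mirroring the settings described above."""
    preset: str = "balanced"
    seed: int = SEED_RANDOM
    speech_rate: float = 1.0

    def validate(self):
        """Return a list of out-of-range problems; empty means valid."""
        errors = []
        if self.seed != SEED_RANDOM and not 1 <= self.seed <= SEED_MAX:
            errors.append("seed must be -1 or an integer from 1 to 2147483647")
        if not RATE_MIN <= self.speech_rate <= RATE_MAX:
            errors.append("speech_rate must be between 0.5 and 2")
        return errors


if __name__ == "__main__":
    print(GenerationSettings().validate())
    print(GenerationSettings(preset="creative", seed=0, speech_rate=3).validate())
```

Note that 0 is not a valid seed under these rules: the only value outside 1 to 2147483647 that the guide allows is -1.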

Unique Configuration Added in V3.0 Series Models

The V3.0 series models add some exclusive parameters. Paragraphs assigned a V3.0 model character can additionally control the following settings:

  • Language: Specifies the language of the paragraph's content. By default, the system recognizes the language of the input automatically. If recognition is inaccurate, or if your input is Cantonese, select the language manually (Cantonese cannot currently be recognized automatically).

  • Vivid Expression: Can be enabled for individual paragraphs. When enabled, the model widens its expressive range based on its understanding of the content, making sentences more expressive and engaging (the effect is strongest for Japanese ASMR), but it may reduce generation stability.

  • Emotion Control: Lets you adjust the emotional tendency of a paragraph's delivery. After enabling it, you can manually set emotion ratios for angry, happy, neutral, sad, and matching context. The model tries to speak with the corresponding emotional expression according to the ratios you set. The effect may vary greatly between samples, and enabling it may reduce generation stability.

  • Consistency Optimization: Experimental feature. When enabled, it optimizes generation for long content placed in a single paragraph, improving consistency and coherence, though it may reduce expressiveness. For more on editing text content, see Text Content Editing.

  • Post-processing Mode: Experimental feature. Controls the output optimization strategy. By default, it optimizes for faithful reproduction of the character's voice; the other options change how the final audio sounds. Adjust this setting according to your needs.

With the latest configuration template, you only need to enter content and assign characters, then click the Start Generation button at the bottom to submit the speech generation task. You can watch the generation progress in real time and quickly preview individual paragraphs as well as the overall result.

Task Queue

In the task queue, you can view and manage your ongoing generation tasks and check the status of all past tasks. Task status updates in real time, with no manual refresh needed. After you start a new generation task, the status of each paragraph is shown in the editor. When you open the task queue (clock icon), your latest task appears at the top of the list.

You can click any task in the list to view its details or play its final result (if any). Each task's dropdown menu also lets you download the audio, copy the task to the editor, delete it from the history, and more.

Each paragraph's task record can be viewed independently, making it convenient for you to adjust individual paragraphs.

You can see the generation status of each paragraph in the project editor, and play or download a paragraph's audio individually. You can also regenerate an individual paragraph without affecting the others.
