Text Content Editing
Learn how to edit the text content you want to generate.
After opening the speech synthesis page, type the text you want to synthesize directly on the page to add it as content.


You can also click the Auto-paragraph Addition button at the bottom of the page. In the text input box that pops up, assign a character, select a style (if available), enter your text, and click the Confirm Add button. The system automatically splits the text into paragraphs based on punctuation and adds them one by one to the list above.
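The exact splitting rules are not documented here, so the following is only a minimal sketch of what punctuation-based splitting might look like. The function name `split_paragraphs` and the set of sentence-ending marks are assumptions for illustration, not the product's actual behavior.

```python
import re

# Assumed set of sentence-ending marks (Chinese and English).
# The real product may use a different set or additional rules.
_SENTENCE = re.compile(r'[^。！？!?.]+[。！？!?.]*')

def split_paragraphs(text: str) -> list[str]:
    """Split text into sentence-like paragraphs at ending punctuation."""
    pieces = (match.strip() for match in _SENTENCE.findall(text))
    # Drop pieces that were only whitespace.
    return [piece for piece in pieces if piece]

if __name__ == "__main__":
    for paragraph in split_paragraphs("今天天气很好。我们去公园吧！Sounds good? Yes"):
        print(paragraph)
```

Previewing the split this way before pasting long text can help you predict how many list items the Auto-paragraph Addition step is likely to create.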



To clear all content in the editor, click the Clear All button at the bottom of the page. You can also open the character selection page to quickly create a new character.


For each item in the list, you can then edit the text, assign a character, adjust its order, delete it, insert a new item below it, or generate it individually.
While editing a paragraph's content block, you can click the emoji icon to quickly insert mood words, which add more vivid intonation to the speech. Each content block can also have its own generation configuration; for details, see the Generation Configuration description in the next chapter.



The number on the left side of each paragraph is used for sorting and multi-selection. Click it to select one or more paragraphs, then use the toolbar below to process the selection in bulk, for example to generate the selected paragraphs, replace characters, or batch download the selected paragraphs' audio. To reorder paragraphs, long press the number and drag it.
Currently, each paragraph can contain up to 1,200 characters. Because every change to a paragraph's content requires regenerating that paragraph, we recommend keeping each paragraph to 50 to 100 words if you expect to make frequent adjustments, written as one or several complete, coherent sentences without extra line breaks or extra spaces. This makes it easier to fine-tune audio details and to edit later.
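If you prepare text outside the editor, a small pre-check can catch overlong paragraphs and stray whitespace before you paste. This is a minimal sketch: the helper name `prepare_paragraph` is hypothetical, and it assumes the 1,200-character limit counts Unicode characters the way Python's `len()` does, which may not match how the editor counts.

```python
import re

MAX_CHARS = 1200  # per-paragraph limit stated above

def prepare_paragraph(text: str) -> str:
    """Collapse line breaks and repeated spaces, then enforce the length limit."""
    # Assumption: a single space is an acceptable replacement for a line break.
    # For Chinese text you may prefer to remove line breaks entirely.
    cleaned = re.sub(r"\s+", " ", text).strip()
    if len(cleaned) > MAX_CHARS:
        raise ValueError(
            f"Paragraph has {len(cleaned)} characters; the limit is {MAX_CHARS}."
        )
    return cleaned

if __name__ == "__main__":
    print(prepare_paragraph("Hello,\n\n  world.  "))
```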
V2 series models (V2.9) support only Chinese and English. When using a V2 series model, make sure the input text contains no characters from other languages, such as Japanese or Korean; otherwise generation may fail or other issues may occur.
Starting with the V3 series, we have added Cantonese, Japanese, Korean, French, German, Spanish, and Portuguese in addition to Chinese and English, along with more than 30 accent variants across these languages in total. Please make sure the model version you use supports the language of your text.
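One way to screen text before using a V2 series model is to look for Japanese kana and Korean Hangul characters. The sketch below is illustrative only: the function name `find_japanese_korean` is hypothetical, and the check is deliberately narrow. Japanese kanji share code points with Chinese characters and cannot be detected this way, and other scripts are not covered.

```python
import re

# Unicode ranges: Hiragana and Katakana (U+3040-U+30FF), Hangul Jamo
# (U+1100-U+11FF), Hangul Compatibility Jamo (U+3130-U+318F),
# and Hangul Syllables (U+AC00-U+D7AF).
_KANA_OR_HANGUL = re.compile(r"[\u3040-\u30ff\u1100-\u11ff\u3130-\u318f\uac00-\ud7af]")

def find_japanese_korean(text: str) -> list[str]:
    """Return the distinct kana or Hangul characters found in the text."""
    return sorted(set(_KANA_OR_HANGUL.findall(text)))

if __name__ == "__main__":
    sample = "Hello 你好 こんにちは"
    found = find_japanese_korean(sample)
    if found:
        print("Possibly unsupported by V2 series models:", "".join(found))
```

An empty result does not guarantee that the text is safe for a V2 series model; it only means no kana or Hangul were found.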