Voice Synthesis Overview

Voice AI Labs offers two powerful voice synthesis methods: text-to-speech and voice conversion. Whether you need to convert text to speech or change the timbre of existing audio, we provide professional-grade solutions.

Core Features

📝 Text-to-Speech

Convert any text into natural, fluent speech with support for multiple languages and emotional expressions.

Key Features:

  • Supports 30+ languages and dialects
  • Rich emotional and tonal control
  • Adjustable speech rate and volume
  • Use your cloned characters or community characters
  • Real-time preview and download

Use Cases:

  • Video narration and voiceover production
  • Audiobooks and podcast content
  • Educational training materials
  • Advertising and marketing content
  • Accessibility reading services

🎵 Voice Conversion

Convert the timbre of any audio to your chosen target voice while preserving the original emotion and rhythm.

Key Features:

  • Maintains original audio emotion
  • Precise timbre conversion
  • Supports speech and singing conversion
  • One-click multi-character dubbing
  • High-quality output

Use Cases:

  • Film dubbing and role-playing
  • Song covers and music creation
  • Multi-character content production
  • Voice style unification
  • Creative audio experimentation

Feature Comparison

Feature         | Text-to-Speech                    | Voice Conversion
Input Type      | Text                              | Audio File
Output          | Speech Audio                      | Converted Audio
Emotion Control | Adjustable                        | Preserves Original
Speed Control   | Adjustable                        | Preserves Original
Use Case        | Create voice content from scratch | Change existing audio timbre

Usage Workflow

Text-to-Speech Workflow

  1. Select Character: Choose a voice from your library or Voice Square
  2. Input Text: Enter or paste the text content to convert
  3. Adjust Parameters: Set speech rate, emotion, etc. (optional)
  4. Generate Speech: Click the generate button to start synthesis
  5. Preview & Download: Listen to the result and download the audio file

Voice Conversion Workflow

  1. Select Target Character: Choose the target voice for conversion
  2. Upload Audio: Upload a source audio file or record one online
  3. Start Conversion: Click the convert button to begin processing
  4. Preview & Download: Listen to the converted result and download

Quality Assurance

Best Practices

Text-to-Speech:

  • ✅ Use standard punctuation to control pauses
  • ✅ Use line breaks appropriately for paragraphs
  • ✅ Avoid overly long single sentences
  • ✅ Choose character voices that match the content

Voice Conversion:

  • ✅ Use clear source audio
  • ✅ Avoid excessive background noise
  • ✅ Select target characters with similar timbre
  • ✅ Maintain consistent audio quality

Quota and Limits

Different membership tiers offer varying synthesis quotas:

Plan     | Text-to-Speech           | Voice Conversion
Starter  | 80,000 characters/month  | 440 minutes/month
Standard | 270,000 characters/month | 1,500 minutes/month
Premium  | 540,000 characters/month | 3,000 minutes/month

Quotas are subject to change. Refer to the Pricing page for current rates.

Check your current usage in the user menu.
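
To plan a job against these quotas, a small calculation can help. The sketch below is illustrative only: the figures are copied from the plan table and may be out of date, the function names are hypothetical, and it assumes every character (spaces and punctuation included) counts toward the text-to-speech quota, which this page does not state.

```python
# Monthly quotas copied from the plan table; they are subject to change.
PLAN_QUOTAS = {
    "Starter": {"tts_characters": 80_000, "vc_minutes": 440},
    "Standard": {"tts_characters": 270_000, "vc_minutes": 1_500},
    "Premium": {"tts_characters": 540_000, "vc_minutes": 3_000},
}


def remaining_tts_characters(plan: str, used_characters: int) -> int:
    """Return how many text-to-speech characters are left this month."""
    quota = PLAN_QUOTAS[plan]["tts_characters"]
    return max(quota - used_characters, 0)


def fits_in_quota(plan: str, used_characters: int, text: str) -> bool:
    """Check whether synthesizing `text` would stay within the quota.

    Assumption: every character, including spaces and punctuation, counts.
    """
    return len(text) <= remaining_tts_characters(plan, used_characters)
```

For example, a Starter account that has already used 79,500 characters has room for one more 500-character request, but not a 501-character one.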

Technical Specifications

Supported Input Formats

Text-to-Speech:

  • Plain text
  • Maximum length: 1,000 characters per request
  • Supports multilingual mixing
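
Text longer than 1,000 characters has to be split across several requests. The following is a minimal sketch of one way to do that, breaking at sentence ends so pauses stay natural. The function name and the splitting rule (`.`, `!`, or `?` followed by whitespace) are assumptions for illustration, not part of the product.

```python
import re

MAX_CHARS = 1_000  # per-request limit stated above


def split_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-split any single sentence that is itself too long.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit; close the current chunk first.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be submitted as its own request. Languages that do not put whitespace after sentence punctuation would fall through to the hard split, so the pattern would need adapting for them.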

Voice Conversion:

  • Audio formats: WAV, MP3, OGG
  • Maximum duration: 5 minutes per request
  • Maximum file size: 20MB
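
Checking a source file against these limits before uploading avoids a failed conversion. This is a rough sketch under stated assumptions: the function name is hypothetical, the caller supplies the size and duration, and "20MB" is read as 20 × 1,048,576 bytes, which this page does not specify.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".wav", ".mp3", ".ogg"}
MAX_DURATION_SECONDS = 5 * 60
MAX_FILE_BYTES = 20 * 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def validate_source_audio(
    filename: str, size_bytes: int, duration_seconds: float
) -> list[str]:
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file larger than 20MB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio longer than 5 minutes")
    return problems
```

The check looks only at the file extension, so it does not verify that the file contents actually match the declared format.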

Output Format

  • Format: MP3
  • Sample rate: 24kHz
  • Bit rate: 128kbps
  • Channels: Mono
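
The bit rate gives a quick way to estimate download size. The sketch below assumes a constant bit rate, treats 1 kbps as 1,000 bits per second, and ignores MP3 headers and metadata, so real files may differ slightly. The function name is hypothetical.

```python
BIT_RATE_KBPS = 128  # output bit rate stated above


def estimate_mp3_bytes(
    duration_seconds: float, bit_rate_kbps: int = BIT_RATE_KBPS
) -> int:
    """Estimate file size in bytes for constant-bit-rate audio."""
    bits = bit_rate_kbps * 1000 * duration_seconds
    return int(bits // 8)
```

Under these assumptions, one minute of output comes to about 960,000 bytes, a little under 1 MB.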

Advanced Features

Batch Processing

  • Text-to-speech supports batch generation
  • Voice conversion supports queue processing
  • Automatic generation history saving

History Records

  • View all generation records
  • Re-download historical audio
  • Manage and delete records

Getting Started

Ready to start voice synthesis?

  1. Text-to-Speech: Visit Text-to-Speech for detailed usage instructions
  2. Voice Conversion: Visit Voice Conversion for conversion techniques

FAQ

Q: What's the difference between text-to-speech and voice conversion?
A: Text-to-speech converts text to speech, while voice conversion changes the timbre of existing audio.

Q: Can I use my own cloned characters?
A: Yes, you can use characters you've created or public characters from Voice Square.

Q: Can generated audio be used commercially?
A: This depends on your plan and character authorization. See Terms of Service for details.


Continue reading to learn more about Text-to-Speech and Voice Conversion.