Voice Conversion

Voice Conversion Guide

Convert the timbre of any audio to your chosen target voice while preserving the original audio's emotion, rhythm, and expression. This guide will help you master the voice conversion feature.

Accessing the Feature

Visit the Voice Conversion page to get started.

Basic Usage Workflow

Step 1: Select Target Character

Choose the target voice you want to convert to:

My Voices

Use your cloned characters
Convert to your exclusive voice
Suitable for personal brand content

Voice Square

Use community-shared characters
Try different voice styles
Explore creative possibilities

Step 2: Upload Source Audio

Provide the original audio to convert:

Method 1: File Upload

Supported formats:

File requirements:

Maximum duration: 5 minutes
Maximum size: 20MB
Recommended sample rate: 16kHz or higher
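
Before uploading, you can check a file against these limits locally. The following is a minimal Python sketch, not part of the product. It reads only WAV files, because Python's standard library can parse them (this is not a statement about which formats the service accepts). It assumes "20MB" means 20 × 1024 × 1024 bytes, and the function names are illustrative.

```python
import os
import wave

MAX_DURATION_S = 5 * 60             # guide limit: 5 minutes
MAX_SIZE_BYTES = 20 * 1024 * 1024   # guide limit: 20MB (assumed binary megabytes)
MIN_SAMPLE_RATE_HZ = 16_000         # guide recommendation: 16kHz or higher


def check_limits(size_bytes, duration_s, sample_rate_hz):
    """Return a list of problems; an empty list means the audio fits the limits."""
    problems = []
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 20MB")
    if duration_s > MAX_DURATION_S:
        problems.append("audio is longer than 5 minutes")
    if sample_rate_hz < MIN_SAMPLE_RATE_HZ:
        problems.append("sample rate is below the recommended 16kHz")
    return problems


def check_wav_file(path):
    """Read size, duration, and sample rate from a WAV file and check them."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        duration_s = wav.getnframes() / rate
    return check_limits(os.path.getsize(path), duration_s, rate)
```

The sample rate is only a recommendation in this guide, so treat that message as a warning rather than a hard failure.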

Method 2: Online Recording

Record audio directly:

Click the "Online Recording" tab
Grant microphone permission when prompted
Click the record button to start
Speak, or play the content you want to convert
Click stop to finish recording

Step 3: Start Conversion

Click the "Start Conversion" button to begin processing.

Processing Time:

Short audio (within 30s): 10-20 seconds
Medium audio (1-3 min): 1-2 minutes
Long audio (3-5 min): 1-3 minutes

Conversion Status:

Uploading: Uploading file
Processing: Converting timbre
Completed: Ready to listen and download
Failed: View error message

Step 4: Preview and Download

Online Preview:

Compare original and converted audio
Check how well the timbre was converted
Verify emotion preservation

Download Audio:

Format: MP3
Sample rate: 24kHz
Bit rate: 128kbps
Preserves original audio duration
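
Because the output bit rate is fixed and the duration is preserved, you can estimate the download size from the source length alone. This is a rough sketch that assumes a constant bit rate, takes 1kbps as 1,000 bits per second, and ignores file headers and tags. The function name is illustrative.

```python
BIT_RATE_BPS = 128_000  # 128kbps output bit rate stated in the guide


def estimated_mp3_bytes(duration_s):
    """Approximate size in bytes of a constant-bit-rate MP3 of this length."""
    return int(duration_s * BIT_RATE_BPS / 8)


print(estimated_mp3_bytes(30))   # 480000  (about 0.5 MB)
print(estimated_mp3_bytes(300))  # 4800000 (about 4.8 MB for a 5-minute file)
```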

Core Features

Timbre Conversion

Preserved Content:

✅ Original emotional expression
✅ Speaking rhythm and pace
✅ Speech content and pronunciation
✅ Volume variations and emphasis

Changed Content:

🔄 Voice timbre and characteristics
🔄 Pitch and range
🔄 Voice age and gender traits

Use Cases

Speech Content Conversion

Convert interviews to a unified voice
Multi-character dubbing production
Voice style unification
Anonymization processing

Singing Conversion

Song cover creation
Voice style experimentation
Music production assistance
Creative audio art

Advanced Tips

Choosing the Right Target Character

Timbre Matching Principles:

Similar timbre conversion works better:

Male voice → Male voice
Female voice → Female voice
Adult voice → Adult voice
Child voice → Child voice

Cross-type conversion:

May lose some quality
Suitable for creative experiments
May require multiple attempts

Source Audio Quality Optimization

Best Source Audio:

✅ Clear vocals
✅ Minimal background noise
✅ Stable volume
✅ Standard sample rate (16kHz or higher is recommended)

Avoid Using:

❌ Multiple people speaking simultaneously
❌ Severely distorted audio
❌ Over-compressed audio
❌ Excessive background music

Handling Background Music

Audio with Background Music:

Option 1: Vocal Separation

Use audio editing software to extract vocals
Convert only the vocal part
Mix background music in post-production

Option 2: Direct Conversion

Background music will be preserved
May affect conversion quality
Suitable when the background music is minimal

Common Application Scenarios

Scenario 1: Video Dubbing

Requirement: Dub video content with a unified voice

Workflow:

Record or prepare voiceover speech
Select target character (brand voice)
Convert all dubbing segments
Use in video editing software

Advantages:

Maintain voice consistency
Save professional dubbing costs
Quick iteration and modification

Scenario 2: Multi-Character Content

Requirement: One person voicing multiple characters

Workflow:

Record all character lines with your own voice
Select different target voices for each character
Convert each character's audio separately
Merge to create final content
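
One way to keep this workflow organized is to group your recorded lines by target voice, so each voice is handled in one pass, while keeping the line order for the final merge. The sketch below is only a planning aid. It does not call any conversion API, and the script entries, file names, and voice names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical script: (line order, character, recorded clip)
script = [
    (1, "narrator", "line01.wav"),
    (2, "hero", "line02.wav"),
    (3, "narrator", "line03.wav"),
    (4, "villain", "line04.wav"),
]

# Hypothetical choice of target voice for each character
voice_for = {"narrator": "Voice A", "hero": "Voice B", "villain": "Voice C"}


def batches_by_voice(script, voice_for):
    """Group clips by target voice so each voice is converted separately."""
    batches = defaultdict(list)
    for order, character, clip in script:
        batches[voice_for[character]].append((order, clip))
    return dict(batches)


def merge_order(batches):
    """Return all clips in original script order for the final merge."""
    all_items = [item for items in batches.values() for item in items]
    return [clip for order, clip in sorted(all_items)]


batches = batches_by_voice(script, voice_for)
print(batches["Voice A"])    # [(1, 'line01.wav'), (3, 'line03.wav')]
print(merge_order(batches))  # ['line01.wav', 'line02.wav', 'line03.wav', 'line04.wav']
```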

Advantages:

Reduce production costs
Full control over timing and rhythm
Flexible adjustments and modifications

Scenario 3: Song Covers

Requirement: Sing songs with different voices

Workflow:

Prepare an a cappella version or a version with accompaniment
Select target singer voice
Convert singing audio
Mix to create complete version

Notes:

Respect original copyright
For personal learning or with authorization only
Indicate use of AI technology

Scenario 4: Voice Anonymization

Requirement: Protect speaker identity

Workflow:

Upload audio requiring anonymization
Select a completely different voice
Convert and verify effect
Use converted audio

Applications:

News interview protection
Privacy content processing
Sensitive information sharing

Quality Optimization Tips

Achieving Best Conversion Results

Source Audio Preparation:

Use high-quality recording equipment
Record in quiet environment
Maintain stable volume
Avoid over-processing

Target Character Selection:

Choose characters with similar timbre
Listen to character samples
Run several test conversions and compare them
Select best match

Adjusting Results:

If timbre deviation is large, try other characters
If emotion is lost, check source audio quality
If there's noise, use audio noise reduction tools

History Management

View Conversion History

View on the right side of the page:

Recent conversion records
Character information used
Conversion time and status

Re-download

No need to reconvert
Download historical results directly
Save quota and time

Quota Management

Duration Calculation

Quota is calculated from the source audio duration:

Measured in minutes
Audio shorter than 1 minute counts as 1 minute
Failed conversions do not deduct quota
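
These rules can be sketched as a small function. Note one assumption: the guide only confirms that audio under 1 minute counts as 1 minute, so rounding longer audio up to the next whole minute is a guess made here for illustration. The function name is also illustrative.

```python
import math


def billed_minutes(duration_s, succeeded=True):
    """Estimate the quota minutes deducted for one conversion."""
    if not succeeded:
        return 0  # guide rule: failed conversions don't deduct quota
    # Guide rule: anything under 1 minute counts as 1 minute.
    # Assumption: partial minutes above that are rounded up.
    return max(1, math.ceil(duration_s / 60))


print(billed_minutes(20))                    # 1
print(billed_minutes(300))                   # 5
print(billed_minutes(300, succeeded=False))  # 0
```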

Check Usage

Plan	Text-to-Speech	Voice Conversion
Starter	80,000 characters/month	440 minutes/month
Standard	270,000 characters/month	1,500 minutes/month
Premium	540,000 characters/month	3,000 minutes/month

Quotas are subject to change; refer to the Pricing page for current rates.

Quota Optimization

Merge short audio before conversion
Delete failed conversion records
Upgrade plan for more quota
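
To see why merging short audio helps, consider three clips that are each under a minute. Converted separately, each one counts as a full minute; merged into one file, they count once. The sketch below works through that arithmetic with hypothetical clip lengths. Rounding audio longer than 1 minute up to the next whole minute is an assumption, although this particular example does not depend on it.

```python
import math


def billed_minutes(duration_s):
    # Guide rule: anything under 1 minute counts as 1 minute.
    # Assumption: longer audio is rounded up to the next whole minute.
    return max(1, math.ceil(duration_s / 60))


clips_s = [20, 25, 10]  # hypothetical clip lengths in seconds

separate = sum(billed_minutes(s) for s in clips_s)
merged = billed_minutes(sum(clips_s))

print(separate)  # 3
print(merged)    # 1
```

The merged file still has to fit the upload limits of 5 minutes and 20MB.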

Technical Limitations

Current Limitations

Unsupported Content:

Severely distorted or damaged audio
Non-vocal content (pure music, ambient sounds)

Quality Affecting Factors:

Source audio quality
How closely the target character matches the source voice
Audio complexity

Troubleshooting

Conversion Failed

Common Causes:

Unsupported audio format
File too large or too long
Audio quality too low
Contains content that violates the usage guidelines

Solutions:

Convert audio format
Trim audio length
Improve audio quality
Check content compliance

Unsatisfactory Results

Large Timbre Deviation:

Change target character
Select character with similar timbre
Adjust source audio quality

Lost Emotion:

Check source audio clarity
Make sure the source audio has clear emotional expression
Try different characters

Noise or Distortion:

Use noise reduction tools on source audio
Improve source audio quality
Reduce background noise

Copyright and Usage Guidelines

Usage Standards

Allowed:

✅ Personal learning and experimentation
✅ Authorized commercial use
✅ Voice conversion of original content

Prohibited:

❌ Unauthorized commercial use
❌ Infringing others' copyright
❌ Creating illegal or policy-violating content
❌ Impersonating other people

Disclaimer

Users are responsible for converted content
Comply with local laws and regulations
Respect intellectual property
Use reasonably and legally

Best Practices Summary

Prepare High-Quality Source Audio - Clear, noise-free
Choose Matching Target Character - Similar timbre works better
Preview and Verify Results - Ensure satisfaction before download
Use Quota Reasonably - Optimize audio length
Follow Usage Guidelines - Legal and compliant use

Next Steps

Learn about Text-to-Speech feature
Check Troubleshooting for help
Visit FAQ for more information

Need help? Contact [email protected]
