AWS Polly
Amazon Polly is a cloud-based text-to-speech service that offers high-quality voice synthesis with support for multiple languages and voices.
Authentication
To use AWS Polly, you'll need AWS credentials:
from tts_wrapper import PollyTTS, PollyClient
client = PollyClient(credentials=(
    'aws_region',      # e.g., 'us-east-1'
    'aws_key_id',      # Your AWS Access Key ID
    'aws_secret_key'   # Your AWS Secret Access Key
))
tts = PollyTTS(client)
Tip: Use environment variables or an AWS credentials file rather than hard-coding keys in source:
import os
client = PollyClient(credentials=(
    os.getenv('AWS_REGION'),
    os.getenv('AWS_ACCESS_KEY_ID'),
    os.getenv('AWS_SECRET_ACCESS_KEY')
))
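Note that `os.getenv` returns `None` for an unset variable, so a missing value would be passed to the client silently. A minimal sketch of a stricter loader follows, assuming the three variable names used above. `load_polly_credentials` is a hypothetical helper for illustration, not part of tts_wrapper.

```python
import os

# Environment variable names, in the order the credentials tuple above expects.
REQUIRED_VARS = ("AWS_REGION", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def load_polly_credentials(env=None):
    """Return (region, key_id, secret_key) or raise if any value is missing.

    Hypothetical helper for illustration; not part of tts_wrapper.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in REQUIRED_VARS)
```

The returned tuple has the same shape as the one shown above, so it could be passed as `PollyClient(credentials=load_polly_credentials())`.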
Features
SSML Support
AWS Polly provides comprehensive SSML support:
ssml_text = """
<speak>
Hello <break time="300ms"/> World!
<prosody rate="slow" pitch="+20%">
This is a test of SSML features.
</prosody>
</speak>
"""
tts.speak(ssml_text)
See Amazon Polly SSML Reference for all supported tags.
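SSML is XML, so text containing `&` or `<` has to be escaped before it is placed inside `<speak>`. The sketch below shows one way to do that with the standard library. `build_ssml` and its parameters are hypothetical names chosen for this example, and it only covers the `break` and `prosody` tags used above.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, rate="medium", pause_ms=0):
    """Wrap plain text in <speak>/<prosody>, escaping XML special characters.

    Hypothetical helper for illustration; not part of tts_wrapper.
    """
    body = escape(text)  # turns &, <, > into &amp;, &lt;, &gt;
    pause = f'<break time="{int(pause_ms)}ms"/>' if pause_ms > 0 else ""
    return f"<speak>{pause}<prosody rate={quoteattr(rate)}>{body}</prosody></speak>"


print(build_ssml("Tom & Jerry"))
# <speak><prosody rate="medium">Tom &amp; Jerry</prosody></speak>
```

The resulting string can be passed to `tts.speak` in the same way as the hand-written SSML above.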
Streaming
Polly supports real-time audio streaming:
# Stream synthesis for real-time playback
tts.speak_streamed("This text will be synthesized and played in real-time")
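How `speak_streamed` is implemented is not shown here. As a rough illustration of the general idea only, the sketch below reads a file-like audio stream in fixed-size chunks, so a player could start on the first chunk instead of waiting for the whole clip. `iter_audio_chunks` is a hypothetical helper, and the `BytesIO` object stands in for a real network stream.

```python
import io


def iter_audio_chunks(stream, chunk_size=4096):
    """Yield successive byte chunks from a file-like stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a network audio stream: 10,000 bytes of silence.
fake_stream = io.BytesIO(b"\x00" * 10_000)
chunk_sizes = [len(chunk) for chunk in iter_audio_chunks(fake_stream)]
print(chunk_sizes)  # [4096, 4096, 1808]
```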
Word Timing
Get precise timing information for each word:
def on_word(word: str):
    print(f"Speaking: {word}")
tts.connect("started-word", on_word)
tts.speak("This text will trigger word timing callbacks")
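On the service side, Polly reports word timing as speech marks: newline-delimited JSON objects with `time` (milliseconds from the start of the audio), `type`, `start`, `end`, and `value` fields. How tts_wrapper turns these into callbacks is not covered here. The sketch below only parses that format, with `parse_word_marks` as a hypothetical helper and a hand-written sample as input.

```python
import json


def parse_word_marks(raw):
    """Parse newline-delimited speech-mark JSON into (word, time_ms) pairs."""
    marks = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        mark = json.loads(line)
        # Keep word marks only; sentence, viseme, and ssml marks are skipped.
        if mark.get("type") == "word":
            marks.append((mark["value"], mark["time"]))
    return marks


# Illustrative sample in the speech-mark shape, not captured Polly output.
sample = "\n".join([
    '{"time":0,"type":"sentence","start":0,"end":12,"value":"Hello world."}',
    '{"time":6,"type":"word","start":0,"end":5,"value":"Hello"}',
    '{"time":370,"type":"word","start":6,"end":11,"value":"world"}',
])
print(parse_word_marks(sample))  # [('Hello', 6), ('world', 370)]
```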
Voice Selection
List and select from available voices:
# Get list of available voices
voices = tts.get_voices()
for voice in voices:
    print(f"ID: {voice['Id']}, Language: {voice['LanguageCode']}")
# Set a specific voice
tts.set_voice("Joanna", "en-US")
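When the voice list is long, it helps to narrow it to one language before choosing. A small sketch follows, assuming each entry is a dict with the `Id` and `LanguageCode` keys printed above. `voices_for_language` is a hypothetical helper, and the sample list is hand-written rather than real `get_voices()` output.

```python
def voices_for_language(voices, language_prefix):
    """Return the IDs of voices whose LanguageCode starts with the given prefix."""
    return [v["Id"] for v in voices if v["LanguageCode"].startswith(language_prefix)]


# Hand-written sample in the shape printed above; a real list comes from tts.get_voices().
sample_voices = [
    {"Id": "Joanna", "LanguageCode": "en-US"},
    {"Id": "Amy", "LanguageCode": "en-GB"},
    {"Id": "Hans", "LanguageCode": "de-DE"},
]
print(voices_for_language(sample_voices, "en"))  # ['Joanna', 'Amy']
```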
Best Practices
- Cost Management
  - Use streaming for long text to optimize bandwidth
  - Cache frequently used phrases
  - Monitor usage through the AWS Console
- Performance
  - Reuse client instances
  - Use appropriate sampling rates
  - Consider regional endpoints for lower latency
- Error Handling
try:
    tts.speak("Hello, world!")
except Exception as e:
    if "CredentialsError" in str(e):
        print("Check your AWS credentials")
    elif "QuotaExceeded" in str(e):
        print("AWS Polly quota exceeded")
    else:
        print(f"Error: {e}")
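The advice under Cost Management to cache frequently used phrases can be sketched as a small on-disk cache. This is one possible design, not something tts_wrapper provides: `PhraseCache` is a hypothetical class, and `synthesize` is any callable you supply that turns text into audio bytes.

```python
import hashlib
from pathlib import Path


class PhraseCache:
    """Store synthesized audio on disk, keyed by voice and text.

    Hypothetical helper for illustration; not part of tts_wrapper.
    """

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, voice, text):
        # Hash voice and text together so the same phrase in another voice is a separate entry.
        digest = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.audio"

    def get_or_create(self, voice, text, synthesize):
        """Return cached bytes, calling synthesize(text) only on a cache miss."""
        path = self._path(voice, text)
        if path.exists():
            return path.read_bytes()
        audio = synthesize(text)
        path.write_bytes(audio)
        return audio
```

Because a repeated phrase is read from disk, only the first request for a given voice and text reaches the service.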
Limitations
- Maximum text length of 3000 billed characters per request (SSML tags do not count as billed characters)
- API rate limits apply (check AWS quotas)
- Certain SSML features limited to specific voices
- Neural voices not available in all regions
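Text longer than the per-request limit has to be split before synthesis. The sketch below shows one way to do that for plain text, breaking at sentence ends where possible and hard-splitting only when a single sentence is too long. `split_text` is a hypothetical helper; it is not SSML-aware, so it should not be applied to markup.

```python
import re

MAX_CHARS = 3000  # per-request limit stated above


def split_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_text("One. Two! Three?", limit=10))  # ['One. Two!', 'Three?']
```

Each chunk can then be passed to `tts.speak` in turn.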
Next Steps
- Learn about SSML support
- Explore streaming capabilities
- Check out callback functionality