AWS Polly

Amazon Polly is a cloud-based text-to-speech service that offers high-quality voice synthesis with support for multiple languages and voices.

Authentication

To use AWS Polly, you'll need AWS credentials:

from tts_wrapper import PollyTTS, PollyClient

client = PollyClient(credentials=(
    'aws_region',      # e.g., 'us-east-1'
    'aws_key_id',      # Your AWS Access Key ID
    'aws_secret_key'   # Your AWS Secret Access Key
))

tts = PollyTTS(client)

Tip: Keep credentials out of source code by reading them from environment variables (or an AWS credentials file):

import os

client = PollyClient(credentials=(
    os.getenv('AWS_REGION'),
    os.getenv('AWS_ACCESS_KEY_ID'),
    os.getenv('AWS_SECRET_ACCESS_KEY')
))
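Because `os.getenv` returns `None` for an unset variable, a missing value may only surface later, when the first request is made. The following is a minimal sketch of a fail-fast check, assuming the same three variable names as above. `load_polly_credentials` and `REQUIRED_VARS` are hypothetical names for illustration and are not part of tts_wrapper.

```python
import os
from typing import Mapping, Tuple

# Order matches the (region, key id, secret key) tuple used above.
REQUIRED_VARS = ("AWS_REGION", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def load_polly_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str, str]:
    """Return (region, key_id, secret_key), or raise if any variable is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    region, key_id, secret_key = (env[name] for name in REQUIRED_VARS)
    return region, key_id, secret_key
```

The returned tuple is in the same order as the `credentials=` argument in the examples above, so it could be passed there directly.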

Features

SSML Support

AWS Polly accepts SSML input, which lets you control pauses, speaking rate, pitch, and other aspects of the speech:

ssml_text = """
<speak>
    Hello <break time="300ms"/> World!
    <prosody rate="slow" pitch="+20%">
        This is a test of SSML features.
    </prosody>
</speak>
"""
tts.speak(ssml_text)

See the Amazon Polly SSML reference for the full list of supported tags.
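SSML is XML, so any text interpolated into it needs escaping; a bare `&` or `<` makes the document invalid. The sketch below shows one way to build SSML from untrusted text using only the standard library. `build_ssml` is a hypothetical helper, not a tts_wrapper function.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, rate: str = "medium", pause_ms: int = 0) -> str:
    """Wrap plain text in <speak>/<prosody>, escaping XML special characters."""
    body = escape(text)  # converts &, < and > to entities
    if pause_ms > 0:
        # Optional leading pause before the text is spoken.
        body = f'<break time="{pause_ms}ms"/>' + body
    return f"<speak><prosody rate={quoteattr(rate)}>{body}</prosody></speak>"
```

The result is a plain string and could be passed to `tts.speak(...)` in the same way as the hand-written SSML above.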

Streaming

Polly supports real-time audio streaming:

# Stream synthesis for real-time playback
tts.speak_streamed("This text will be synthesized and played in real-time")

Word Timing

Get precise timing information for each word:

def on_word(word: str):
    print(f"Speaking: {word}")

tts.connect("started-word", on_word)
tts.speak("This text will trigger word timing callbacks")
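For background, Polly itself reports word timing as speech marks: one JSON object per line, with `time` (milliseconds from the start of the audio), `type`, `start`, `end`, and `value` fields. How tts_wrapper consumes them internally is not shown in this page, so the sketch below only illustrates the format. `parse_word_marks` is a hypothetical helper and the sample data is made up.

```python
import json
from typing import List, Tuple


def parse_word_marks(raw: str) -> List[Tuple[str, float]]:
    """Return (word, start_seconds) pairs from line-delimited speech marks."""
    words = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        mark = json.loads(line)
        # Speech marks can also describe sentences, visemes, or SSML marks.
        if mark.get("type") == "word":
            words.append((mark["value"], mark["time"] / 1000.0))
    return words


# Made-up sample in the speech-mark format.
sample = "\n".join([
    '{"time":6,"type":"word","start":0,"end":5,"value":"Hello"}',
    '{"time":373,"type":"word","start":6,"end":11,"value":"World"}',
])
print(parse_word_marks(sample))
```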

Voice Selection

List and select from available voices:

# Get list of available voices
voices = tts.get_voices()
for voice in voices:
    print(f"ID: {voice['Id']}, Language: {voice['LanguageCode']}")

# Set a specific voice
tts.set_voice("Joanna", "en-US")
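When the voice list is long, it can help to narrow it to one language before choosing. This is a minimal sketch that assumes each voice is a dictionary with the `Id` and `LanguageCode` keys used in the loop above. `voices_for_language` is a hypothetical helper and the sample list is illustrative only.

```python
from typing import Dict, List


def voices_for_language(voices: List[Dict[str, str]], language_code: str) -> List[str]:
    """Return the IDs of voices whose LanguageCode matches, case-insensitively."""
    wanted = language_code.lower()
    return [v["Id"] for v in voices if v.get("LanguageCode", "").lower() == wanted]


# Illustrative sample in the shape used by the loop above.
sample_voices = [
    {"Id": "Joanna", "LanguageCode": "en-US"},
    {"Id": "Amy", "LanguageCode": "en-GB"},
    {"Id": "Matthew", "LanguageCode": "en-US"},
]
print(voices_for_language(sample_voices, "en-us"))
```

In real use, the list from `tts.get_voices()` would replace `sample_voices`, and one of the returned IDs could be passed to `tts.set_voice(...)`.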

Best Practices

Cost Management

- Use streaming for long text to optimize bandwidth
- Cache frequently used phrases
- Monitor usage through AWS Console

Performance

- Reuse client instances
- Use appropriate sampling rates
- Consider regional endpoints for lower latency
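Polly bills per character synthesized, so replaying stored audio for a repeated phrase avoids paying for it again. The following is a rough sketch of a file-based cache keyed on voice and text. `cached_audio` is a hypothetical helper, and `synthesize` is a placeholder for whatever callable in your setup returns audio bytes.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_audio(text: str, voice: str, synthesize: Callable[[str], bytes],
                 cache_dir: Path) -> bytes:
    """Return audio for (voice, text), calling synthesize only on a cache miss."""
    # Hash voice and text together so the same phrase in another voice is a separate entry.
    key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.bin"
    if path.exists():
        return path.read_bytes()
    audio = synthesize(text)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

The cache key here ignores other settings that change the audio, such as sample rate or output format. If you vary those, they would need to be part of the key as well.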

Error Handling

try:
    tts.speak("Hello, world!")
except Exception as e:
    if "CredentialsError" in str(e):
        print("Check your AWS credentials")
    elif "QuotaExceeded" in str(e):
        print("AWS Polly quota exceeded")
    else:
        print(f"Error: {e}")
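Matching on `str(e)` depends on the exact wording of the message, which can change. As an alternative, the sketch below classifies an error by its class name and, when present, by a botocore-style `response["Error"]["Code"]` value. It assumes that the underlying AWS SDK exceptions reach your code unchanged, which this page does not confirm. `classify_error` is a hypothetical helper, and the codes in `AUTH_CODES` and `THROTTLE_CODES` are assumptions to adjust against the errors you actually observe.

```python
# Assumed error codes; verify against the errors your account actually returns.
AUTH_CODES = {"UnrecognizedClientException", "InvalidSignatureException", "AccessDeniedException"}
THROTTLE_CODES = {"ThrottlingException", "Throttling", "TooManyRequestsException"}


def classify_error(exc: Exception) -> str:
    """Map an exception to 'credentials', 'throttled', or 'other'."""
    name = type(exc).__name__
    # botocore's ClientError carries a dict shaped like {"Error": {"Code": "..."}}.
    response = getattr(exc, "response", None)
    code = ""
    if isinstance(response, dict):
        code = str(response.get("Error", {}).get("Code", ""))
    if "Credentials" in name or code in AUTH_CODES:
        return "credentials"
    if code in THROTTLE_CODES:
        return "throttled"
    return "other"
```

Inside the `except` block above, the returned label could replace the string checks, for example `if classify_error(e) == "credentials": ...`.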

Limitations

- Maximum of 3000 billed characters per synthesis request (SSML tags are not billed)
- API rate limits apply (check the AWS service quotas for your account)
- Some SSML features are limited to specific voices
- Neural voices are not available in all regions
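Text longer than the 3000-character limit has to be split before synthesis. The following is a minimal sketch that splits plain text at sentence boundaries where possible. `split_text` is a hypothetical helper, not part of tts_wrapper, and it is not SSML-aware: splitting SSML this way could cut through a tag.

```python
import re
from typing import List


def split_text(text: str, limit: int = 3000) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `tts.speak(...)` in turn. A small `limit` is handy for trying it out: `split_text("One. Two. Three.", limit=10)` keeps the first two sentences together and puts the third in its own chunk.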
