SSML Support
Speech Synthesis Markup Language (SSML) support in TTS Wrapper is engine-dependent. Each engine has its own SSML implementation that maps to what the underlying service supports.
Using SSML
Each TTS engine provides an SSML handler through the ssml property:
from tts_wrapper import PollyClient, PollyTTS
client = PollyClient(credentials=('region', 'key_id', 'access_key'))
tts = PollyTTS(client)
# Create SSML using the handler
ssml_text = tts.ssml.add('Hello <break time="500ms"/> world!')
tts.speak(ssml_text)
Engine-Specific SSML
AWS Polly
Supports the SSML tags documented in the Amazon Polly SSML Reference. Tag availability depends on the voice type; neural voices, for example, support only a subset.
Google Cloud
Follows the Google Cloud Text-to-Speech SSML Reference.
Microsoft Azure
Implements SSML according to the Microsoft Speech Service SSML Reference.
IBM Watson
Uses SSML as specified in the IBM Watson SSML Reference.
AVSynth (macOS)
Converts SSML to AVSpeechSynthesizer commands:
- <break> → [[slnc ms]]
- <prosody> → Rate/pitch/volume commands
- Other tags are stripped to plain text
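The break mapping above can be illustrated with a small standalone sketch. This is not the library's actual converter: the function name ssml_to_avsynth and both regular expressions are assumptions made for illustration. It handles only <break time="..."/> with ms or s units and strips every other tag, including <prosody>, which the real engine maps to rate, pitch, and volume commands.

```python
import re

# Hypothetical helper, not part of tts_wrapper: shows the <break> mapping only.
_BREAK_RE = re.compile(r'<break\s+time="(\d+(?:\.\d+)?)(ms|s)"\s*/>')
_TAG_RE = re.compile(r"<[^>]+>")


def ssml_to_avsynth(ssml: str) -> str:
    """Replace <break> tags with [[slnc ms]] and strip all remaining tags."""

    def to_silence(match: re.Match) -> str:
        value, unit = float(match.group(1)), match.group(2)
        # The silence command takes milliseconds, so convert seconds if needed.
        ms = value * 1000 if unit == "s" else value
        return f"[[slnc {int(ms)}]]"

    converted = _BREAK_RE.sub(to_silence, ssml)
    # Everything else (including <speak> and <prosody>) is dropped in this sketch.
    return _TAG_RE.sub("", converted)


print(ssml_to_avsynth('<speak>Hello <break time="500ms"/> world!</speak>'))
```

A regex is enough here because the sketch targets one self-closing tag; a real converter that also handles nested <prosody> elements would more likely walk a parsed XML tree.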
eSpeak
Supports basic SSML tags with its own extensions. See the eSpeak Documentation.
SAPI (Windows)
Limited SSML support through the Windows Speech API.
ElevenLabs, Play.HT, Wit.ai
These engines do not support SSML natively. SSML tags will be stripped and the plain text content will be used.
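To make the stripping behavior concrete, here is a minimal sketch of reducing SSML to its text content with the Python standard library. The function name strip_ssml is hypothetical, and the library's own implementation may differ; this version assumes the input is well-formed XML (an unescaped & would raise a parse error).

```python
import xml.etree.ElementTree as ET


def strip_ssml(ssml: str) -> str:
    """Return only the text content of an SSML string or fragment."""
    # Wrap in a dummy root so fragments without <speak> still parse.
    root = ET.fromstring(f"<root>{ssml}</root>")
    text = "".join(root.itertext())
    # Collapse the extra whitespace left behind where tags were removed.
    return " ".join(text.split())


print(strip_ssml('<speak>Hello <break time="500ms"/> world!</speak>'))
```

Note that timing information such as the 500 ms pause is lost entirely, which is why punctuation-based pauses are the usual fallback for these engines.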
SSML Helper Methods
Each engine's SSML handler provides helper methods for common operations:
# Clear any existing SSML content
tts.ssml.clear_ssml()
# Add text (may be plain text or SSML depending on engine)
tts.ssml.add("Text to speak")
# Get plain text (strips SSML tags)
plain_text = tts.ssml.get_text()
Best Practices
- Check Engine Support: Always check the specific engine's documentation for supported SSML features
- Graceful Degradation: Provide plain text alternatives for unsupported SSML features
- Engine-Specific Features: Use engine-specific SSML features when needed for better control
- Test Thoroughly: Test SSML across different engines if your application needs to support multiple engines
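One way to apply the graceful-degradation advice is to keep a small support table and choose between an SSML variant and a plain-text variant of each utterance. The sketch below is an assumption-laden illustration: the engine labels, the SSMLSupport enum, and choose_text are hypothetical names, and the support levels simply restate the summary on this page rather than querying the library.

```python
from enum import Enum


class SSMLSupport(Enum):
    NATIVE = "native"    # engine follows its service's SSML reference
    LIMITED = "limited"  # converted or partial SSML support
    NONE = "none"        # tags are stripped to plain text


# Illustrative labels only; these keys are not tts_wrapper identifiers.
ENGINE_SUPPORT = {
    "polly": SSMLSupport.NATIVE,
    "google": SSMLSupport.NATIVE,
    "azure": SSMLSupport.NATIVE,
    "watson": SSMLSupport.NATIVE,
    "avsynth": SSMLSupport.LIMITED,
    "espeak": SSMLSupport.LIMITED,
    "sapi": SSMLSupport.LIMITED,
    "elevenlabs": SSMLSupport.NONE,
    "playht": SSMLSupport.NONE,
    "witai": SSMLSupport.NONE,
}


def choose_text(engine: str, ssml_version: str, plain_version: str) -> str:
    """Return the SSML variant only for engines known to accept SSML."""
    # Unknown engines are treated as having no SSML support.
    support = ENGINE_SUPPORT.get(engine, SSMLSupport.NONE)
    return plain_version if support is SSMLSupport.NONE else ssml_version


print(choose_text("elevenlabs", 'Wait <break time="500ms"/> here', "Wait. Here"))
```

Defaulting unknown engines to plain text is the conservative choice: the worst case is a missing pause rather than markup being read aloud.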
Example: Engine-Specific SSML
Here's how to handle SSML across different engines:
from tts_wrapper import AVSynthTTS, PollyTTS

def speak_with_pause(tts, text: str, pause_ms: int = 500) -> None:
    """Demonstrate SSML handling across engines."""
    # Get the SSML handler for this engine and reset any previous content
    ssml = tts.ssml
    ssml.clear_ssml()
    if isinstance(tts, PollyTTS):
        # Use Polly-specific SSML
        ssml_text = f'<speak>First part <break time="{pause_ms}ms"/> Second part</speak>'
    elif isinstance(tts, AVSynthTTS):
        # Use AVSynth command format (silence length in milliseconds)
        ssml_text = f'First part [[slnc {pause_ms}]] Second part'
    else:
        # For engines without SSML support, just use plain text
        ssml_text = 'First part. Second part'
    tts.speak(ssml.add(ssml_text))
Next Steps
- Learn about audio control features for playback manipulation
- Explore streaming capabilities for real-time synthesis
- Check out callback functionality for speech events