Amazon Polly Text-to-Speech
Amazon Polly Demo — Preview AWS Neural & Standard Voices Online
Among the Big Tech offerings, Amazon Polly is a strong choice for our money: a scalable text-to-speech service inside AWS, ideal for developers and teams that need reliability and integration with the AWS ecosystem. Polly is available in the AWS Console, via the AWS CLI, and through the AWS SDKs if you want a programmatic interface for apps and services. Choose from neural and standard voices across dozens of languages and accents, including English (US, UK), Spanish, French, German, Japanese, and more. Built-in SSML support lets you adjust pitch, speed, emphasis, and pauses, and custom lexicons let you fix tricky pronunciations. Polly is popular for e-learning, YouTube narration, podcasts, and accessibility projects, and it can handle anything from short scripts to long-form audiobook content. Pricing is pay-as-you-go: about $4 per 1M characters (Standard) and $16 per 1M characters (Neural), with a free tier for new AWS accounts.
Compare with our ElevenLabs Text-to-Speech and Microsoft Azure Text-to-Speech pages, or explore the full TTS providers directory.
- Neural & standard voices across dozens of languages and accents
- SSML controls for pitch, rate, emphasis, and pauses
- Custom lexicons for brand and technical terms
- Streaming or downloadable MP3/OGG output
- Pay-as-you-go pricing; AWS free tier for new accounts
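For the SDK route, a minimal Python sketch might look like the following. It assumes the boto3 library and configured AWS credentials; the helper name `synthesize_mp3` and the default voice "Joanna" are illustrative choices, not part of this page. The helper takes the client as an argument so it can be exercised without calling AWS.

```python
from typing import Any


def synthesize_mp3(client: Any, text: str,
                   voice_id: str = "Joanna", engine: str = "neural") -> bytes:
    """Ask Polly to synthesize `text` and return the MP3 bytes.

    `client` is assumed to behave like boto3.client("polly").
    """
    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",  # "ogg_vorbis" and "pcm" are the other audio formats
        VoiceId=voice_id,
        Engine=engine,       # "standard" or "neural"
    )
    # AudioStream is a file-like streaming body; read() returns the raw audio.
    return response["AudioStream"].read()


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   polly = boto3.client("polly", region_name="us-east-1")
#   with open("hello.mp3", "wb") as f:
#       f.write(synthesize_mp3(polly, "Hello from Polly."))
```

Not every voice is offered on every engine, so it is worth checking the voice list for your region before hard-coding a voice and engine pair.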
Amazon Polly FAQ
What is Amazon Polly?
Amazon Polly is Amazon Web Services' text-to-speech platform that turns text into natural-sounding audio with support for neural voices.
Is Amazon Polly free?
New AWS accounts typically receive 5 million characters per month free for the first 12 months on standard voices, with a smaller allowance for neural voices. After that, Polly is billed per million characters.
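To get a rough feel for the bill, the per-million rates quoted on this page can be turned into a small estimator. This is a sketch only: the rates are the approximate figures above and may change, and `estimate_cost` and its `free_characters` parameter are hypothetical names introduced here for illustration.

```python
# Approximate rates quoted on this page, in USD per 1 million characters.
# Check the current AWS pricing page before relying on them.
RATE_PER_MILLION = {"standard": 4.00, "neural": 16.00}


def estimate_cost(characters: int, engine: str = "standard",
                  free_characters: int = 0) -> float:
    """Estimate the charge for one month of synthesis.

    `free_characters` is whatever free-tier allowance still applies
    to the account that month (0 if none).
    """
    if characters < 0 or free_characters < 0:
        raise ValueError("character counts must be non-negative")
    billable = max(0, characters - free_characters)
    return billable / 1_000_000 * RATE_PER_MILLION[engine]


print(estimate_cost(2_500_000, "neural"))                            # 40.0
print(estimate_cost(6_000_000, "standard", free_characters=5_000_000))  # 4.0
```

As a sense of scale, a long audiobook manuscript of around 600,000 characters would come to a few dollars on standard voices at these rates.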
Can I use neural voices with Polly?
Yes, Polly offers advanced neural voices that sound more natural and human-like than standard voices.
Does Polly support SSML?
Yes, Polly supports SSML tags so you can control pitch, speed, emphasis, and pauses for more tailored output. Some tags, such as emphasis and pitch adjustment, are limited or unavailable on neural voices.
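As an illustration of those controls, the sketch below builds a small SSML document in Python using the standard `<prosody>`, `<break>`, and `<emphasis>` tags. The helper name `build_ssml` and its parameters are hypothetical; only the tags themselves come from SSML.

```python
from xml.sax.saxutils import escape


def build_ssml(intro: str, key_point: str,
               pause_ms: int = 500, rate: str = "slow") -> str:
    """Return SSML that reads `intro` at `rate`, pauses, then stresses `key_point`."""
    return (
        "<speak>"
        f'<prosody rate="{rate}">{escape(intro)}</prosody>'
        f'<break time="{pause_ms}ms"/>'
        f'<emphasis level="strong">{escape(key_point)}</emphasis>'
        "</speak>"
    )


# escape() turns characters like "&" into entities so the markup stays valid.
print(build_ssml("Welcome to the course.", "Terms & conditions apply."))
```

When sending a string like this through the SDK, the request needs `TextType="ssml"` so Polly parses the tags instead of reading them aloud. Because `<emphasis>` is, as far as we know, not supported on neural voices, this particular example is best paired with a standard voice.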