Audio
All articles
OpenAI Whisper - Open-Source Speech Recognition
Whisper is an open-source automatic speech recognition model by OpenAI that provides robust, multilingual …

Cloud Speech-to-Text and Text-to-Speech - Voice AI Services
Google Cloud Speech-to-Text converts audio to text using deep learning, while Text-to-Speech synthesizes …

Audio Transcription Pipeline Patterns
End-to-end patterns for audio transcription at scale. Pre-processing, model selection, speaker diarization, …

Amazon Polly - Text-to-Speech for Applications
Using Amazon Polly to generate natural-sounding speech from text in AI applications, with SSML control and …

Text-to-Speech (TTS)
What text-to-speech technology is, how AWS Polly, Azure Speech, and GCP Text-to-Speech compare, and key …

Speech-to-Text (STT)
What speech-to-text technology is, how AWS Transcribe, Azure Speech, and GCP Speech-to-Text compare, and key …

FFmpeg - Video Processing Swiss Army Knife
Using FFmpeg in AWS Lambda layers and EC2 for video processing in AI pipelines, including common operations …

Amazon Transcribe - Speech-to-Text for Enterprise
Amazon Transcribe capabilities, accuracy characteristics, pricing, and the integration patterns that work well …
Open source projects