Labeled speech and audio datasets for ASR, speaker recognition, emotion detection, and more. WAV and FLAC files with transcript annotations, quality-verified.
Browse Audio Datasets → Sell Your Dataset
Training data for voice, speech, and audio understanding models across every major task.
Transcribed speech datasets across accents, languages, and environments. Clean and noisy conditions for robust model training.
Multi-speaker recordings with speaker-turn labels for meeting transcription, call center AI, and podcast indexing.
Audio clips labeled by emotional state (anger, happiness, sadness, or neutral) for sentiment-aware voice AI.
Short audio clips for wake word detection and keyword spotting, labeled with target and non-target classes.
Regional accent collections for improving ASR robustness across English dialects and non-native speech.
Music genre classification, instrument recognition, and environmental sound detection datasets.
Browse verified audio and speech datasets — or sell your labeled audio data today.