
AudioSet

2.1M 10-second YouTube clips labeled across 527 audio event classes.

LQS 84 (gold) · ✓ Commercial OK · 2.1M audio clips · 2.4 TB · CSV, TFRecord · Released 2017
Source: research.google.com · maintained by Google Research

About this dataset

AudioSet is Google's large-scale sound event dataset. It contains 2.1M 10-second audio clips extracted from YouTube, each labeled with one or more of 527 audio event classes organized into an ontology (speech, music, vehicle, animal, environment, etc.). Google releases only the metadata and YouTube video IDs; the audio itself must be fetched independently.
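Because only metadata and video IDs are released, a typical first step is reading the segment CSV and turning each row into a clip reference. The sketch below assumes the row layout used by the published segment files (`YTID, start_seconds, end_seconds, positive_labels`, with `#`-prefixed header lines and the label IDs quoted as one comma-separated field). The sample row, the `Segment` type, and the helper names are illustrative, not part of the dataset or any official tooling.

```python
import csv
import io
from typing import Iterable, Iterator, List, NamedTuple


class Segment(NamedTuple):
    ytid: str              # YouTube video ID
    start_seconds: float   # clip start within the video
    end_seconds: float     # clip end within the video
    labels: List[str]      # ontology label IDs attached to the clip


def parse_segments(lines: Iterable[str]) -> Iterator[Segment]:
    """Yield one Segment per data row, skipping '#' header lines and blanks."""
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    # skipinitialspace lets the quoted label field follow ", " cleanly.
    for ytid, start, end, labels in csv.reader(data_lines, skipinitialspace=True):
        yield Segment(ytid, float(start), float(end), labels.split(","))


def youtube_url(segment: Segment) -> str:
    """Build the watch URL for the source video of a segment."""
    return f"https://www.youtube.com/watch?v={segment.ytid}"


# Illustrative sample: the video ID is made up for this sketch.
SAMPLE = '''# YTID, start_seconds, end_seconds, positive_labels
abcdefghijk, 30.000, 40.000, "/m/09x0r,/t/dd00088"
'''

segments = list(parse_segments(io.StringIO(SAMPLE)))
for seg in segments:
    print(youtube_url(seg), seg.start_seconds, seg.end_seconds, seg.labels)
```

Fetching the audio is left out on purpose: videos can disappear over time, so any downloader built on top of this should expect missing clips.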

Maintainer
Google Research
Formats
CSV · TFRecord

LabelSets Quality Score

LQS is our 7-dimension quality score, computed from the dataset's published statistics.

84 out of 100 · gold tier

Solid dataset with some trade-offs

Composite score computed from the 7 dimensions below: completeness, uniqueness, validation health, size adequacy, format compliance, label density, and class balance.

Completeness 88
No public completeness metric; using prior for 'crowdsourced_qc' datasets.
Uniqueness 93
Exact-hash deduplication documented by maintainer.
Validation 82
Crowdsourced labels with quality-control protocol (redundancy, golden tests).
Size adequacy 94
2,100,000 clips — exceeds 100,000 adequacy target for Audio.
Format compliance 95
Industry-standard format — drop-in compatible with mainstream tooling.
Label density 62
Average 2.5 labels per item (moderate).
Class balance 58
Long-tail distribution — dominant classes overrepresented.
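
To make the last two dimensions concrete, here is a minimal sketch of how label density and class imbalance could be measured from per-clip label lists. This is not the LQS scoring formula, which is not given here; the max-to-min frequency ratio is an assumed, simple imbalance measure, and the toy data is invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def label_stats(label_lists: Sequence[List[str]]) -> Tuple[float, float, Dict[str, int]]:
    """Return (average labels per item, max/min class frequency ratio, per-class counts)."""
    if not label_lists:
        raise ValueError("need at least one item")
    counts = Counter(label for labels in label_lists for label in labels)
    if not counts:
        raise ValueError("no labels found")
    density = sum(counts.values()) / len(label_lists)
    # Assumed imbalance measure: most frequent class divided by least frequent.
    imbalance = max(counts.values()) / min(counts.values())
    return density, imbalance, dict(counts)


# Toy data (hypothetical): four clips with class names instead of ontology IDs.
toy = [
    ["speech", "music"],
    ["speech"],
    ["speech", "dog", "vehicle"],
    ["music", "speech"],
]

density, imbalance, counts = label_stats(toy)
print(density, imbalance, counts)
```

A long-tail dataset shows up here as a large imbalance ratio, since a few dominant classes appear far more often than the rarest ones.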

What it's used for

Common tasks and benchmarks where AudioSet is the default or competitive choice.

Sample statistics

What's actually in the dataset — from the maintainer's published stats.

2.1M 10-second clips across 527 classes, averaging 2-3 labels per clip. Clips are sourced from YouTube, so some videos may become unavailable over time. VGGish embeddings are also published.

License

AudioSet's labels are distributed under CC BY 4.0. This is a third-party public dataset; LabelSets indexes and scores it but does not host or redistribute the data. Always verify current license terms with the maintainer before commercial use.



Frequently Asked Questions

Can I use AudioSet commercially?
AudioSet's labels are distributed under CC BY 4.0, which generally permits commercial use. Always verify the current license terms with the maintainer (Google Research) before using the dataset in a commercial product.

How large is AudioSet?
AudioSet contains 2,100,000 10-second audio clips across 527 classes, averaging 2-3 labels per clip. The clips are sourced from YouTube, so some videos may become unavailable over time. VGGish embeddings are also published.

Where can I get AudioSet?
AudioSet is maintained by Google Research and is available at https://research.google.com/audioset/download.html. LabelSets indexes and scores this dataset for discoverability but does not redistribute it.

What is the LabelSets Quality Score (LQS)?
LQS is a 7-dimension quality score (completeness, uniqueness, validation, size adequacy, format compliance, label density, class balance) computed from the dataset's published statistics. Composite scores map to tiers: platinum (≥90), gold (≥75), silver (≥60), bronze (<60).
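
The tier thresholds above amount to a simple lookup. A minimal sketch follows, assuming the composite score lies on a 0-100 scale; the function name `lqs_tier` is hypothetical and not part of any LabelSets API.

```python
def lqs_tier(score: float) -> str:
    """Map a composite LQS score (0-100) to its tier name."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "platinum"
    if score >= 75:
        return "gold"
    if score >= 60:
        return "silver"
    return "bronze"


# AudioSet's composite score of 84 falls in the gold band.
print(lqs_tier(84))
```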