Introducing the LQS: why we built a quality score for ML datasets
From the LabelSets team · April 2026
If you've ever spent a week cleaning a dataset only to discover it had 40% label noise, you already understand the problem we're trying to solve.
When we built LabelSets, we kept running into the same buyer complaint: "I can see the price and the row count. I have no idea if this data is actually good." Sellers faced the mirror problem — they knew their data was clean, but couldn't prove it in a way buyers trusted.
So we built the LabelSets Quality Score (LQS) — a 0–100 composite score that measures every dataset across seven dimensions before it ever goes live on the marketplace.
The 7 LQS Dimensions

1. Completeness — null rates, missing fields, coverage gaps
2. Label Consistency — class distribution balance, label noise indicators
3. Format Validity — schema correctness, parseable structure
4. Deduplication — duplicate row percentage, near-duplicate detection
5. PII Risk — exposure of names, emails, phone numbers, health info
6. Volume Adequacy — sample size relative to stated use case
7. Annotation Quality — inter-annotator agreement signals, format consistency
Datasets scoring 85+ earn a Verified badge. Those below 60 get flagged for improvement before listing. Every score comes with a dimension-by-dimension breakdown so sellers know exactly what to fix.
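To make the mechanics concrete, here is a minimal sketch of how a composite score like this could be assembled from per-dimension checks. This is not the LabelSets implementation: the pandas-based checks, the dimension weights, and the function names are illustrative assumptions; only the 85/60 badge thresholds come from the description above, and the sketch covers three of the seven dimensions.

```python
# Illustrative sketch only. The real LQS pipeline, weights, and checks are not
# shown here; the functions and weights below are assumptions for exposition.
import numpy as np
import pandas as pd

def completeness(df: pd.DataFrame) -> float:
    """Fraction of non-null cells in the dataset (0.0 to 1.0)."""
    return 1.0 - df.isna().mean().mean()

def deduplication(df: pd.DataFrame) -> float:
    """Fraction of rows that are not exact duplicates of an earlier row."""
    return 1.0 - df.duplicated().mean()

def label_consistency(labels: pd.Series) -> float:
    """Normalized entropy of the class distribution: 1.0 means perfectly balanced."""
    p = labels.value_counts(normalize=True).to_numpy()
    if len(p) < 2:
        return 0.0
    return float(-(p * np.log(p)).sum() / np.log(len(p)))

# Hypothetical weights; a full score would cover all seven dimensions.
WEIGHTS = {"completeness": 0.4, "deduplication": 0.3, "label_consistency": 0.3}

def lqs_sketch(df: pd.DataFrame, label_col: str) -> dict:
    dims = {
        "completeness": completeness(df),
        "deduplication": deduplication(df),
        "label_consistency": label_consistency(df[label_col]),
    }
    score = 100 * sum(WEIGHTS[k] * v for k, v in dims.items())
    # Badge thresholds from the post: 85+ earns Verified, below 60 is flagged.
    badge = "Verified" if score >= 85 else ("Flagged" if score < 60 else "Listed")
    return {"score": round(score, 1), "badge": badge, "breakdown": dims}
```

Each dimension is kept on a 0–1 scale so the weighted sum maps cleanly to 0–100, and the per-dimension breakdown is returned alongside the score, which is what lets a seller see exactly which dimension is dragging the number down.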
Why does this matter for buyers? Because "5,000 labeled images" tells you almost nothing. An LQS of 91 tells you the dataset is clean, balanced, well-annotated, and ready to train on — without having to download it first.
We've also opened the tool up publicly. You can run a free quality audit on any dataset you own — whether you plan to sell it or just want to know if it's production-ready.
Featured Dataset

Customer Service Fine-Tuning
15,000 customer support conversation pairs, balanced across 12 intent categories. Ready to fine-tune GPT-4o, Llama, or Mistral.