Instruction-tuning, RLHF, sentiment, NER, Q&A. CSV, JSONL, Parquet. Every dataset is hashed against MMLU, HumanEval, HellaSwag, GSM8K, and 40+ other public benchmarks so your reported numbers hold up.
Live marketplace listings filtered to NLP. Every card shows a signed LQS score, a contamination-clean flag against public evals, and an originality signal.
LabelSets isn't another dataset dump. Every NLP corpus carries a signed cert built for the questions an eval team or procurement officer actually asks.
/verify any time. Revocation registry handles post-facto license flags.
pii_scanned flag on the cert. Sellers are contractually required to remove or anonymize personal information before uploading.
is_contamination_clean flag are embedded in the signed cert.
Live marketplace filtered by LQS score, contamination-clean flag, format, and license. Looking for public corpora? See the curated catalog (MMLU, The Pile, C4, MS MARCO, GLUE, Wikipedia) with LQS scores.