Every LQS Index is a deterministic selection on top of the published LQS v3.1 score. No editorial committee. No paid placement. The rules below are machine-readable and applied identically across all datasets each rebalance.
Every constituent's LQS composite is computed by the LQS v3.1 scorer — 19 dimensions, multi-oracle consensus across 7 algorithm families, contamination scoring against 40+ public benchmarks, and Wilson + bootstrap confidence intervals. Full methodology lives at /lqs-methodology. The scorer outputs a number in [0, 100] with a 95% CI.
Indices rank on the point estimate only; the confidence interval enters only as a tie-breaker. Ties on composite are broken first by the lower CI bound, then by title in alphabetical order.
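The ordering can be expressed as a single sort key. The following is a minimal sketch, not the code in tools/lqs-index-build.js: the `Scored` type and its field names are hypothetical, and it assumes that a higher lower CI bound wins a tie, which the rules above do not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scored:
    """Hypothetical record: one dataset with its LQS composite and CI lower bound."""
    title: str
    composite: float  # point estimate in [0, 100]
    ci_lower: float   # lower bound of the 95% CI


def rank(datasets):
    """Order by composite (descending), then lower CI bound, then title (A-Z).

    Assumption: on a composite tie, the higher lower CI bound ranks first.
    """
    return sorted(datasets, key=lambda d: (-d.composite, -d.ci_lower, d.title))


items = [
    Scored("Beta", 88.0, 84.0),
    Scored("Alpha", 88.0, 84.0),
    Scored("Gamma", 88.0, 86.0),
    Scored("Delta", 91.5, 80.0),
]
ordered_titles = [d.title for d in rank(items)]
```

Because the key is a total order whenever titles are unique, the result does not depend on input order, which is what makes the selection deterministic.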
Each ticker is parameterized by a JSON selection rule stored in public.lqs_index.selection_rule. The rules are applied verbatim by tools/lqs-index-build.js on every rebalance.
| Ticker | Rule |
|---|---|
| LQS-FINANCE-TOP25 | Top 25 marketplace datasets in category=Financial / Crypto, ranked by composite, min 60. |
| LQS-NLP-TOP25 | Top 25 marketplace datasets in category=NLP / Text, ranked by composite, min 60. |
| LQS-MEDICAL-TOP10 | Top 10 marketplace datasets in category=Medical Imaging, ranked by composite, min 60. |
| LQS-VERIFIED-PLATINUM | Open-membership: every dataset (marketplace or catalog) at composite ≥ 90. Marketplace constituents must additionally have a current LabelSets-issued v3.1 cert. |
| LQS-PROCUREMENT-GRADE | Open-membership: marketplace datasets at composite ≥ 75 with a valid v3.1 cert, oracle_agreement ≥ 70 (where measured), and contamination_clean ≥ 80 (where measured). |
| LQS-MARKETPLACE-CORE-50 | Top 50 across the entire marketplace, no category filter. The cross-cutting broad-market benchmark. |
| LQS-CATALOG-GLOBAL-100 | Top 100 across the public catalog of audited HuggingFace and Kaggle datasets. Tracks the public ML data landscape rather than the marketplace specifically. |
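To illustrate how rules like those in the table might be applied, here is a hedged sketch. The rule keys (`source`, `category`, `top_n`, `min_composite`) and the dataset field names are hypothetical; the actual JSON schema in public.lqs_index.selection_rule is not shown in this document. The sketch also assumes that "where measured" means a missing measurement does not disqualify a dataset, and that a higher lower CI bound wins a composite tie.

```python
def select_top_n(datasets, rule):
    """Apply a top-N rule: filter by source/category/floor, rank, keep N."""
    eligible = [
        d for d in datasets
        if d["source"] == rule["source"]
        and rule.get("category") in (None, d["category"])
        and d["composite"] >= rule.get("min_composite", 0)
    ]
    # Composite descending, then lower CI bound (assumed descending), then title.
    eligible.sort(key=lambda d: (-d["composite"], -d["ci_lower"], d["title"]))
    return eligible[: rule["top_n"]]


def passes_where_measured(value, floor):
    """Assumed reading of "where measured": None (unmeasured) passes."""
    return value is None or value >= floor


def is_procurement_grade(d):
    """Open-membership check mirroring the LQS-PROCUREMENT-GRADE row."""
    return (
        d["source"] == "marketplace"
        and d["composite"] >= 75
        and d["has_valid_v31_cert"]
        and passes_where_measured(d.get("oracle_agreement"), 70)
        and passes_where_measured(d.get("contamination_clean"), 80)
    )


# Hypothetical rule shaped like LQS-MEDICAL-TOP10, shrunk to N=2 for the demo.
rule = {"source": "marketplace", "category": "Medical Imaging",
        "top_n": 2, "min_composite": 60}
datasets = [
    {"title": "A", "source": "marketplace", "category": "Medical Imaging", "composite": 81.0, "ci_lower": 78.0},
    {"title": "B", "source": "marketplace", "category": "Medical Imaging", "composite": 59.9, "ci_lower": 55.0},
    {"title": "C", "source": "catalog", "category": "Medical Imaging", "composite": 95.0, "ci_lower": 93.0},
    {"title": "D", "source": "marketplace", "category": "Medical Imaging", "composite": 90.0, "ci_lower": 85.0},
    {"title": "E", "source": "marketplace", "category": "NLP / Text", "composite": 99.0, "ci_lower": 97.0},
    {"title": "F", "source": "marketplace", "category": "Medical Imaging", "composite": 70.0, "ci_lower": 66.0},
]
selected = [d["title"] for d in select_top_n(datasets, rule)]
```

Top-N tickers have a fixed size and a floor, while open-membership tickers admit every dataset that clears the thresholds, so the two kinds of rule are kept as separate functions here.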
An index's reported composite is the equal-weighted mean of constituent composites. We chose equal-weighting deliberately: weighting by dataset size (item_count) would pull the index toward whichever vendor uploaded a 10M-row dump that morning. Weighting by price would create a circular incentive. Equal-weighting forces every constituent to earn its slot by quality alone.
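The effect of the weighting choice is easy to see numerically. The sketch below compares the equal-weighted mean with a size-weighted mean on three made-up constituents; the scores and item counts are illustrative values, not real data.

```python
def equal_weighted(composites):
    """Index composite as the plain mean of constituent composites."""
    if not composites:
        raise ValueError("an index needs at least one constituent")
    return sum(composites) / len(composites)


def size_weighted(composites, item_counts):
    """Counterfactual: weight each constituent by its item_count."""
    total = sum(item_counts)
    return sum(c * n for c, n in zip(composites, item_counts)) / total


# Illustrative values only: two strong datasets and one 10M-row dump.
composites = [92.0, 88.0, 61.0]
item_counts = [50_000, 120_000, 10_000_000]

eq = equal_weighted(composites)              # about 80.3
sw = size_weighted(composites, item_counts)  # about 61.5
```

Under size weighting the single large dataset effectively sets the index value, while under equal weighting it contributes one third, like every other constituent.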
Default cadence is weekly. On a rebalance:
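- Constituents that no longer qualify get removed_at = now() in public.lqs_index_constituents (records are append-only, never deleted).
- New constituents are recorded with added_at.
- The result of the rebalance is recorded in public.lqs_index_history.

Rebalances run via tools/lqs-index-build.js. The script is idempotent: running it twice produces no diff.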
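The append-only bookkeeping and the idempotence property can be sketched as a pure function over constituent rows. This is an illustration under assumptions, not the logic of tools/lqs-index-build.js: the `dataset_id` key and the in-memory row format are hypothetical, and only the removed_at and added_at column names come from the text above.

```python
from datetime import datetime, timezone


def rebalance(rows, selected_ids, now):
    """Return the constituent rows after a rebalance, never deleting any.

    rows: list of dicts with dataset_id, added_at, removed_at (None = active).
    selected_ids: dataset ids the selection rule picked this time.
    """
    active = {r["dataset_id"] for r in rows if r["removed_at"] is None}
    target = set(selected_ids)
    out = []
    for r in rows:
        if r["removed_at"] is None and r["dataset_id"] not in target:
            r = {**r, "removed_at": now}  # close the row instead of deleting it
        out.append(r)
    for dataset_id in sorted(target - active):
        out.append({"dataset_id": dataset_id, "added_at": now, "removed_at": None})
    return out


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t1 = datetime(2024, 1, 8, tzinfo=timezone.utc)
t2 = datetime(2024, 1, 8, 0, 5, tzinfo=timezone.utc)

rows = [
    {"dataset_id": "a", "added_at": t0, "removed_at": None},
    {"dataset_id": "b", "added_at": t0, "removed_at": None},
]
after = rebalance(rows, ["b", "c"], t1)   # "a" drops out, "c" enters
again = rebalance(after, ["b", "c"], t2)  # same selection, later clock
```

Only rows whose membership actually changes are touched, so a second run with the same selection leaves every row as it was, even though the clock has moved on.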
Every cert is signed with the cert authority's Ed25519 key (fingerprint aa4c070af907e2ea). Anyone with the public key can verify the cert offline. This means index membership is independently auditable.

The index is queryable directly. No auth, no rate limits beyond the standard infra protections, JSON output:
# All tickers, current state
GET /api/lqs-index
# Single ticker with full constituent list and selection rule
GET /api/lqs-index/LQS-FINANCE-TOP25
# Daily history (composite, n_constituents, additions/removals)
GET /api/lqs-index/LQS-FINANCE-TOP25/history
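A small client for the three endpoints above might look like the following sketch, using only the Python standard library. The base URL is a placeholder, since this document gives only relative paths, and the response schema is not assumed: `fetch_json` simply returns whatever JSON the server sends.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://example.com"  # placeholder; substitute the real host


def index_url(ticker=None, history=False, base_url=BASE_URL):
    """Build the URL for all tickers, one ticker, or one ticker's history."""
    path = "/api/lqs-index"
    if ticker is None:
        if history:
            raise ValueError("history requires a ticker")
        return base_url.rstrip("/") + path
    path += "/" + quote(ticker, safe="")
    if history:
        path += "/history"
    return base_url.rstrip("/") + path


def fetch_json(url, timeout=10.0):
    """GET a URL and decode the JSON body (no auth header is needed)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


all_tickers_url = index_url()
single_url = index_url("LQS-FINANCE-TOP25")
history_url = index_url("LQS-FINANCE-TOP25", history=True)
# Example call (performs a network request):
# data = fetch_json(single_url)
```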
This methodology document is versioned. Substantive changes — new tickers, changes to selection rules, changes to the rebalance cadence — are announced on the blog with at least 14 days of notice before rebalance, and recorded in the audit log.
Bug fixes (e.g. typos, code path changes that don't affect outputs) take effect immediately and are noted in the changelog without a notice period.
The methodology, the selection rules, and the public API output are released under CC-BY-4.0. Cite as: "LabelSets LQS Index, methodology v1.0, retrieved {{date}}."
The cert authority Ed25519 keypair fingerprint is aa4c070af907e2ea. Public key at /api/lqs-public-key.