LQS Standard — Open specification for AI training-data quality ratings

Specification

LQS v3.1, at a glance.

The fields below define what a conforming scorer MUST produce. Implementations that disagree on any line are not LQS-compatible.

Spec name

LabelSets Quality Score (LQS) v3.1 · frozen

Spec version

3.1.0 · semver applies. Breaking changes bump major; new dims bump minor; weight retunes bump patch.
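The bump rule above can be sketched as a small lookup. This is an illustration only: the change-kind names (`breaking`, `new_dimension`, `weight_retune`) and the `nextVersion` helper are hypothetical and do not come from the spec.

```javascript
// Hypothetical change kinds mapped to the semver component they bump,
// following the rule stated above (breaking -> major, new dim -> minor,
// weight retune -> patch).
const BUMP_FOR_CHANGE = new Map([
  ["breaking", "major"],
  ["new_dimension", "minor"],
  ["weight_retune", "patch"],
]);

function nextVersion(current, changeKind) {
  const bump = BUMP_FOR_CHANGE.get(changeKind);
  if (!bump) throw new Error(`unknown change kind: ${changeKind}`);
  const [major, minor, patch] = current.split(".").map(Number);
  if (bump === "major") return `${major + 1}.0.0`;
  if (bump === "minor") return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`;
}

console.log(nextVersion("3.1.0", "new_dimension")); // 3.2.0
console.log(nextVersion("3.1.0", "weight_retune")); // 3.1.1
```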

Methodology paper

labelsets.ai/lqs-methodology · arXiv submission target Q2 2026 · in flight

Standards body

Independent until Q3 2026. IEEE working group submission target Q3 2026 (cs.AI / data quality) · drafting

Reference implementation

github.com/labelsets/lqs-scorer · MIT license · launching Q4 2026

Conformance test vectors

Published alongside the reference implementation. Each test case carries the SHA-256 of the canonical payload plus the expected signature. Conformant scorers MUST pass all vectors. · target Q4 2026
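Since the vectors are not yet published, the following is only a sketch of what a hash-side vector check might look like. The vector shape (`name`, `payload`, `sha256`) and the helpers `canonicalize`, `sha256Hex`, and `runVectors` are assumptions made for illustration; key ordering follows JavaScript's default string sort, which the spec may or may not match exactly.

```javascript
const { createHash } = require("node:crypto");

// Assumed canonical form: keys sorted at every level, no whitespace.
// Input is expected to be plain JSON data (no undefined, no functions).
function canonicalize(value) {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const members = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`);
    return `{${members.join(",")}}`;
  }
  return JSON.stringify(value);
}

function sha256Hex(text) {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Hypothetical vector shape: { name, payload, sha256 }.
function runVectors(vectors) {
  return vectors.map((v) => ({
    name: v.name,
    pass: sha256Hex(canonicalize(v.payload)) === v.sha256,
  }));
}

// Two made-up vectors: one whose payload keys arrive unsorted, and one
// with a deliberately wrong hash to show a failure.
const vectors = [
  {
    name: "key-order-independent",
    payload: { tier: "gold", composite: 87.4 },
    sha256: sha256Hex('{"composite":87.4,"tier":"gold"}'),
  },
  { name: "wrong-hash", payload: { tier: "gold" }, sha256: "0".repeat(64) },
];

const results = runVectors(vectors);
console.log(results);
```

The expected-signature half of each vector is not exercised here, because checking it needs the matching public key.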

Cryptographic signing

Ed25519 (RFC 8032). Canonical JSON: object keys sorted alphabetically at every nesting level, no whitespace, UTF-8 encoded. The signature covers the canonical bytes only.
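A minimal sketch of the canonicalization step, assuming "alphabetically" means JavaScript's default string sort (UTF-16 code-unit order) and that numbers serialize as `JSON.stringify` emits them. Neither detail is confirmed by the text above, and the helper names are illustrative.

```javascript
// Recursively serialize plain JSON data with sorted keys and no whitespace.
// Array order is preserved; only object keys are reordered.
function canonicalize(value) {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const members = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`);
    return `{${members.join(",")}}`;
  }
  return JSON.stringify(value);
}

// The bytes that get signed are the UTF-8 encoding of that string.
function canonicalBytes(payload) {
  return Buffer.from(canonicalize(payload), "utf8");
}

const sample = {
  tier: "gold",
  composite: 87.4,
  composite_ci_95: { low: 85.3, high: 89.5, se: 1.05 },
};

console.log(canonicalize(sample));
// {"composite":87.4,"composite_ci_95":{"high":89.5,"low":85.3,"se":1.05},"tier":"gold"}
```

Two payloads that differ only in key order or formatting produce identical bytes, which is what lets independent implementations agree on the signed message.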

Public key fingerprint (LabelSets)

aa4c070af907e2ea · published at /api/lqs-public-key

Revocation registry

Public, append-only. Cert hash + reason + revoked_at. Verifiers MUST check before accepting a cert.
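As a sketch of the verifier-side check, the lookup below assumes the verifier holds a local snapshot of the registry as an array of entries with the three fields named above (`cert_hash`, `reason`, `revoked_at`). How the snapshot is fetched or authenticated is not specified here, and the function names and sample hashes are hypothetical.

```javascript
// Return the matching registry entry, or null if the cert is not listed.
function findRevocation(registryEntries, certHash) {
  return registryEntries.find((e) => e.cert_hash === certHash) ?? null;
}

// Gate that a verifier might run before accepting a cert envelope.
function checkNotRevoked(envelope, registryEntries) {
  const hit = findRevocation(registryEntries, envelope.cert_hash);
  return hit
    ? { accepted: false, reason: hit.reason, revoked_at: hit.revoked_at }
    : { accepted: true };
}

// Made-up snapshot with placeholder hashes, for illustration only.
const registrySnapshot = [
  {
    cert_hash: "hash-revoked",
    reason: "dataset withdrawn",
    revoked_at: "2026-05-01T00:00:00.000Z",
  },
];

console.log(checkNotRevoked({ cert_hash: "hash-revoked" }, registrySnapshot));
console.log(checkNotRevoked({ cert_hash: "hash-live" }, registrySnapshot));
```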

Dimensions

19 (label_quality, schema_integrity, oracle_agreement, contamination_clean, downstream_headroom, adversarial_stability, subgroup_equity, coverage_breadth, duplication_rate, class_balance, drift_stability, noise_robustness, document_integrity, annotation_depth, temporal_coverage, source_diversity, lineage_completeness, license_clarity, reproducibility). See methodology for definitions + per-dim formulas.

Confidence intervals

Every dimension reports its score AND a 95% CI. Method varies by dim (Wilson, bootstrap, fold-stdev, etc.) — published in spec.
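For a proportion-style dimension, a Wilson score interval could be computed as in the sketch below. Which dimensions use Wilson, and with what inputs, is defined in the methodology rather than here; the "90 of 100 audited labels correct" example is made up for illustration.

```javascript
// Wilson score interval for a binomial proportion.
// z = 1.96 corresponds to a two-sided 95% interval.
function wilsonInterval(successes, trials, z = 1.96) {
  if (trials <= 0) throw new RangeError("trials must be positive");
  const p = successes / trials;
  const z2 = z * z;
  const denom = 1 + z2 / trials;
  const center = (p + z2 / (2 * trials)) / denom;
  const half =
    (z * Math.sqrt((p * (1 - p)) / trials + z2 / (4 * trials * trials))) / denom;
  return { point: p, low: center - half, high: center + half };
}

// Hypothetical input: 90 of 100 audited labels judged correct.
const ci = wilsonInterval(90, 100);
console.log(ci.low.toFixed(3), ci.high.toFixed(3)); // 0.826 0.945
```

Unlike the plain normal approximation, the Wilson interval stays inside [0, 1] and is not symmetric around the point estimate, which matters for scores near the top of the range.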

Composite score range

0–100 (rounded to one decimal). Tier mapping: ≥90 platinum · ≥75 gold · ≥60 silver · <60 bronze.
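The tier thresholds above translate directly into a lookup. In this sketch the tier is derived from the already-rounded composite; whether the spec maps tiers before or after rounding is an assumption, and the function names are illustrative.

```javascript
// Round a raw composite to one decimal place.
function roundComposite(raw) {
  return Math.round(raw * 10) / 10;
}

// Map a composite in [0, 100] to its tier using the published thresholds.
function tierFor(composite) {
  if (!(composite >= 0 && composite <= 100)) {
    throw new RangeError(`composite out of range: ${composite}`);
  }
  if (composite >= 90) return "platinum";
  if (composite >= 75) return "gold";
  if (composite >= 60) return "silver";
  return "bronze";
}

console.log(tierFor(roundComposite(87.44))); // gold
console.log(tierFor(90)); // platinum
console.log(tierFor(59.9)); // bronze
```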

Design principles

Six principles. Hard constraints, not preferences.

Every spec change has to clear these. If a "v3.2 feature" violates one, it doesn't ship — full stop.

Principle 01

Independently verifiable

Anyone with the public key can verify a cert offline. No LabelSets server, no API call, no trust chain. crypto.verify(null, canonical(payload), publicKey, signature) is the entire protocol.
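The one-line call above can be exercised end to end with Node's built-in `crypto` module. The sketch below uses a throwaway Ed25519 key pair in place of the LabelSets key and a cut-down payload; the `canonicalize` helper is an assumed implementation of the canonical form, not the reference one.

```javascript
const { generateKeyPairSync, sign, verify } = require("node:crypto");

// Assumed canonical form: sorted keys, no whitespace, plain JSON input.
function canonicalize(value) {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const members = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`);
    return `{${members.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Throwaway key pair standing in for the issuer's signing key.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const payload = { tier: "gold", composite: 87.4, scorer_version: "3.1.0" };
const message = Buffer.from(canonicalize(payload), "utf8");

// Issuer side: Ed25519 takes no digest algorithm, hence the null argument.
const signatureB64 = sign(null, message, privateKey).toString("base64");

// Verifier side: needs only the payload, the signature, and the public key.
const signature = Buffer.from(signatureB64, "base64");
const ok = verify(null, message, publicKey, signature);

// Any change to the payload invalidates the signature.
const tampered = Buffer.from(canonicalize({ ...payload, composite: 95 }), "utf8");
const tamperedOk = verify(null, tampered, publicKey, signature);

console.log(ok, tamperedOk); // true false
```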

Principle 02

Reproducible

Given the same dataset hash + spec version, every conformant scorer produces the same composite score. Test vectors enforce this. Differences > epsilon are bugs, not opinions.

Principle 03

Honestly uncertain

Every score ships with its confidence. Two-tier model: file-based (≥0.85) and metadata-proxy (0.4). Procurement decisions cite confidence alongside score, never the score alone.

Principle 04

Open methodology, signed implementation

The methodology is public; the calibration corpus is public; the reference implementation is MIT. What's not free is the LabelSets signing key. That's the rating-agency analog.

Principle 05

Versioned forever

Once a spec version is frozen, certs issued under it are forever verifiable under THAT version's rules. v3.2 doesn't retroactively re-score v3.1 certs. Audit trails don't move.

Principle 06

Procurement-citable

Every field in the cert maps to a documented procurement use case (SR 11-7 model risk, EU AI Act Art. 10, §1557 subgroup equity, FDA 21 CFR 11). Fields without a documented use case don't ship.

Cert format · canonical example

What a conformant cert looks like.

Below is a real cert from the production scorer. Conformant implementations produce the same field shape, same canonicalization, same signature scheme. Only the signing key differs.

// Canonical JSON payload. The signed form has keys sorted alphabetically and no whitespace;
// it is shown here pretty-printed and in reading order for legibility.
// Signed with Ed25519. Hash is SHA-256 of canonical bytes.
{
  "v": "1.0",
  "issuer": "Labelsets LLC",
  "issuer_id": "lqs-labelsets-v3.1",
  "methodology_url": "https://labelsets.ai/lqs-methodology#v3.1",
  "cert_kind": "file_based",
  "dataset_id": "ds_3f1a9c...",
  "scored_at": "2026-04-24T22:46:42.134Z",
  "scorer_version": "3.1.0",
  "composite": 87.4,
  "composite_ci_95": { "low": 85.3, "high": 89.5, "se": 1.05 },
  "tier": "gold",
  "confidence": 0.91,
  "verified": true,
  "oracle_consensus": { "agreement_score": 0.91, "oracles_counted": 7, "kappa_fleiss": 0.88 },
  "contamination": { "score": 100, "benchmarks_checked": 42, "worst_benchmark": null },
  "task_conditional": { "task": "classification", "composite": 88.2 },
  "downstream_projection": { "tier": "high", "headroom_score": 88.4 }
}

// Envelope
{
  "payload": { /* the above */ },
  "signature_b64": "TR/EG2GamAPVPSdgOSipqkZAbMrpT73Xoiv0DN/P...",
  "cert_hash": "f92b6e035715439f27e03851092d74d351703b0a...",
  "public_key_id": "df1cdaaee88bf201",
  "algorithm": "Ed25519",
  "canonical_serialization": "keys_sorted_minified_utf8"
}
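As a quick arithmetic check on the example, the `composite_ci_95` bounds are consistent with a normal-approximation interval, composite ± 1.96 × se, rounded to one decimal. That the production scorer computes the composite interval this way is an inference from the numbers, not something the cert states; the helper name is illustrative.

```javascript
const Z_95 = 1.96; // two-sided 95% normal quantile

// Normal-approximation interval around a point estimate, rounded to 0.1.
function normalCi95(point, se) {
  const round1 = (x) => Math.round(x * 10) / 10;
  return { low: round1(point - Z_95 * se), high: round1(point + Z_95 * se) };
}

// Values taken from the example payload: composite 87.4, se 1.05.
const ci95 = normalCi95(87.4, 1.05);
console.log(ci95); // { low: 85.3, high: 89.5 }
```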

Implement against the spec

Build LQS-compatible. Use our key.

If you ship a tool that produces LQS-shaped certs, run it against our public test vectors and we'll list you as a conformant implementation. If you want to issue certs that verify against the LabelSets public key, partner with us; that's the rating-agency relationship.


An open standard for training-data quality ratings.