In Q4 2026 we'll publish the file-based 19-dimension scorer, calibrated weights, and full calibration corpus at github.com/labelsets/lqs-scorer under the MIT license. You'll be able to run the same scorer on your air-gapped data and verify that our published scores reproduce bit-for-bit. The methodology isn't the moat; closing it doesn't help us.
The 19-dimension training-data quality scorer used by labelsets.ai. Open methodology, reproducible weights, MIT license.
```
$ pip install labelsets-lqs

# or via Docker for air-gapped tenancies
$ docker pull labelsets/lqs-scorer:v3.1
```
```python
from labelsets_lqs import Scorer

scorer = Scorer.from_pretrained('lqs-v3.1-public')
result = scorer.score('./my-dataset.parquet')

print(result.composite)                 # 87.4
print(result.dims.label_quality.score)  # 91.2 (CI: 89.4–93.0)
print(result.cert_payload)              # dict ready for signing
```
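To tabulate all 19 dimensions rather than pick one off by name, something like the sketch below should work; the iteration interface is an assumption, since only attribute access (`result.dims.label_quality.score`) is shown above.

```python
from labelsets_lqs import Scorer

scorer = Scorer.from_pretrained('lqs-v3.1-public')
result = scorer.score('./my-dataset.parquet')

# Assumption: result.dims is a plain namespace whose attributes are the
# 19 dimension objects, each with a .score field; only attribute access
# (result.dims.label_quality.score) is documented above.
for name, dim in vars(result.dims).items():
    print(f'{name:>24}: {dim.score:5.1f}')
```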
Every cert at labelsets.ai/verify includes a cert_hash. Run the OSS scorer on the same dataset, check the cert's signature against our public key, and confirm the hash matches. If it doesn't, the published score is wrong: file an issue and we'll investigate.
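Concretely, a verification run could look like the sketch below. Two things are assumptions until the repo lands: that cert_hash is SHA-256 over a canonically serialized (sorted-key, compact JSON) cert_payload, and that certs carry an Ed25519 signature verifiable with the third-party `cryptography` package. `published_hash`, `signature`, and `PUBLIC_KEY_HEX` are hypothetical placeholders for values copied from labelsets.ai/verify.

```python
import hashlib
import json

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey
from labelsets_lqs import Scorer

# Placeholders for values copied from the published cert at labelsets.ai/verify.
published_hash = '...'            # the cert's cert_hash field
signature = bytes.fromhex('...')  # the cert's signature
PUBLIC_KEY_HEX = '...'            # labelsets' published verification key

# Re-score the dataset locally with the OSS scorer.
scorer = Scorer.from_pretrained('lqs-v3.1-public')
result = scorer.score('./my-dataset.parquet')

# Assumed canonicalization: sorted keys, compact separators, UTF-8.
canonical = json.dumps(result.cert_payload, sort_keys=True,
                       separators=(',', ':')).encode('utf-8')
local_hash = hashlib.sha256(canonical).hexdigest()

assert local_hash == published_hash, 'published score does not reproduce'

# Raises InvalidSignature if the cert was not signed by labelsets' key.
Ed25519PublicKey.from_public_bytes(bytes.fromhex(PUBLIC_KEY_HEX)).verify(
    signature, canonical)
```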
Or from the CLI:

```
$ labelsets-lqs score ./dataset --out cert.json
```

MIT. Use it commercially. Modify it. Redistribute it. We just ask that derivative works are clear about which version of the scorer they're using, so reproducibility holds.
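The bit-for-bit claim is also easy to smoke-test from the CLI above; this sketch assumes the scorer writes deterministic, byte-stable output for identical input.

```
# Two runs over the same data should produce byte-identical certs.
$ labelsets-lqs score ./dataset --out run1.json
$ labelsets-lqs score ./dataset --out run2.json
$ cmp run1.json run2.json && echo "bit-for-bit reproducible"
```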
Closing the loop on the OSS launch. Each milestone has an owner and a date.
No drip campaign. No newsletter. One email when the repo goes public, then we go away. If you're building infra in regulated AI, you'll want to be in the first cohort to run reproducibility tests against our published scores.