Python SDK · Beta · Q2 2026

pip install labelsets
Score a dataset. Three lines.

The Python SDK mirrors the hosted scoring API with one addition: local file scoring, no upload required. Drop it into a Jupyter notebook, a CLI, a CI pipeline, or a training loop. Every score produces the same signed cert format the marketplace uses. Same crypto, same rating, same methodology.

Install

One command. Then you're scoring.

The SDK ships with offline scoring (local files), hosted scoring (HF/Zenodo URLs), cert verification, and a CLI. No API key required for the free tier — 100 scores/month.

pip · PyPI · Python ≥ 3.9
$ pip install labelsets
Also available as npx @labelsets/lqs-cli for Node. Docker image labelsets/lqs:latest for air-gapped tenancies.
Notebook walkthrough

A Jupyter cell that actually does something.

Jupyter is every ML engineer's home. Below is what the SDK looks like in a real notebook. Copy the install above, paste the cells, and run them against your own parquet/JSONL file.

score_my_dataset.ipynb — labelsets v3.1 · Python 3.11 · jupyter

1. Score a local dataset

Point the scorer at a parquet / JSONL / CSV / HDF5 file. Returns composite + per-dim breakdown + a signed cert envelope ready to embed anywhere.
In [1]
from labelsets import Scorer

scorer = Scorer.from_pretrained('lqs-v3.1-public')
result = scorer.score('./my_dataset.parquet')

print(f"LQS composite: {result.composite}")
print(f"Tier: {result.tier} · confidence {result.confidence}")
print(f"Contamination clean: {result.dims.contamination_clean.score}")
Out [1]
LQS composite: 87.4
Tier: gold · confidence 0.91
Contamination clean: 94.2

2. Score a public URL (no download)

Pass a HuggingFace or Zenodo URL — the scorer hits the public metadata API, derives signals, and signs a proxy cert. No file download required.
In [2]
result = scorer.score_url('https://huggingface.co/datasets/openai/gsm8k')
print(result.composite, result.tier, result.confidence)
# → 81 gold 0.4 (proxy cert — metadata-only)

3. Inspect every dimension

All 19 dimensions with 95% CIs + the signal that produced each score. Makes "why did it score that?" a 2-line answer.
In [3]
import pandas as pd

df = pd.DataFrame([
    {'dim': d.name, 'score': d.score, 'ci_low': d.ci.low, 'ci_high': d.ci.high}
    for d in result.dims
])
df.sort_values('score', ascending=False)
Out [3]
                   dim  score  ci_low  ci_high
1     schema_integrity   98.0    96.5     99.2
3  contamination_clean   94.2    92.1     95.8
0        label_quality   93.1    91.2     94.8
2     oracle_agreement   91.0    88.4     93.1
4  downstream_headroom   88.4    85.9     90.7
...                (14 more rows)

4. Sign + verify offline

Same Ed25519 scheme as the marketplace. Sign with your own key for private-mode enterprise scoring; verify against any public key offline — no LabelSets server in the trust chain.
In [4]
# Sign with OUR production key (hosted SaaS path)
cert = result.sign(backend='labelsets-hosted')

# OR sign with your own key (enterprise private mode)
cert = result.sign(private_key=open('my_ed25519.pem').read())

# Verify offline — no server contact
from labelsets import verify_offline
valid = verify_offline(cert, public_key=open('labelsets_pk.pem').read())
print(valid)  # True / False / "revoked"

5. Log to W&B / MLflow alongside every training run

Puts LQS on your experiment dashboard as a first-class metric.
In [5]
import wandb

# assumes an active run, i.e. wandb.init() was called earlier in the training script
wandb.log({
    'lqs.composite': result.composite,
    'lqs.tier': result.tier,
    'lqs.cert_hash': cert.cert_hash,  # permalinks to labelsets.ai/verify?hash=...
    'lqs.contamination': result.dims.contamination_clean.score,
})
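The MLflow side is symmetric. A minimal sketch using plain MLflow calls against the same result and cert objects from the cells above; the one-line helper lqs.log_to_mlflow(result) described below covers the common case, so treat this as the manual equivalent rather than the SDK's exact behavior.

import mlflow

with mlflow.start_run():  # or reuse the run your training loop already opened
    mlflow.log_metric('lqs.composite', result.composite)
    mlflow.log_metric('lqs.contamination', result.dims.contamination_clean.score)
    mlflow.set_tag('lqs.tier', result.tier)
    mlflow.set_tag('lqs.cert_hash', cert.cert_hash)  # verify at labelsets.ai/verify?hash=...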
What's in the box

Six things the SDK does out of the box.

1
Local file scoring
Point at a parquet, JSONL, CSV, HDF5, or DICOM file. Runs the full 19-dimension scorer offline. Works in air-gapped environments with --offline.
2
URL scoring (HF / Zenodo)
Pass a public URL, get the proxy cert. Same endpoint as the live homepage demo. Confidence 0.4 (metadata-derived) by design.
3
Cert signing + verification
Sign with LabelSets' hosted key OR bring your own Ed25519 key (enterprise private-mode). Verify any cert offline against any public key.
4
CLI tool
Ships with an lqs command. lqs score ./data.parquet --out cert.json. Use it in CI pipelines or cron jobs.
5
W&B + MLflow integration
One-line helper: lqs.log_to_wandb(result) or lqs.log_to_mlflow(result). LQS becomes a first-class training metric on your existing dashboard.
6
Contamination-only mode
Check if your data overlaps with MMLU / HumanEval / HellaSwag / GSM8k / 36 more benchmarks. lqs.contamination(data) returns a per-benchmark overlap rate. Worth the install price alone.
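A minimal usage sketch for contamination-only mode. It assumes lqs.contamination accepts a file path and returns a mapping of benchmark name to overlap rate; only the lqs.contamination entry point comes from the feature list above, and the exact return type may differ in the released SDK.

import labelsets as lqs

# hypothetical return shape: {'mmlu': 0.002, 'gsm8k': 0.041, ...} with rates in [0, 1]
overlap = lqs.contamination('./my_dataset.parquet')

for benchmark, rate in sorted(overlap.items(), key=lambda kv: kv[1], reverse=True):
    if rate > 0.01:  # flag anything above 1% row overlap
        print(f"{benchmark}: {rate:.1%} overlap")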
CI integration · a dozen lines of yaml

GitHub Action — auto-score on every commit.

Drop this into .github/workflows/lqs.yml. Every commit that touches a dataset file gets auto-scored, and the cert hash is posted as a PR comment with the embed badge.

.github/workflows/lqs.yml · Q2 2026
name: LQS Quality Score
on: [push, pull_request]
jobs:
  score:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: labelsets/lqs-action@v1
        with:
          dataset-path: ./data/
          post-pr-comment: true
          fail-if-below: 75
PR comment on every push: LQS 87.4 · gold · ✓ signed with the embed-ready badge. Procurement-grade evidence for your model-risk team, generated by your CI.
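If your CI isn't GitHub Actions, the same gate is a short script. A minimal sketch using the SDK names from the notebook above; the ./data/train.parquet path and the print format are illustrative, and the threshold mirrors fail-if-below.

import sys
from labelsets import Scorer

THRESHOLD = 75  # same gate as fail-if-below in the workflow above

scorer = Scorer.from_pretrained('lqs-v3.1-public')
result = scorer.score('./data/train.parquet')  # illustrative path

if result.composite < THRESHOLD:
    sys.exit(f"LQS {result.composite} is below the gate of {THRESHOLD}")
print(f"LQS {result.composite} · {result.tier} · gate passed")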
Beta access

Get the SDK on day one.

Public release in Q2 2026. Beta access goes to the first 100 ML engineers who sign up — we use the feedback to tune the scorer and the API surface. No drip campaign. One email when the PyPI package goes live.