Built for legal AI
Training data built for legal AI.
Every dataset compliance-certified, jurisdiction-mapped, and scored on 14 general plus 3 legal-specific quality dimensions. Built for legal AI companies, BigLaw innovation teams, and enterprise legal departments.
1 in 6
Benchmark contamination
The problem
Generic training data breaks in legal contexts.
The same dataset that looks great on a general-purpose leaderboard can silently fail in legal production. Here are three failure modes that a single aggregate quality score won't catch.
🏛
Jurisdiction gaps
A corpus labeled "US case law" often turns out to be 78% Delaware and Federal Circuit, with almost no coverage of other state courts. Your model learns the wrong doctrines and misstates the law for clients in every other jurisdiction.
⚠
License risk
Half of public legal corpora have unclear copyright provenance: scraped, redistributed, or built on top of data with ambiguous licensing. For enterprise buyers this isn't a quality issue; it's a compliance exposure.
🔎
Benchmark contamination
About 1 in 6 public legal training datasets contain material from CaseHOLD, LegalBench, or LexGLUE test splits. Training on them silently inflates your eval metrics and breaks real-world generalization.
LQS for legal
14 general dimensions plus 3 built for legal data.
Every legal dataset is evaluated on three additional dimensions beyond the standard LabelSets Quality Score — designed for the failure modes that matter in legal-AI production.
Legal dim 1
Jurisdiction coverage
Statistical fingerprinting of the jurisdictions actually present in the text, compared to the jurisdictions the seller claims. Reports a weighted entropy across the detected jurisdictions and flags datasets where the claimed scope and the real scope disagree.
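For intuition, here is a minimal sketch of the idea in Python. It uses plain normalized Shannon entropy as a stand-in for the weighted variant LQS actually computes, and the citation counts, jurisdiction labels, and claimed-scope set are all hypothetical:

```python
import math
from collections import Counter

def jurisdiction_entropy(citation_counts: Counter) -> float:
    """Normalized Shannon entropy over detected jurisdictions:
    1.0 = citations spread evenly, 0.0 = a single jurisdiction."""
    total = sum(citation_counts.values())
    if total == 0 or len(citation_counts) < 2:
        return 0.0
    probs = [n / total for n in citation_counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy / math.log2(len(citation_counts))  # scale to [0, 1]

# Hypothetical fingerprint of a corpus sold as broad "US case law":
detected = Counter({"DE": 780, "Fed. Cir.": 150, "NY": 40, "CA": 30})
claimed = {"DE", "Fed. Cir.", "NY", "CA", "TX", "FL"}  # seller's claimed scope

print(f"coverage entropy: {jurisdiction_entropy(detected):.2f}")
print(f"claimed but absent: {claimed - set(detected)}")
```

A corpus that is 78% Delaware scores low on this dimension even if the seller lists six jurisdictions, and the claimed-but-absent set surfaces the gap directly.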
Legal dim 2
PII & privilege redaction
Dedicated PII pipeline tuned for legal text. Flags personal identifiers, case numbers, attorney-client privileged phrases, and work product markers. Reports a confidence score with exact strings that require review before training.
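As a rough illustration of the kind of spans this dimension surfaces, here is a toy first pass in Python. The patterns below are placeholders; the real pipeline layers NER models and legal-specific rules on top of rule-based matching like this:

```python
import re

# Illustrative patterns only, not the production rule set.
PATTERNS = {
    "case_number": re.compile(r"\b\d{1,2}:\d{2}-cv-\d{3,5}\b"),  # e.g. 1:21-cv-04567
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "privilege_marker": re.compile(
        r"attorney[- ]client privilege|work product", re.IGNORECASE),
}

def flag_spans(text: str) -> list[dict]:
    """Return every match a human should review before training."""
    hits = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append({"type": label, "span": m.span(), "text": m.group()})
    return hits

sample = ("Re: 1:21-cv-04567. This memo is protected by the "
          "attorney-client privilege and constitutes work product.")
for hit in flag_spans(sample):
    print(hit)
```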
Legal dim 3
Clause taxonomy validity
Checks annotations against canonical clause and statute taxonomies (ISDA, ACC, CUAD, LexGLUE, EUR-Lex). Flags datasets with inconsistent or undocumented tagging — the single biggest predictor of poor generalization on contract-AI tasks.
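In spirit, the check reduces to validating annotation labels against a canonical vocabulary. A hedged sketch, using a five-label stand-in for a real taxonomy such as CUAD's 41 clause categories:

```python
# A tiny stand-in for a canonical clause taxonomy; names are illustrative.
CANONICAL_CLAUSES = {
    "Governing Law", "Non-Compete", "Termination For Convenience",
    "Audit Rights", "Cap On Liability",
}

def taxonomy_validity(labels: list[str]) -> tuple[float, set[str]]:
    """Fraction of annotations that map onto the canonical taxonomy,
    plus the off-taxonomy labels that need documentation."""
    if not labels:
        return 0.0, set()
    unknown = {lab for lab in labels if lab not in CANONICAL_CLAUSES}
    valid = sum(lab in CANONICAL_CLAUSES for lab in labels)
    return valid / len(labels), unknown

annotations = ["Governing Law", "Non-Compete", "Choice of Law",  # ad-hoc synonym
               "Cap On Liability", "misc_clause_7"]              # undocumented tag
score, undocumented = taxonomy_validity(annotations)
print(f"validity: {score:.0%}, undocumented labels: {undocumented}")
```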
Flagship · In stock
Legal Document Analysis Corpus
445 annotated legal documents, 6K instruction-response pairs, full compliance certificate and jurisdiction coverage report.
Ready to download
Built specifically for legal LLM fine-tuning.
Contract law, litigation filings, and regulatory analysis — annotated to a canonical clause taxonomy with jurisdiction tagging and full provenance documentation. Designed for teams building legal drafting, review, and research AI.
View the dataset →
LQS v3.0 · Legal
92
★ Platinum tier
Jurisdiction coverage · 94
PII redaction confidence · 98
Clause taxonomy validity · 89
Provenance quality · 96
License integrity · 100
Free audit of your current legal training data.
Upload a sample — contracts, case law, statutes, whatever you're training on — and we return a full LQS breakdown within an hour. 14 standard dimensions plus 3 legal-specific ones. Includes benchmark contamination detection and a jurisdiction coverage map.
Start a free audit →
No account required · No drip emails · Just the report
FAQ
Questions from legal AI teams.
Is the legal data you sell licensed for commercial AI training?
Every dataset ships with a compliance certificate specifying the exact license terms, source of the data, and any known restrictions. Datasets that can't pass that check don't get published. You can review the compliance posture before you buy.
How does jurisdiction coverage actually get measured?
We fingerprint the text using statutory citations, court references, and jurisdictional entity extraction, then compute a weighted entropy across the detected jurisdictions. A dataset that claims "US case law" but is 90% Delaware gets a low score on this dimension. Full methodology at /lqs-methodology.
How do you detect benchmark contamination?
Exact-hash matching against known legal benchmarks (CaseHOLD, LegalBench, LexGLUE, SARA, CUAD) plus near-duplicate detection via MinHash LSH for paraphrased or slightly rewritten content. Datasets with detected contamination are flagged publicly on the listing page — we don't hide it.
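For the near-duplicate half, here is a minimal sketch of what MinHash LSH screening can look like, assuming the open-source datasketch library; the shingle size, similarity threshold, and benchmark snippets are illustrative, not our production configuration:

```python
from datasketch import MinHash, MinHashLSH  # pip install datasketch

def minhash_of(text: str, num_perm: int = 128) -> MinHash:
    """MinHash over word 5-shingles of a lowercased document."""
    tokens = text.lower().split()
    m = MinHash(num_perm=num_perm)
    for i in range(max(len(tokens) - 4, 1)):
        m.update(" ".join(tokens[i:i + 5]).encode("utf8"))
    return m

# Index the benchmark test splits once (snippets here are placeholders,
# not real CaseHOLD/LexGLUE content).
benchmark_items = {
    "casehold_test_0412": "the holding of the case is that ...",
    "lexglue_ecthr_test_87": "the applicant complained under article ...",
}
lsh = MinHashLSH(threshold=0.8, num_perm=128)  # ~0.8 estimated Jaccard
for key, text in benchmark_items.items():
    lsh.insert(key, minhash_of(text))

# Any training example that retrieves a benchmark neighbor gets flagged,
# even if it was lightly paraphrased or re-cased.
candidate = "The Applicant complained under Article ..."
hits = lsh.query(minhash_of(candidate))
if hits:
    print("possible benchmark contamination:", hits)
```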
Do you sell to BigLaw innovation teams and enterprise legal AI groups?
Yes. Legal AI companies, BigLaw innovation teams, and enterprise legal departments are exactly who these datasets are built for, and every dataset ships with the compliance certificate those buyers need.
Can I upload a sample of our existing training data for a quality audit?
Yes. The free quality audit at /quality-audit accepts any legal training data sample and returns a full LQS report within an hour. No account required, no drip emails, just the report.
Is the methodology peer-reviewed?
Not yet. The full LQS v3.0 methodology is published at /lqs-methodology with every formula, dimension, and pillar weight documented. We're actively building an empirical calibration suite; when those results are in, the dimension-level correlation table and updated weights will be published alongside the methodology.