💬 Curated Catalog · NLP / Text

Natural Questions — Open-Domain QA

Real anonymised Google search queries answered from Wikipedia — the original open-domain QA benchmark.

LQS 87 · gold · ✓ Commercial OK · 323K question-answer pairs · 42 GB JSONL · Released 2019
Source: github.com · maintained by Google Research

About this dataset

Natural Questions (NQ) was built by Google Research from real anonymised queries issued to Google Search. Each example pairs a question with a full Wikipedia article; annotators mark a long answer (a containing paragraph or table) and a short answer (an extracted span or yes/no) when one is present. Its realistic query distribution and open-domain formulation made it the de facto benchmark for retrieval-augmented QA systems.

Maintainer: Google Research
License: CC BY-SA 3.0
Formats: JSONL

LabelSets Quality Score

LQS is our 7-dimension quality score, computed from the dataset's published statistics. See methodology →

87 / 100 · gold tier
High-quality dataset across most dimensions.

Composite score computed from the 7 dimensions below: completeness, uniqueness, validation health, size adequacy, format compliance, label density, and class balance.

Completeness 95: no public completeness metric; using the prior for 'expert_curated' datasets.
Uniqueness 93: exact-hash deduplication documented by the maintainer.
Validation 92: labels produced by domain experts or trained annotators.
Size adequacy 92: 323,000 pairs — exceeds the 100,000 adequacy target for NLP / Text.
Format compliance 95: industry-standard format, drop-in compatible with mainstream tooling.
Label density 52: average 1.0 labels per item (sparse).
Class balance 75: moderate class skew — a realistic production distribution.

What it's used for

Common tasks and benchmarks where Natural Questions — Open-Domain QA is the default or competitive choice.

Sample statistics

What's actually in the dataset — from the maintainer's published stats.

307K train / 7.8K dev / 7.8K test; long-answer + short-answer + yes/no annotations per example.
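A minimal sketch of streaming the JSONL distribution and tallying the three annotation types (long-answer-only, short answer, yes/no). The field names follow the simplified NQ schema and should be treated as assumptions — verify them against the repository before relying on this:

```python
import json

# Sketch of reading NQ-style JSONL and classifying each example's annotation.
# Field names ("annotations", "yes_no_answer", "short_answers") follow the
# simplified NQ format; check the maintainer's schema before using in earnest.

def answer_kind(example: dict) -> str:
    """Classify an example by its first annotation: yes/no, short span, or long-only."""
    ann = example["annotations"][0]
    if ann.get("yes_no_answer", "NONE") != "NONE":
        return "yes_no"
    if ann.get("short_answers"):
        return "short"
    return "long_only"

def tally(path: str) -> dict:
    """Stream a JSONL file (one JSON object per line) and count answer kinds."""
    counts = {"yes_no": 0, "short": 0, "long_only": 0}
    with open(path, encoding="utf-8") as f:
        for line in f:
            counts[answer_kind(json.loads(line))] += 1
    return counts

# Tiny synthetic example standing in for a real NQ line:
demo = {"question_text": "is the sky blue",
        "annotations": [{"yes_no_answer": "YES", "short_answers": []}]}
print(answer_kind(demo))  # yes_no
```

At 42 GB, streaming line by line as above (rather than loading the file into memory) is the practical way to iterate the full training split.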

License

Natural Questions — Open-Domain QA is distributed under CC BY-SA 3.0. This is a third-party public dataset; LabelSets indexes and scores it but does not host or redistribute the data. Always verify current license terms with the maintainer before commercial use.

Need commercial-licensed NLP / Text data?

LabelSets sellers offer paid NLP / Text datasets with guarantees that public datasets often can't provide.


Similar public datasets

Other entries in the NLP / Text catalog.

Frequently Asked Questions

Can Natural Questions be used commercially?
Natural Questions — Open-Domain QA is distributed under CC BY-SA 3.0, which generally permits commercial use. Always verify the current license terms with the maintainer (Google Research) before using it in a commercial product.

How many examples does the dataset contain?
Natural Questions — Open-Domain QA contains 323,000 question-answer pairs: 307K train / 7.8K dev / 7.8K test, with long-answer, short-answer, and yes/no annotations per example.

Who maintains the dataset and where is it hosted?
Natural Questions — Open-Domain QA is maintained by Google Research and is available at https://github.com/google-research-datasets/natural-questions. LabelSets indexes and scores this dataset for discoverability but does not redistribute it.

What is the LQS score?
LQS is a 7-dimension quality score (completeness, uniqueness, validation, size adequacy, format compliance, label density, class balance) computed from the dataset's published statistics. Composite scores map to tiers: platinum (≥90), gold (≥75), silver (≥60), bronze (<60). Read the full methodology.