MedMNIST v2

708K medical images pre-processed to MNIST-like format across 12 datasets and 6 modalities.

LQS 89 (gold) · Commercial OK · 708K medical images · 800 MB on disk · NPY, HDF5 · First released 2022
Source: medmnist.com · maintained by Shanghai Jiao Tong University (Yang et al.)

About this dataset

MedMNIST v2 is a standardized collection of 12 biomedical image datasets resized to MNIST scale (28×28 for 2D, 28×28×28 for 3D) for rapid benchmarking. It covers 6 modalities (pathology, chest X-ray, OCT, ultrasound, CT, and dermatoscopy) and contains 708,069 images in total across 12 classification tasks.
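
To make the MNIST-scale format concrete, the sketch below estimates how many raw bytes the images occupy. It assumes one unsigned byte per value; the helper name `raw_bytes` and the channel counts are illustrative assumptions, since this page does not say which datasets are grayscale and which are RGB.

```python
from math import prod

def raw_bytes(n_images: int, shape: tuple[int, ...], channels: int = 1) -> int:
    """Uncompressed size in bytes, assuming one uint8 per value (an assumption)."""
    return n_images * prod(shape) * channels

IMAGE_2D = (28, 28)
IMAGE_3D = (28, 28, 28)

print(prod(IMAGE_2D))  # 784 values per 2D image
print(prod(IMAGE_3D))  # 21952 values per 3D volume

# Rough bounds for the collection if every image were 2D.
total_images = 708_069
low = raw_bytes(total_images, IMAGE_2D, channels=1)   # all grayscale
high = raw_bytes(total_images, IMAGE_2D, channels=3)  # all RGB
print(f"{low / 1e6:.0f} MB to {high / 1e6:.0f} MB")   # 555 MB to 1665 MB
```

The listed 800 MB falls between these two bounds, which is what a mix of grayscale and color datasets would be expected to produce.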

License: CC BY 4.0
Formats: NPY · HDF5

LabelSets Quality Score

LQS is our 7-dimension quality score, computed from the dataset's published statistics.

89 out of 100 · gold tier

High-quality dataset across most dimensions

Composite score computed from the 7 dimensions below: completeness, uniqueness, validation health, size adequacy, format compliance, label density, and class balance.

Completeness: 95. No public completeness metric; using the prior for 'expert_curated' datasets.
Uniqueness: 93. Exact-hash deduplication documented by maintainer.
Validation: 92. Labels produced by domain experts or trained annotators.
Size adequacy: 95. 708,069 images, which exceeds the 20,000 adequacy target for Medical Imaging.
Format compliance: 95. Industry-standard format, drop-in compatible with mainstream tooling.
Label density: 52. Average 1.0 labels per item (sparse).
Class balance: 90. Near-uniform class distribution.
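
The seven dimension scores above can be combined as a weighted mean. The sketch below is only an illustration: this page does not publish the LQS weights, so the function `weighted_composite` and its equal-weight default are assumptions, not the actual formula.

```python
# Dimension scores as listed on this page.
scores = {
    "completeness": 95,
    "uniqueness": 93,
    "validation": 92,
    "size_adequacy": 95,
    "format_compliance": 95,
    "label_density": 52,
    "class_balance": 90,
}

def weighted_composite(scores, weights=None):
    """Weighted mean of dimension scores; equal weights if none are given."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

print(round(weighted_composite(scores), 1))  # 87.4
```

An equal-weight mean gives 87.4 rather than the published 89, which suggests the real composite weights the dimensions unequally, with less weight on label density than on the others.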

Sample statistics

What's actually in the dataset — from the maintainer's published stats.

708,069 images across 12 datasets (PathMNIST, ChestMNIST, DermaMNIST, OCTMNIST, etc.). Images are 28×28 (2D) or 28×28×28 (3D). All tasks are classification.

License

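MedMNIST v2 is distributed under CC BY 4.0. The maintainers' repository lists DermaMNIST as an exception under CC BY-NC 4.0, so check the terms of each subset you plan to use. This is a third-party public dataset; LabelSets indexes and scores it but does not host or redistribute the data. Always verify current license terms with the maintainer before commercial use.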

Frequently Asked Questions

Can MedMNIST v2 be used commercially?
MedMNIST v2 is distributed under CC BY 4.0, which generally permits commercial use. Always verify the current license terms with the maintainer, Shanghai Jiao Tong University (Yang et al.), before using it in a commercial product.

How large is MedMNIST v2?
MedMNIST v2 contains 708,069 medical images across 12 datasets (PathMNIST, ChestMNIST, DermaMNIST, OCTMNIST, etc.). Images are 28×28 (2D) or 28×28×28 (3D), and all tasks are classification.

Who maintains MedMNIST v2?
MedMNIST v2 is maintained by Shanghai Jiao Tong University (Yang et al.) and is available at https://medmnist.com. LabelSets indexes and scores this dataset for discoverability but does not redistribute it.

What is the LabelSets Quality Score (LQS)?
LQS is a 7-dimension quality score (completeness, uniqueness, validation, size adequacy, format compliance, label density, class balance) computed from the dataset's published statistics. Composite scores map to tiers: platinum (≥90), gold (≥75), silver (≥60), bronze (<60).
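
The tier thresholds above can be written as a small lookup. This is a minimal sketch: the function name `lqs_tier` is hypothetical, and only the thresholds come from this page.

```python
def lqs_tier(score: float) -> str:
    """Map a composite LQS score (0-100) to its tier name."""
    if score >= 90:
        return "platinum"
    if score >= 75:
        return "gold"
    if score >= 60:
        return "silver"
    return "bronze"

print(lqs_tier(89))  # gold
```

MedMNIST v2's composite score of 89 sits one point below the platinum cutoff, so it lands in the gold tier.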