🚗
Autonomous vehicles
LiDAR, camera, and sensor-fusion datasets for perception models. KITTI, nuScenes, and COCO formats.
🏥
Medical AI
Radiology, pathology, and dermatology datasets. De-identified and IRB-compliant, in DICOM and NIfTI formats.
🔍
Fraud detection
Labeled transaction, claims, and identity fraud datasets. Pre-split train/test sets with realistic class imbalance.
🤖
LLM fine-tuning
Instruction-response pairs, domain Q&A, and preference datasets in JSONL chat format for LLaMA, Mistral, and GPT.
📦
Retail & e-commerce
Product classification, defect detection, and shelf-image datasets. YOLO and COCO formats.
🎙️
Speech & audio
Transcribed, speaker-diarized, and emotion-labeled audio datasets. WAV/FLAC with aligned text.