September 11, 2025

015 - Training data vs validation data vs test data

4 minutes

How do we know if a medical AI has truly learned to spot disease, or just memorised the answers to its practice questions? The same way we evaluate a trainee: with a final, unseen exam.

This crucial process involves splitting data into three sets: training data (the textbook), validation data (the mock exam), and test data (the final exam). In this episode of The Health AI Brief, we explain why this split is our best defence against overconfident AI, what 'overfitting' means for clinical practice, and why the 'test set' result is the only number you should trust when appraising a new AI study.
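For readers who want to see the split concretely, here is a minimal Python sketch of the three-way division described above. It is an illustration only, not the method used in any particular study: the function name `three_way_split`, the 70/15/15 proportions, and the integer patient identifiers are all assumptions made for the example. It also assumes every record is independent; it does not handle the case where one patient contributes several records.

```python
import random


def three_way_split(records, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle records once, then cut them into train, validation and test lists.

    The fractions and the seed are illustrative defaults, not recommendations.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and leave room for training data")

    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable

    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)

    test = shuffled[:n_test]                      # the final exam: set aside first
    validation = shuffled[n_test:n_test + n_val]  # the mock exam: used while tuning
    train = shuffled[n_test + n_val:]             # the textbook: everything else
    return train, validation, test


# Hypothetical data: 100 made-up patient identifiers.
patient_ids = list(range(100))
train, validation, test = three_way_split(patient_ids)
print(len(train), len(validation), len(test))
```

The test list is carved off before anything else so that no record in it can also appear in the training or validation lists, which is the property that lets the test result stand in for an unseen exam.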

#TrainingData #ValidationData #TestData #Overfitting #ModelValidation #ArtificialIntelligence #MachineLearning #HealthcareAI #MedicalAI #ClinicalAI #CriticalAppraisal #EvidenceBasedMedicine #DigitalHealth #ai in medicine

Music generated by Mubert https://mubert.com/render


By Stephen Auger
