June 17, 2025

Depicting Image Quality in the Wild

20 minutes

This paper introduces DepictQA-Wild, a Vision Language Model (VLM) for Image Quality Assessment (IQA) that uses language descriptions to align with human perception. It addresses two limitations of existing VLM-based IQA methods: limited functionality across scenarios (e.g., single-image assessment vs. multi-image comparison, image restoration vs. generation) and sub-optimal performance caused by inadequate training data and fixed image resolutions. To overcome these issues, the authors constructed DQ-495K, a large-scale dataset covering 35 distortion types at 5 severity levels each, with ground-truth-informed responses generated by GPT-4V to improve label quality. DepictQA-Wild is trained on this dataset, retains the original image resolution, and incorporates confidence estimation to improve accuracy on distortion identification, image assessment, and paired-image comparison in both full-reference and non-reference settings.

By Enoch H. Kang
