Marketing^AI

Depicting Image Quality in the Wild



This paper introduces DepictQA-Wild, a Vision Language Model (VLM) for Image Quality Assessment (IQA) that uses language descriptions to align with human perception. It addresses two limitations of existing VLM-based IQA methods: limited functionality across scenarios (e.g., single-image vs. multi-image comparison, image restoration vs. generation), and sub-optimal performance caused by inadequate training data and fixed image resolutions. To overcome these issues, the authors constructed DQ-495K, a large-scale dataset covering 35 distortion types at 5 severity levels, with ground-truth-informed responses generated by GPT-4V to improve label quality. DepictQA-Wild is trained on this dataset, retains the original image resolution, and incorporates confidence estimation. Together these aim to improve accuracy on tasks such as distortion identification, image assessment, and paired image comparison, in both full-reference and non-reference settings.


Marketing^AI, by Enoch H. Kang