This paper introduces DepictQA, a novel approach to Image Quality Assessment (IQA) that moves beyond traditional score-based methods by leveraging Multi-modal Large Language Models (MLLMs). Unlike conventional IQA, which outputs numerical scores, DepictQA produces language-based, human-like evaluations, describing image content and distortions both individually and comparatively. To achieve this, the authors developed a hierarchical task framework (Quality Description, Quality Comparison, Comparison Reasoning) and created the M-BAPPS dataset, which pairs images with both detailed and brief textual quality descriptions. The research demonstrates that DepictQA aligns with human judgment better than score-based methods and general MLLMs, especially in complex scenarios involving image misalignment or multiple distortions, and can even be extended to non-reference applications.