AI Papers Podcast for 04/21/2024
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
On Speculative Decoding for Multimodal Large Language Models