June 05, 2024

Ep. 238 - Part 1 - June 4, 2024

40 minutes

ArXiv Computer Vision research for Tuesday, June 04, 2024.

00:20: Plug-and-Play Diffusion Distillation

01:29: Enhance Image-to-Image Generation with LLaVA Prompt and Negative Prompt

02:33: The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise

04:03: Dealing with All-stage Missing Modality: Towards A Universal Model with Robust Reconstruction and Personalization

05:38: Choroidal Vessel Segmentation on Indocyanine Green Angiography Images via Human-in-the-Loop Labeling

07:31: 3D Imaging of Complex Specular Surfaces by Fusing Polarimetric and Deflectometric Information

08:47: MetaMixer Is All You Need

10:36: Multi-Scale Direction-Aware Network for Infrared Small Target Detection

12:26: Leveraging Predicate and Triplet Learning for Scene Graph Generation

14:15: OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding

15:57: FaceCom: Towards High-fidelity 3D Facial Shape Completion via Optimization and Inpainting Guidance

17:17: Domain Game: Disentangle Anatomical Feature for Single Domain Generalized Segmentation

18:41: Analyzing the Effect of Combined Degradations on Face Recognition

19:58: UA-Track: Uncertainty-Aware End-to-End 3D Multi-Object Tracking

21:42: Analyzing the Feature Extractor Networks for Face Image Synthesis

22:59: Radar Spectra-Language Model for Automotive Scene Parsing

24:17: GraVITON: Graph based garment warping with attention guided inversion for Virtual-tryon

25:32: Can CLIP help CLIP in learning 3D?

26:34: Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts

28:19: SMCL: Saliency Masked Contrastive Learning for Long-tailed Recognition

29:19: I4VGen: Image as Stepping Stone for Text-to-Video Generation

30:34: PuFace: Defending against Facial Cloaking Attacks for Facial Recognition Models

31:57: M3DM-NR: RGB-3D Noisy-Resistant Industrial Anomaly Detection via Multimodal Denoising

33:37: Image contrast enhancement based on the Schrödinger operator spectrum

34:44: Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning

35:46: Optimised ProPainter for Video Diminished Reality Inpainting

36:43: Continual Unsupervised Out-of-Distribution Detection

37:55: Progressive Confident Masking Attention Network for Audio-Visual Segmentation

39:06: Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation

By Brad Edwards