
ArXiv Computer Vision research for Saturday, June 08, 2024.
00:20: Blurry-Consistency Segmentation Framework with Selective Stacking on Differential Interference Contrast 3D Breast Cancer Spheroid
01:31: 1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation
03:01: Metric Convolutions: A Unifying Theory to Adaptive Convolutions
04:13: Layered Image Vectorization via Semantic Simplification
05:18: Select-Mosaic: Data Augmentation Method for Dense Small Object Scenes
06:31: 3D MRI Synthesis with Slice-Based Latent Diffusion Models: Improving Tumor Segmentation Tasks in Data-Scarce Regimes
07:51: Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models
09:42: Unsupervised learning of Data-driven Facial Expression Coding System (DFECS) using keypoint tracking
11:36: HDRT: Infrared Capture for HDR Imaging
13:14: Attri-Net: A Globally and Locally Inherently Interpretable Model for Multi-Label Classification Using Class-Specific Counterfactuals
14:49: Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
16:18: Training-Free Robust Interactive Video Object Segmentation
17:49: One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models
19:50: A Two-Stage Adverse Weather Semantic Segmentation Method for WeatherProof Challenge CVPR 2024 Workshop UG2+
21:04: PAPR in Motion: Seamless Point-level 3D Scene Interpolation
22:25: VP-LLM: Text-Driven 3D Volume Completion with Large Language Models through Patchification
23:38: Medical Vision Generalist: Unifying Medical Imaging Tasks in Context
25:24: Aligning Human Knowledge with Visual Concepts Towards Explainable Medical Image Classification
26:50: Understanding Inhibition Through Maximally Tense Images
27:52: Can Prompt Modifiers Control Bias? A Comparative Analysis of Text-to-Image Generative Models
29:19: Deep Learning to Predict Glaucoma Progression using Structural Changes in the Eye
30:58: Which Backbone to Use: A Resource-efficient Domain Specific Comparison for Computer Vision
32:32: Beat: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval
34:11: Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language
35:35: Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion