
ArXiv Computer Vision research for Tuesday, June 11, 2024.
00:20: Explaining Representation Learning with Perceptual Components
01:28: Optimal Matrix-Mimetic Tensor Algebras via Variable Projection
03:03: Sparse Bayesian Networks: Efficient Uncertainty Quantification in Medical Image Analysis
04:24: Neural Visibility Field for Uncertainty-Driven Active Mapping
05:21: Triple-domain Feature Learning with Frequency-aware Memory Enhancement for Moving Infrared Small Target Detection
06:55: Stepwise Regression and Pre-trained Edge for Robust Stereo Matching
08:38: Evolving from Single-modal to Multi-modal Facial Deepfake Detection: A Survey
10:08: Dual Thinking and Perceptual Analysis of Deep Learning Models using Human Adversarial Examples
11:10: Generative Lifting of Multiview to 3D from Unknown Pose: Wrapping NeRF inside Diffusion
12:34: RWKV-CLIP: A Robust Vision-Language Representation Learner
14:01: Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation
15:03: Teaching with Uncertainty: Unleashing the Potential of Knowledge Distillation in Object Detection
16:40: MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results
18:34: Eye-for-an-eye: Appearance Transfer with Semantic Correspondence in Diffusion Models
19:38: LiSD: An Efficient Multi-Task Learning Framework for LiDAR Segmentation and Detection
21:04: RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks
22:49: PanoSSC: Exploring Monocular Panoptic 3D Scene Reconstruction for Autonomous Driving
24:15: EFFOcc: A Minimal Baseline for EFficient Fusion-based 3D Occupancy Network
26:25: 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation
27:16: DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification
29:09: Triage of 3D pathology data via 2.5D multiple-instance learning to guide pathologist assessments
31:08: Unified Modeling Enhanced Multimodal Learning for Precision Neuro-Oncology
32:23: CAT: Coordinating Anatomical-Textual Prompts for Multi-Organ and Tumor Segmentation
33:54: RS-Agent: Automating Remote Sensing Tasks through Intelligent Agents
35:17: AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding