
ArXiv Computer Vision research for Monday, June 10, 2024.
00:20: DualAD: Disentangling the Dynamic and Static World for End-to-End Driving
01:41: NeuroMoCo: A Neuromorphic Momentum Contrast Learning Method for Spiking Neural Networks
03:22: Vehicle Vectors and Traffic Patterns from Planet Imagery
04:15: A Guide to Stochastic Optimisation for Large-Scale Inverse Problems
05:37: Cascading Unknown Detection with Known Classification for Open Set Recognition
06:42: Latent Directions: A Simple Pathway to Bias Mitigation in Generative AI
07:57: MVGamba: Unify 3D Content Generation as State Space Sequence Modeling
09:32: UMAD: Unsupervised Mask-Level Anomaly Detection for Autonomous Driving
10:15: Improving Deep Learning-based Automatic Cranial Defect Reconstruction by Heavy Data Augmentation: From Image Registration to Latent Diffusion Models
11:47: Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization
13:12: Generalizing to Unseen Domains in Diabetic Retinopathy with Disentangled Representations
15:01: FPN-IAIA-BL: A Multi-Scale Interpretable Deep Learning Model for Classification of Mass Margins in Digital Mammography
16:18: STimage-1K4M: A histopathology image-gene expression dataset for spatial transcriptomics
17:53: Hybrid Video Anomaly Detection for Anomalous Scenarios in Autonomous Driving
18:35: Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
20:24: SYM3D: Learning Symmetric Triplanes for Better 3D-Awareness of GANs
21:48: Spatiotemporal Graph Neural Network Modelling Perfusion MRI
22:57: VCR: Visual Caption Restoration
24:37: AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
26:29: NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
28:09: Monkey See, Monkey Do: Harnessing Self-attention in Motion Diffusion for Zero-shot Motion Transfer
30:12: Merlin: A Vision Language Foundation Model for 3D Computed Tomography
32:58: Genomics-guided Representation Learning for Pathologic Pan-cancer Tumor Microenvironment Subtype Prediction
34:26: PGSR: Planar-based Gaussian Splatting for Efficient and High-Fidelity Surface Reconstruction
36:04: NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing
37:28: Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
39:08: GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation
40:52: IllumiNeRF: 3D Relighting without Inverse Rendering