
ArXiv Computer Vision research for Tuesday, June 04, 2024.
00:20: FedDr+: Stabilizing Dot-regression with Global Feature Distillation for Federated Learning
02:06: EUFCC-340K: A Faceted Hierarchical Dataset for Metadata Annotation in GLAM Collections
03:14: Learning to Edit Visual Programs with Self-Supervision
04:15: Low-Rank Adaption on Transformer-based Oriented Object Detector for Satellite Onboard Processing of Remote Sensing Images
06:12: WE-GS: An In-the-wild Efficient 3D Gaussian Representation for Unconstrained Photo Collections
07:39: Decoupling of neural network calibration measures
08:48: IterMask2: Iterative Unsupervised Anomaly Segmentation via Spatial and Frequency Masking for Brain Lesions in MRI
10:29: CoNav: A Benchmark for Human-Centered Collaborative Navigation
12:05: Generative Active Learning for Long-tailed Instance Segmentation
13:17: RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting
14:51: Learning Image Priors through Patch-based Diffusion Models for Solving Inverse Problems
16:23: DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark
18:13: Inpainting Pathology in Lumbar Spine MRI with Latent Diffusion
19:59: Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation
21:43: GenS: Generalizable Neural Surface Reconstruction from Multi-View Images
23:20: An Open-Source Tool for Mapping War Destruction at Scale in Ukraine using Sentinel-1 Time Series
24:48: Guiding a Diffusion Model with a Bad Version of Itself
25:59: CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation
27:17: V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation
28:50: DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering
30:17: SatSplatYOLO: 3D Gaussian Splatting-based Virtual Object Detection Ensembles for Satellite Feature Recognition
31:24: Enhancing 2D Representation Learning with a 3D Prior
32:32: Parrot: Multilingual Visual Instruction Tuning
34:32: ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
36:20: Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting
38:20: Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
39:41: Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation
41:52: Dreamguider: Improved Training free Diffusion-based Conditional Generation
43:10: VHS: High-Resolution Iterative Stereo Matching with Visual Hull Priors