arXiv research summaries for Computer Vision and Pattern Recognition from October 24, 2023.
Today's Themes - Fair Warning - LLM-Generated Summary
Image and Video Synthesis, Editing, and Manipulation
Methods such as image inpainting, colorization, style transfer, generating images from text, video editing with text guidance, and synthesizing 3D scenes from images and text.
3D object detection, 3D scene understanding, point cloud segmentation, and inverse rendering of 3D objects from images.
Self-supervised and Semi-supervised Learning Techniques
Images, video, and multimodal data. Methods aim to make use of unlabeled data.
Object Detection and Recognition Architectures
Including transformer-based models like DETR. Research looks at improving localization, classification, and handling occlusion.
Visual Question Answering and Reasoning
Using images, video, and multimodal data with a focus on improving large language models. Techniques aim to reduce bias and hallucination.
Methods for generating visually and semantically diverse image outputs for restoration tasks rather than sampling the posterior. Aims to provide more meaningful diversity.
Using synthetic data for validation and continual learning to improve model robustness, avoid overfitting, and handle domain shift.
Autonomous vehicles, robotics, medical imaging, human action analysis, image privacy and security, biometrics, etc.