Neural intel Pod

Personalizing Multimodal Models with Yo'Chameleon



This document introduces Yo'Chameleon, a method for personalizing Large Multimodal Models (LMMs) such as Chameleon. Because current LMMs lack user-specific knowledge, the paper proposes training the model on only 3 to 5 images of a novel concept so that it can handle both personalized language generation and personalized image generation. Yo'Chameleon uses soft-prompt tuning to embed subject-specific information, with a dual soft-prompt architecture and a self-prompting mechanism so that understanding and generation tasks are both handled effectively. A soft-positive training strategy additionally leverages visually similar negative images to improve generation quality. Together, these techniques are presented as a step toward making LMMs more personalized for real-world applications while avoiding catastrophic forgetting of general abilities.


By Neural Intelligence Network