
BlueLM-V-3B is a multimodal large language model (MLLM) designed specifically for mobile devices. The researchers address the challenges of deploying large models on mobile phones, such as limited memory and processing power, through an algorithm and system co-design approach. This includes a dynamic resolution scheme that optimizes image processing and a token downsampler that reduces the number of image tokens to improve inference speed. The paper emphasizes that BlueLM-V-3B outperforms other models of similar size and can be deployed efficiently on mobile devices.
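
To give a rough sense of why a token downsampler speeds up inference, here is a minimal sketch. The summary does not say how BlueLM-V-3B's downsampler actually works, so everything here is an assumption for illustration: the function name `downsample_tokens`, the 2x2 average pooling, and the toy grid are all hypothetical. The only point is that merging each 2x2 neighborhood of image tokens leaves the language model with a quarter as many tokens to process.

```python
def downsample_tokens(tokens, grid_h, grid_w, factor=2):
    """Average-pool a row-major grid of token vectors by `factor` per side.

    Hypothetical illustration only; not the method described in the paper.
    """
    if len(tokens) != grid_h * grid_w:
        raise ValueError("token count does not match grid size")
    if grid_h % factor or grid_w % factor:
        raise ValueError("grid sides must be divisible by factor")
    dim = len(tokens[0])
    pooled = []
    for top in range(0, grid_h, factor):
        for left in range(0, grid_w, factor):
            # Sum the factor x factor block of token vectors, then average.
            acc = [0.0] * dim
            for dy in range(factor):
                for dx in range(factor):
                    vec = tokens[(top + dy) * grid_w + (left + dx)]
                    for i in range(dim):
                        acc[i] += vec[i]
            pooled.append([v / (factor * factor) for v in acc])
    return pooled


# Hypothetical 4x4 grid of 2-dimensional tokens: 16 tokens in, 4 tokens out.
grid = [[float(i), 1.0] for i in range(16)]
small = downsample_tokens(grid, 4, 4)
print(len(grid), "->", len(small))
```

With a factor of 2 the token count drops by 4x, and because attention cost grows with sequence length, fewer image tokens mean less work per generated token.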