AI Signals From Tomorrow

Large Language Models On The Edge



We discuss the paper "A Review on Edge Large Language Models: Design, Execution, and Applications" (https://arxiv.org/pdf/2410.11845), a survey of how large language models (LLMs) are designed, executed, and applied on edge devices. It highlights the challenges of deploying models with billions of parameters on resource-constrained hardware, including limited memory and heavy computational demands. The paper covers offline pre-deployment techniques (quantization, pruning, knowledge distillation, and low-rank approximation) that make models more efficient, as well as online runtime optimizations spanning software-level optimizations, hardware-software co-design, and hardware-level considerations. Finally, it showcases on-device LLM applications across personal, enterprise, and industrial domains and discusses future research directions and open challenges.


AI Signals From Tomorrow, by 1az