
Peter Dawell and Nora Kane discuss DeepSeek-V3, a large language model with 671 billion parameters (a mixture-of-experts design that activates 37 billion per token), developed with innovative architectures and training methods. It achieves results comparable to leading closed systems while outperforming many open-source models. The model weights are available on Hugging Face, and the model can be run locally on a range of hardware platforms (including AMD GPUs and Huawei Ascend NPUs) using several inference frameworks. The documentation provides detailed instructions for local execution and evaluates the model's performance across various benchmarks. Commercial use is supported.
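For listeners curious about the local-execution point, below is a minimal sketch of serving the model with vLLM, one of the inference frameworks that supports DeepSeek-V3. The sampling settings and the tensor-parallel degree are illustrative assumptions, not settings from the episode or the official documentation, and actually running a 671-billion-parameter model requires a multi-GPU server with substantial memory.

```python
# Minimal sketch: local inference with vLLM (illustrative assumptions throughout).
# Requires a multi-GPU node with enough memory for the 671B-parameter MoE model.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V3",  # Hugging Face model id
    trust_remote_code=True,           # the model repo ships custom model code
    tensor_parallel_size=8,           # assumption: 8 GPUs; adjust to your hardware
)

params = SamplingParams(temperature=0.7, max_tokens=256)
prompts = ["Summarize the key architectural ideas behind DeepSeek-V3."]

for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```

The model's own documentation lists several supported serving stacks; vLLM is used here only as one common choice.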