September 04, 2025

Open Source, Open Problems: What DeepSeek's Safety Gaps Reveal About AI Alignment

12 minutes

Editor’s Note: There was an issue with the podcast (it skipped ahead to the next article), so I am re-uploading it. This podcast is created with Google NotebookLM audio overview. It does a good job with the summary, even if it loves the pronunciation “Dep-Seek.”

The world of large language models (LLMs) is defined by a central tension: how to balance the promise of open-source transparency with the imperative for robust, dependable safety. In their comprehensive study, "Challenges and Applications of Large Language Models: A Comparison of GPT and DeepSeek family of models," researchers Shubham Sharma, Sneha Tuli, and Narendra Badam directly confront this dilemma. By systematically comparing OpenAI’s closed-source GPT-4o with the open-source DeepSeek-V3-0324, they illuminate the trade-offs shaping the current landscape of advanced AI.

Safety and Alignment: Where Models Diverge

Through extensive prompt testing and architectural analysis, the authors outline sixteen critical challenges facing today’s LLMs. Among these, safety and alignment emerge as a defining axis of difference. On ethically fraught or potentially divisive questions, GPT-4o consistently chooses caution. For example, when asked to pick "the most peaceful religion," GPT-4o declines to answer directly, emphasizing that peace is a core value across faiths and gently steering the conversation toward shared human values. DeepSeek, by contrast, does not hesitate to select Jainism, furnishing specific doctrinal details and a thoughtful rationale. While the doctrinal details it cites are accurate, DeepSeek's willingness to issue such a judgment illustrates its comparatively porous alignment boundaries. As the authors point out, this can lead to subtle forms of model bias unless carefully managed.

Security Risks and Vulnerability

The divide goes beyond philosophical stance. Citing both direct experiments and third-party audits, the authors report that DeepSeek remains significantly more vulnerable to prompt attacks, content evasion, and the unintentional generation of harmful instructions than its closed-source counterpart. Most strikingly, DeepSeek failed all Pliny prompt injection tests and was markedly less effective at resisting harmful prompts, hate speech, and WMD-related requests, sometimes producing detailed outputs that a model like GPT-4o would flatly refuse. Hallucination benchmarks reinforce this gap: DeepSeek's error rate is more than twice as high as GPT-4o's, a difference with real-world stakes wherever factuality matters.

The Open-Weight, Closed-Data Paradox

Perhaps the most fascinating aspect of the paper is its analysis of what the authors dub the "open-weight, closed-data dilemma." DeepSeek’s weights are freely available, inviting community modification, audits, and endless customizations. But its raw training data—the ultimate source of in-model knowledge and bias—remains just as inaccessible as that of GPT-4o. Thus, true transparency exists only at the architectural level; the origins of model knowledge are largely beyond scrutiny. Meanwhile, GPT-4o’s black-box approach reduces risk through centralized safety mechanisms, but at the cost of preventing outside researchers from auditing or extending the core model.

Practical Guidance: Choosing a Model for the Real World

Translating these findings into practical recommendations, the authors are clear and nuanced:

* For high-stakes and user-facing scenarios—think healthcare, legal advice, or customer support—GPT-4o is preferred. Its tightly integrated alignment and refusal systems provide a robust safety net that open models, at present, simply cannot match.

* For internal enterprise tools, research, and highly customized deployments, DeepSeek shines. Its flexibility, cost-efficiency (it was trained for just $5–6 million, versus GPT-4o’s estimated $100 million), and open weights make it a superb platform for those able to shoulder the engineering burden of implementing their own safety protocols.

* The report highlights how DeepSeek’s performance in coding, math, and specialized reasoning tasks often rivals that of much larger, more expensive closed models—demonstrating the dividends of Mixture-of-Experts innovations and the open-source ethos.
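The guidance above can be restated as a small decision helper. This is only an illustrative sketch: the function names, the use-case labels, and the "needs review" fallback are assumptions made for this example, not anything defined in the paper. The cost figures are the estimates quoted above.

```python
# Illustrative sketch only: labels and function names are assumptions,
# not taken from the paper.
HIGH_STAKES = {"healthcare", "legal advice", "customer support"}
CUSTOMIZABLE = {"internal enterprise tools", "research", "customized deployment"}


def suggest_model(use_case: str, can_build_own_safety: bool = False) -> str:
    """Map a use case to the model family the guidance above favors."""
    case = use_case.strip().lower()
    if case in HIGH_STAKES:
        # User-facing, high-stakes work: prefer built-in alignment and refusals.
        return "GPT-4o"
    if case in CUSTOMIZABLE and can_build_own_safety:
        # Open weights suit teams able to implement their own safety layer.
        return "DeepSeek-V3-0324"
    # Anything else is outside the guidance and needs a human decision.
    return "needs review"


def training_cost_ratio(closed_musd: float = 100.0,
                        open_low_musd: float = 5.0,
                        open_high_musd: float = 6.0) -> tuple[float, float]:
    """Return the (low, high) ratio of closed to open training cost estimates."""
    return closed_musd / open_high_musd, closed_musd / open_low_musd


if __name__ == "__main__":
    print(suggest_model("Healthcare"))
    print(suggest_model("research", can_build_own_safety=True))
    low, high = training_cost_ratio()
    print(f"Estimated cost gap: roughly {low:.0f}x to {high:.0f}x")
```

With the quoted estimates, the training cost gap works out to roughly 17 to 20 times, which is the scale of saving that has to be weighed against the engineering effort of building safety protocols in-house.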

Scientific and Evaluation Lessons

For researchers, DeepSeek offers an unparalleled environment for reproducibility, experimentation, and architectural transparency. Anyone with adequate hardware and expertise can audit, ablate, or benchmark the released model checkpoint. By contrast, GPT-4o’s results are consistent over short-term API windows but lack the long-term traceability and open experimentation that permanently advance scientific understanding. The paper’s clear-eyed comparison underscores a trade-off: open models empower the community but shift more alignment and safety responsibility to downstream practitioners.

The Future: Blending Strengths, Not Picking Sides

Throughout, the authors resist simple binaries. They envision a future where open- and closed-source strengths converge: models as safe, aligned, and polished as GPT-4o, yet as transparent and adaptable as DeepSeek. This will require ongoing collaboration, more dynamic and transparent evaluation regimes, and the cross-pollination of innovations from both commercial laboratories and the open research community.

Final Thoughts

In a rapidly maturing field, Sharma, Tuli, and Badam offer a rare, even-handed assessment: closed-source models set the standard for capability and safety, while open-source models drive scientific discovery and democratize powerful AI. Both are indispensable, and the field will thrive where their strengths intertwine.

The full paper, "Challenges and Applications of Large Language Models: A Comparison of GPT and DeepSeek family of models" by Shubham Sharma (SunitechAI), Sneha Tuli (Microsoft), and Narendra Badam (Walmart Global Tech), provides a comprehensive analysis of 16 key challenges in LLM development and detailed application recommendations. The complete technical report includes architectural comparisons, benchmark results, and practical deployment guidance for practitioners. Read the full paper here.

#AIAlignment #OpenSourceAI #LLMSafety #DeepSeek #GPT4o #MachineLearning #AIEthics #TechPolicy #ArtificialIntelligence #AIResearch #DeepLearningwiththeWolf

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit dianawolftorres.substack.com

By Diana Wolf Torres
