Steven Data Talk

EP3 Can AI Find Hidden Dangers in Your Code? 🤖💻 LLMs vs. Software Vulnerabilities! (feat. Astrid) (EN)



Ever wondered how cutting-edge AI like Large Language Models (LLMs) is revolutionizing software security? In this episode of Steven Data Talk, we chat with Astrid, a researcher from University College London (UCL) specializing in using LLMs to detect code vulnerabilities.

Join us as we explore:

- The New Frontier: How LLM-based vulnerability detection differs from traditional machine learning approaches (such as those using NLP or Graph Neural Networks).
- LLM Advantages: The potential of zero-shot/few-shot learning, flexibility, and generalization to new, unseen threats.
- Fine-Tuning Is Key: Why pre-trained models often need fine-tuning on specific vulnerability data (frequently drawn from open-source projects on GitHub).
- Current Hurdles: The challenges of accuracy, explainability (critical in security!), handling complex codebases, and the limitations of LLMs on non-source-code inputs (such as binaries or intermediate representations).
- Hybrid Power: How combining LLMs with traditional program analysis techniques (such as program slicing and taint analysis) offers a promising path forward.
- The Role of AI Agents: The potential (and current difficulties) of using multi-agent systems for complex code analysis.
- Real-World Research: Astrid shares insights from her latest work using LLMs and program slicing for Android malware analysis, a novel approach to tackling huge, complex apps!

Whether you're a developer, security professional, AI enthusiast, or just curious about the future of code safety, this discussion offers valuable insights!

High-Level Timeline:

00:00:00 - Guest Introduction (Astrid) & Research Area (LLMs for Code Vulnerability)
00:00:30 - How LLMs Differ from Traditional Methods
00:01:44 - Fine-Tuning LLMs & Data Sources (Open vs. Closed Source)
00:02:51 - Key Challenges: Accuracy & Explainability
00:05:06 - LLM Generalization vs. Accuracy Trade-off & New Threats
00:06:21 - Overview of Traditional Code Analysis Techniques (ML, GNNs, NLP)
00:08:31 - Potential of Multi-Modal & AI Agent Approaches
00:10:31 - Combining LLMs with Traditional Program Analysis Tools (Workflow Innovation)
00:12:24 - LLM Limitations with Non-Source-Code Representations
00:13:17 - Guest's Research (Astrid): Android Malware Analysis Using LLMs & Program Slicing
00:15:01 - Explaining the Program Slicing Technique
00:16:21 - Conclusion & Final Thoughts

Like this discussion? 👍 Subscribe for more insights from Steven Data Talk! Let us know your thoughts in the comments below! 👇

#LLM #SoftwareSecurity #CodeVulnerability #AI #Cybersecurity #ProgramAnalysis #MachineLearning #DataScience #StevenDataTalk
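For listeners curious about the program slicing technique discussed in the episode, here is a minimal, hypothetical sketch of backward slicing over a toy straight-line program. It is not Astrid's actual pipeline; the statement encoding and variable names are illustrative assumptions. The idea is the same, though: keep only the statements that can influence a variable of interest, so a large codebase shrinks to a fragment small enough for an LLM to analyze.

```python
def backward_slice(statements, criterion_var):
    """Return the line numbers of statements that can affect criterion_var.

    Each statement is a toy triple (line, target, used_vars): the variable
    it assigns and the set of variables it reads. This sketch assumes a
    straight-line program with no branches or calls.
    """
    relevant = {criterion_var}
    kept = []
    # Walk backwards: keep a statement if it defines a relevant variable,
    # then treat the variables it reads as relevant in turn.
    for line, target, uses in reversed(statements):
        if target in relevant:
            kept.append(line)
            relevant.discard(target)  # this definition satisfies the demand
            relevant.update(uses)     # ...but its inputs are now relevant
    return sorted(kept)

# Hypothetical five-line program: slicing on "b" drops lines 3-4,
# which only affect the unrelated variable d.
program = [
    (1, "a", set()),    # a = input()
    (2, "b", {"a"}),    # b = a + 1
    (3, "c", set()),    # c = 42
    (4, "d", {"c"}),    # d = c * 2
    (5, "b", {"b"}),    # b = b * 3
]

print(backward_slice(program, "b"))  # → [1, 2, 5]
```

Real slicers (as used in the Android malware work described here) operate on dependence graphs that also track control flow, but the demand-driven "walk backwards from the criterion" shape is the same.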


Steven Data Talk
By Steven