Computer Vision Decoded

By EveryPoint

A tidal wave of computer vision innovation is quickly having an impact on everyone's lives, but not everyone has the time to sit down and read through a bunch of news articles and learn what it means ...

5

55 ratings

FAQs about Computer Vision Decoded:

How many episodes does Computer Vision Decoded have?

The podcast currently has 17 episodes available.

Computer Vision Decoded episodes:

June 17, 2025 Understanding Gaussian Splatting w/ NVIDIA's Ruilong Li
In this episode of Computer Vision Decoded, our hosts Jonathan Stephens and Jared Heinly are joined by Ruilong Li, a researcher at NVIDIA and key contributor to both Nerfstudio and gsplat, to dive deep into 3D Gaussian Splatting. They explore how this relatively new technology works, from the fundamentals of Gaussian representations to the optimization process that creates photorealistic 3D scenes. Ruilong explains the technical details behind Gaussian splatting and discusses the development of the popular gsplat library. The conversation covers practical advice for capturing high-quality data, the iterative training process, and how Gaussian splatting compares to other 3D representations like meshes and NeRFs.
Links:
gsplat: https://github.com/nerfstudio-project/gsplat
Nerfstudio: https://docs.nerf.studio/

Follow:
Ruilong on X: https://x.com/ruilong_li
Jared on X: https://x.com/JaredHeinly
Jonathan on X: https://x.com/jonstephens85

This episode is brought to you by EveryPoint. Learn more about how EveryPoint is building an infinitely scalable data collection and processing platform for the next generation of spatial computing applications and services. Learn more at https://www.everypoint.io
1h 19min
May 14, 2025 Camera Types for 3D Reconstruction Explained
In this episode of Computer Vision Decoded, hosts Jonathan Stephens and Jared Heinly explore the various types of cameras used in computer vision and 3D reconstruction. They discuss the strengths and weaknesses of smartphone cameras, DSLR and mirrorless cameras, action cameras, drones, and specialized cameras like 360, thermal, and event cameras. The conversation emphasizes the importance of understanding camera specifications, metadata, and the impact of different lenses on image quality. The hosts also provide practical advice for beginners in 3D reconstruction, encouraging them to start with the cameras they already own.
Takeaways
Smartphones are versatile and user-friendly for photography.
RAW images preserve more data than JPEGs, aiding in post-processing.
Mirrorless and DSLR cameras offer better low-light performance and lens flexibility.
Drones provide unique perspectives and programmable flight paths for capturing images.
360 cameras allow for quick scene capture but may require additional processing for 3D reconstruction.
Event cameras capture rapid changes in intensity, useful for robotics applications.
Thermal and multispectral cameras are specialized for specific applications, not typically used for 3D reconstruction.
Understanding camera metadata is crucial for effective image processing.
Choosing the right camera depends on the specific needs of the project.
Starting with a smartphone is a low barrier to entry for beginners in 3D reconstruction.
1h 16min
April 03, 2025 Understanding 3D Reconstruction with COLMAP
In this episode, Jonathan Stephens and Jared Heinly delve into the intricacies of COLMAP, a powerful tool for 3D reconstruction from images. They discuss the workflow of COLMAP, including feature extraction, correspondence search, incremental reconstruction, and the importance of camera models. The conversation also covers advanced topics like geometric verification, bundle adjustment, and the newer GLOMAP method, which offers a faster alternative to traditional reconstruction techniques. Listeners are encouraged to experiment with COLMAP and learn through hands-on experience.
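The workflow named above (feature extraction, correspondence search, incremental reconstruction) corresponds to three subcommands of the COLMAP command-line tool. The sketch below is an illustration and is not taken from the episode: the helper names `colmap_commands` and `run_pipeline` and the directory layout are hypothetical, and it assumes exhaustive matching, which is only one of the matching strategies COLMAP offers.

```python
import subprocess
from pathlib import Path


def colmap_commands(image_dir, workspace):
    """Build the COLMAP CLI calls for a basic sparse reconstruction.

    Hypothetical helper for illustration; the paths are placeholders.
    """
    workspace = Path(workspace)
    database = str(workspace / "database.db")
    sparse = str(workspace / "sparse")
    images = str(image_dir)
    return [
        # 1. Detect and describe local features in every image.
        ["colmap", "feature_extractor",
         "--database_path", database, "--image_path", images],
        # 2. Correspondence search: match features between all image pairs.
        ["colmap", "exhaustive_matcher", "--database_path", database],
        # 3. Incremental reconstruction: register images one by one and
        #    refine cameras and points as the model grows.
        ["colmap", "mapper",
         "--database_path", database, "--image_path", images,
         "--output_path", sparse],
    ]


def run_pipeline(image_dir, workspace):
    """Run the steps in order; requires COLMAP to be installed and on PATH."""
    Path(workspace, "sparse").mkdir(parents=True, exist_ok=True)
    for cmd in colmap_commands(image_dir, workspace):
        subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution makes it easy to print and inspect the steps before running them on a real image set.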
58min
March 04, 2025 Tips and Tricks for 3D Reconstruction in Different Environments
In this episode, we discuss practical tips and challenges in 3D reconstruction from images, focusing on various environments such as urban, indoor, and outdoor settings. We explore issues like repetitive structures, lighting conditions, and the impact of reflections and shadows on reconstruction quality. The conversation also touches on the importance of camera motion, lens distortion, and the role of machine learning in enhancing reconstruction processes. Listeners gain insights into optimizing their 3D capture techniques for better results.
Key Takeaways
Repetitive structures can confuse computer vision algorithms.
Lighting conditions greatly affect image quality and reconstruction accuracy.
Wide-angle lenses can help capture more unique features.
Indoor environments present unique challenges like textureless walls.
Aerial imaging requires careful management of lens distortion.
Understanding the application context is crucial for effective 3D reconstruction.
Camera motion should be varied to avoid distortion and drift.
Planning captures based on goals can lead to better results.

1h 22min
February 18, 2025 Exploring Depth Maps in Computer Vision
In this episode of Computer Vision Decoded, Jonathan Stephens and Jared Heinly explore the concept of depth maps in computer vision. They discuss the basics of depth and depth maps, their applications in smartphones, and the various types of depth maps. The conversation delves into the role of depth maps in photogrammetry and 3D reconstruction, as well as future trends in depth sensing and machine learning. The episode highlights the importance of depth maps in enhancing photography, gaming, and autonomous systems.
Key Takeaways:
Depth maps represent how far away objects are from a sensor.
Smartphones use depth maps for features like portrait mode.
There are multiple types of depth maps, including absolute and relative.
Depth maps are essential in photogrammetry for creating 3D models.
Machine learning is increasingly used for depth estimation.
Depth maps can be generated from various sensors, including LiDAR.
The resolution and baseline of cameras affect depth perception.
Depth maps are used in gaming for rendering and performance optimization.
Sensor fusion combines data from multiple sources for better accuracy.
The future of depth sensing will likely involve more machine learning applications.
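To make the takeaway about resolution and baseline concrete, here is a minimal sketch of how absolute depth follows from stereo disparity. It is not from the episode: it assumes a rectified stereo pair and a pinhole camera model, and the function names and the numbers in the usage lines are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters of a point seen by a rectified stereo pair.

    Uses the standard relation Z = f * B / d, where f is the focal length
    in pixels, B the distance between the two cameras in meters, and d the
    disparity in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


def depth_error(depth_m, focal_length_px, baseline_m, disparity_error_px=1.0):
    """Approximate depth uncertainty for a given disparity error.

    First-order estimate: dZ = Z^2 / (f * B) * dd. The error grows with the
    square of the distance and shrinks with a wider baseline or a longer
    focal length in pixels (that is, higher resolution).
    """
    return depth_m ** 2 / (focal_length_px * baseline_m) * disparity_error_px


# Hypothetical camera: 1000 px focal length, 10 cm baseline.
depth = depth_from_disparity(20.0, focal_length_px=1000.0, baseline_m=0.1)
narrow = depth_error(depth, 1000.0, 0.1)  # error with the 10 cm baseline
wide = depth_error(depth, 1000.0, 0.2)    # doubling the baseline halves it
```

A relative depth map, by contrast, only preserves the ordering or ratio of distances, so it cannot be converted to meters without extra information such as a known baseline.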

Episode Chapters
00:00 Introduction to Depth Maps
00:13 Understanding Depth in Computer Vision
06:52 Applications of Depth Maps in Photography
07:53 Types of Depth Maps Created by Smartphones
08:31 Depth Measurement Techniques
16:00 Machine Learning and Depth Estimation
19:18 Absolute vs Relative Depth Maps
23:14 Disparity Maps and Depth Ordering
26:53 Depth Maps in Graphics and Gaming
31:24 Depth Maps in Photogrammetry
34:12 Utilizing Depth Maps in 3D Reconstruction
37:51 Sensor Fusion and SLAM Technologies
41:31 Future Trends in Depth Sensing
46:37 Innovations in Computational Photography
58min
February 11, 2025 What's New in 2025 for Computer Vision?
After an 18-month hiatus, we are back! In this episode of Computer Vision Decoded, hosts Jonathan Stephens and Jared Heinly discuss the latest advancements in computer vision technology, personal updates, and insights from the industry. They explore topics such as real-time 3D reconstruction, computer vision research, SLAM, event cameras, and the impact of generative AI on robotics. The conversation highlights the importance of merging traditional techniques with modern machine learning approaches to solve real-world problems effectively.
Chapters
00:00 Intro & Personal Updates
04:36 Real-Time 3D Reconstruction on iPhones
09:40 Advancements in SfM
14:56 Event Cameras
17:39 Neural Networks in 3D Reconstruction
26:30 SLAM and Machine Learning Innovation
29:48 Applications of SLAM in Robotics
34:19 NVIDIA's Cosmos and Physical AI
40:18 Generative AI for Real-World Applications
43:50 The Future of Gaussian Splatting and 3D Reconstruction

51min
September 18, 2023 A Computer Vision Scientist Reacts to the iPhone 15 Announcement
In this episode of Computer Vision Decoded, we are going to dive into our in-house computer vision expert's reaction to the iPhone 15 and iPhone 15 Pro announcement.
We dive into the camera upgrades, decode what a quad sensor means, and even talk about the importance of depth maps.
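As a rough illustration of the idea behind a quad-pixel sensor, the sketch below combines each 2x2 group of photosite readings into one output pixel by averaging. This is a simplified assumption and not a description of Apple's pipeline: real sensors group same-colored photosites under a color filter array and apply much more processing, and the function name `bin_2x2` is hypothetical.

```python
def bin_2x2(pixels):
    """Average each 2x2 block of a grayscale image (a list of rows).

    The output has half the width and half the height of the input. Each
    output value pools four readings, which trades resolution for lower
    noise in dim light.
    """
    rows, cols = len(pixels), len(pixels[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    return [
        [
            (pixels[r][c] + pixels[r][c + 1]
             + pixels[r + 1][c] + pixels[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# A 4x4 readout becomes a 2x2 image.
binned = bin_2x2([
    [1, 3, 10, 10],
    [5, 7, 10, 10],
    [0, 0, 2, 4],
    [0, 0, 6, 8],
])
```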
Episode timeline:
00:00 Intro
02:59 iPhone 15 Overview
05:15 iPhone 15 Main Camera
07:20 Quad Pixel Sensor Explained
15:45 Depth Maps Explained
22:57 iPhone 15 Pro Overview
27:01 iPhone 15 Pro Cameras
32:20 Spatial Video
36:00 A17 Pro Chipset
43min
May 05, 2023 OpenMVG Decoded: Pierre Moulon's 10 Year Journey Building Open-Source Software
In this episode of Computer Vision Decoded, we are going to dive into Pierre Moulon's 10 years of experience building OpenMVG. We also cover the impact of open-source software in the computer vision industry and everything involved in building your own project. There is a lot to learn here!
Our episode guest, Pierre Moulon, is a computer vision research scientist and creator of OpenMVG, a library for computer vision scientists targeted at the Multiple View Geometry community.
The episode follows Pierre's journey building OpenMVG, which he wrote about in an article in his GitHub repository.
Explore OpenMVG on GitHub: https://github.com/openMVG/openMVG
Pierre's article on building OpenMVG: https://github.com/openMVG/openMVG/discussions/2165
Episode timeline:
00:00 Intro
01:00 Pierre Moulon's Background
04:40 What is OpenMVG?
08:43 What is the importance of open-source software for the computer vision community?
12:30 What to look for when deciding to use an open-source project
16:27 What is Multi View Geometry?
24:24 What was the biggest challenge building OpenMVG?
31:00 How do you grow a community around an open-source project?
38:09 Choosing a licensing model for your open-source project
43:07 Funding and sponsorship for your open-source project
46:46 Building an open-source project for your resume
49:53 How to get started with OpenMVG
Contact:
Follow Pierre Moulon on LinkedIn: https://www.linkedin.com/in/pierre-moulon/
Follow Jared Heinly on Twitter: https://twitter.com/JaredHeinly
Follow Jonathan Stephens on Twitter at: https://twitter.com/jonstephens85
56min
April 21, 2023 Understanding Implicit Neural Representations with Itzik Ben-Shabat
In this episode of Computer Vision Decoded, we are going to dive into implicit neural representations.
We are joined by Itzik Ben-Shabat, a Visiting Research Fellow at the Australian National University (ANU) and Technion – Israel Institute of Technology as well as the host of the Talking Paper Podcast.
You will gain a core understanding of implicit neural representations, key concepts and terminology, how they are being used in applications today, and Itzik's research into improving output with limited input data.
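An implicit representation, before any neural network is involved, describes a shape as the set of points where a function takes a particular value. A minimal sketch, assuming a sphere as the shape and a signed distance function as the representation (both choices are illustrative and not taken from the episode; the function name is hypothetical):

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from a 3D point to the surface of a sphere.

    Negative inside the sphere, zero on the surface, positive outside.
    The surface is never stored explicitly: it is the zero level set of
    this function.
    """
    return math.dist(point, center) - radius


inside = sphere_sdf((0.0, 0.0, 0.0))      # negative: the center is inside
on_surface = sphere_sdf((1.0, 0.0, 0.0))  # zero: exactly on the surface
outside = sphere_sdf((3.0, 0.0, 0.0))     # positive: outside the sphere
```

An implicit neural representation replaces the hand-written formula with a neural network trained to approximate such a function for shapes that have no closed-form expression.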
Episode timeline:
00:00 Intro
01:23 Overview of what implicit neural representations are
04:08 How INR compares and contrasts with a NeRF
08:17 Why did Itzik pursue this line of research
10:56 What is normalization and what are normals
13:13 Past research people should read to learn about the basics of INR
16:10 What is an implicit representation (without the neural network)
24:27 What is DiGS and what problem with INR does it solve?
35:54 What is OG-INR and what problem with INR does it solve?
40:43 What software can researchers use to understand INR?
49:15 What information should non-scientists focus on to learn about INR?
Itzik's Website: https://www.itzikbs.com/
Follow Itzik on Twitter: https://twitter.com/sitzikbs
Follow Itzik on LinkedIn: https://www.linkedin.com/in/yizhak-itzik-ben-shabat-67b3b1b7/
Talking Papers Podcast: https://talking.papers.podcast.itzikbs.com/
Follow Jared Heinly on Twitter: https://twitter.com/JaredHeinly
Follow Jonathan Stephens on Twitter at: https://twitter.com/jonstephens85
Referenced past episode - What is CVPR: https://share.transistor.fm/s/15edb19d
56min
March 16, 2023 From 2D to 3D: 4 Ways to Make a 3D Reconstruction from Imagery
In this episode of Computer Vision Decoded, we are going to dive into 4 different ways to 3D reconstruct a scene with images. Our cohost Jared Heinly, who holds a PhD in computer science and specializes in 3D reconstruction from images, will dive into the 4 distinct strategies and discuss the pros and cons of each.
Links to content shared in this episode:
Live SLAM to measure a stockpile with SR Measure: https://srmeasure.com/professional
Jared's notes on the iPhone LiDAR and SLAM: https://everypoint.medium.com/everypoint-gets-hands-on-with-apples-new-lidar-sensor-44eeb38db579
How to capture images for 3D reconstruction: https://youtu.be/AQfRdr_gZ8g
00:00 Intro
01:30 3D Reconstruction from Video
13:48 3D Reconstruction from Images
28:05 3D Reconstruction from Stereo Pairs
38:43 3D Reconstruction from SLAM
Follow Jared Heinly
Twitter: https://twitter.com/JaredHeinly
LinkedIn: https://www.linkedin.com/in/jheinly/
Follow Jonathan Stephens
Twitter: https://twitter.com/jonstephens85
LinkedIn: https://www.linkedin.com/in/jonathanstephens/
55min
