a job offer for a Tesla Autopilot Engineer position
Did you see it? Yes! While most of the skills are just related to coding, maths, or Linux, the only Computer Vision skills they're interested in are 3D Computer Vision skills!
And the list of skills can be terrifying: Intrinsic and Extrinsic Calibration; Homogeneous Coordinates; Projection Matrices; Epipolar Geometry; 3D Reconstruction; Structure From Motion; Visual Odometry…
These are the skills that make the difference between the 2D Engineers in the left image from before, and the 3D Engineers in the right one.
While 2D Engineers don't even know these skills exist and focus on YOLO, YOLO, YOLO, 3D Engineers build very advanced projects inside the coolest companies.
I decided to send this job offer to Matt, and highlight the part showing 3D. He seemed uninterested at first, but came back a few days later telling me he'd decided to put that on his list.
Is Matt going to learn 3D Computer Vision and immediately be hired at Tesla? I doubt it; landing a job there is unlikely for anyone. Yet if Matt built 3D Computer Vision skills, he could work in TONS of startups building 3D Computer Vision projects...
Like:
Introducing...
For 2D Engineers who want to take their Computer Vision skills to the next level and master the art of 3D.
MODULE 1 🆕
Nearly all Computer Vision Engineers know about image processing, but what about cameras? None of the 3D Computer Vision applications are possible if you don't have a SOLID background in cameras. Hence, our module 1 will focus on cameras, stereo geometry, and calibration.
What you'll learn:
Why ‘2D Engineer’ is becoming a career death sentence (And the 7 skills that got my students hired in Autonomous Vehicle Startups)
Why AV Companies use 5+ cameras for Perception (And How to Design Your Own Multi-Cam System)
Why 3D Reconstruction Fails When You Treat Cameras as Black Boxes (And the key camera principles you should know)
Master the 7 camera parameters (focal length, distortion coefficients, etc.) to feel in control of your camera knowledge
Why you should never set your camera in "rolling shutter" mode if you're working in high-speed environments
A short analysis of the OAK-D camera's datasheet and the exact parameters you need to pay attention to when purchasing a camera for a company
Understand exactly how to fix image distortion (like the GoPro effect) and dive into camera distortion parameters.
How to convert any point in the world into an image pixel, and the geometrical process behind intrinsic and extrinsic camera calibration (a short sketch follows this list).
Homogeneous Coordinates Simply Explained: A deep dive into Homogeneous Coordinates and Projection Matrices (some heavy maths are involved, but we'll go step by step without ever skipping a line)
How a wrong focal length can crash an autonomous rover (demo included), and the 20% of parameters to know to fix 80% of mistakes
Why some companies sometimes use 10+ calibration chessboards to calibrate the same camera (and why you should too if your project gets serious!)
🧑🏻💻 Monocular and Stereo Camera Calibration Workshop: Implement Camera Calibration with OpenCV, up to 3D Reconstruction
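To give you a taste of that world-to-pixel conversion already, here's a minimal NumPy sketch of the projection pipeline. The intrinsics, rotation, and translation below are made-up values just for illustration; in the course, they come from real calibration:

```python
import numpy as np

# Hypothetical intrinsics: focal length 1000 px, principal point at (640, 360)
K = np.array([[1000.,    0., 640.],
              [   0., 1000., 360.],
              [   0.,    0.,   1.]])

# Hypothetical extrinsics: no rotation, camera shifted 1.5 m along Y
R = np.eye(3)
t = np.array([[0.0], [-1.5], [0.0]])

P = K @ np.hstack([R, t])                   # 3x4 projection matrix

X_world = np.array([2.0, 0.0, 10.0, 1.0])   # homogeneous 3D point (metres)
x = P @ X_world                             # homogeneous pixel coordinates
u, v = x[0] / x[2], x[1] / x[2]             # perspective division
print(u, v)                                 # -> 840.0 210.0
```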
Okay, let's take a break, because what comes next is important.
Most engineers think calibrating a camera is straightforward and has little impact. Yet once you're actually in the field, you realize how important it is to have the right board and the right calibration in place.
If your calibration is bad, your 3D Reconstruction (and other advanced code) will be impacted:
In the NEW calibration workshop, revamped from scratch in 2025, you can now visualize the impact of your calibration on 3D Reconstruction.
This is very different from the usual approach: calibration is now explained intuitively.
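If you've never calibrated a camera before, here is roughly what the classic OpenCV flow looks like. It's a minimal sketch assuming a 9x6 chessboard and a hypothetical folder of calibration images; the workshop goes much further, up to stereo calibration and 3D Reconstruction:

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)                     # assumed 9x6 inner corners, 25 mm squares
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 0.025

obj_points, img_points = [], []
for path in glob.glob("calib_images/*.png"):          # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K = intrinsic matrix, dist = distortion coefficients, ret = RMS reprojection error
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error:", ret)
```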
After this, we're gonna have some fun with 3D Computer Vision...
MODULE 2
How to implement Depth Estimation with Stereo Cameras?
What you'll learn:
How to get the 3D coordinates of any pixel using just two cameras (this course is about 3D more than anything else)
The dead-simple high school geometry behind Stereo Vision (and why similar triangles can be more useful than you thought)
The only known way to do Monocular Depth Estimation (and 3 examples of algorithms implementing it really well)
What mathematically happens when you add a second camera to your setup (you'll see a lot of similarities with the triangulation using a compass you did in high school)
What is the disparity of a stereo setup and how to convert it into a depth measurement
Let's pause for a second here and explain what I mean by that:
Because you're using 2 cameras, you want to retrieve the position of the same point in the two respective images. And for that, you're going to look at the same scene from two different angles (cameras).
Take, for example, the point in the side mirror of the left and right image below:
Notice how it's visible in both images, but isn't located at the same position. In the left image, it's at x = 300, but in the right image, it's at x = 250.
This difference of 50 pixels is called the disparity, and if you can find it super accurately, you're going to get incredible Depth Map results.
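And that's the whole magic: once you have the disparity, depth is one line of maths, depth = focal length x baseline / disparity. A tiny sketch with made-up numbers (your real focal length and baseline come from calibration):

```python
focal_px = 1000.0    # hypothetical focal length from calibration, in pixels
baseline_m = 0.12    # hypothetical distance between the two cameras, in metres
disparity_px = 50.0  # the 300 - 250 difference measured on the side mirror

depth_m = focal_px * baseline_m / disparity_px
print(depth_m)       # -> 2.4, so the mirror would be about 2.4 metres away
```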
Speaking about disparity, you'll also learn:
A stereo calibration trick to accelerate your disparity estimation speed by 90%
How to use Block Matching to compute fast and smooth depth maps.
Local Stereo Matching: How the StereoBM algorithm works, and what's behind most local stereo matching algorithms
The subtle difference between local, semi-global, and global stereo matching (and how to know which algorithm to use based on the problem you're solving)
How to implement the StereoSGBM algorithm and choose the best parameters (a short sketch appears at the end of this module).
Finally understand the principles behind Epipolar Lines (and how they can be used for 3D Reconstruction)
Deep Stereo Vision: A special look at 5 Deep Learning techniques for Stereo Matching (Warning ⚠️: This is super advanced and requires a good understanding of Deep Learning)
The surprising reason why Face Recognition can be used to build Depth Maps (and an in-depth analysis of a Deep Stereo Vision Algorithm doing it)
PSM-Net: How to use Pyramid Neural Networks to estimate the depth of any stereo pair.
What is a Stacked Hourglass network, and the reason why everybody suddenly started using it in Neural Networks.
Why researchers use 3D CNNs in disparity estimation (3D CNNs are known to be super slow, but they still perform well when included in specific architectures).
The actual algorithm a client of the v1 of this course implemented to run Stereo Vision ultra-fast on mobile devices.
What is a multi-stage Deep Learning architecture and the hidden benefits of using it for Disparity Estimation
RAFT-Stereo: A subtle change you can make to convert the RAFT algorithm, doing optical flow estimation, into a Stereo Vision algorithm.
Let's pause here for a second. The RAFT algorithm is something I teach in detail in my Optical Flow course, and it allows you to find the "Flow Map" from a pair of two consecutive images.
But if you adapt this algorithm, you can find the "Disparity Map" from a pair of two parallel images.
We're going to take a closer look at the adjustments you need to make to turn it into a state-of-the-art algorithm for Deep Disparity Estimation.
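To give you the intuition already: on a rectified stereo pair, every match sits on the same image row, so the vertical part of the flow is (ideally) zero and the disparity is just the negative horizontal flow. Here is a minimal sketch of that idea, assuming you already have a dense left-to-right flow map from a RAFT-style network:

```python
import numpy as np

# Hypothetical dense flow map of shape (H, W, 2): per-pixel (dx, dy) displacement
flow = np.zeros((720, 1280, 2), dtype=np.float32)

# On a rectified pair, a left-image pixel moves LEFT in the right image,
# so dx is negative and the disparity is simply -dx (clipped to stay positive)
disparity = np.clip(-flow[..., 0], 0.0, None)
```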
Finally:
CRE-Stereo: How to run the CRE-Stereo Algorithm on your computer and build ultra-accurate depth maps (bonus: a ready-to-run notebook I implemented to help you run the CRE-Stereo algorithm on any pair of images)
And many more...
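Before we jump into Module 3, here's the short StereoSGBM sketch promised in the bullet list above. The parameter values are just illustrative defaults for an assumed rectified pair; choosing the right ones is exactly what the module teaches:

```python
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical rectified pair
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# numDisparities must be a multiple of 16; blockSize is the matching window size
matcher = cv2.StereoSGBM_create(minDisparity=0,
                                numDisparities=128,
                                blockSize=5)

# OpenCV returns disparities as 16x fixed-point values, hence the division
disparity = matcher.compute(left, right).astype("float32") / 16.0
```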
MODULE 3
The final part will be about 3D Object Detection, Depth Estimation, and 3D Reconstruction.
What you'll learn:
How to engineer a Point Cloud reconstruction using just 2 cameras (this technique is used by Tesla to drive without any LiDAR at all!)
5 Steps you need to implement in any 3D Computer Vision project.
The best disparity estimation algorithm to implement for a self-driving car scenario, and how to tweak the parameters.
How to read a vehicle calibration file and how to map your car schematics with actual code.
A simple formula to convert a disparity map into a depth map and get the 3D coordinates of any pixel.
A simple way to run a YOLO object detection algorithm in 3 minutes (you'll then be able to use these 27 lines of code in any project as a black box).
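These are obviously not the 27 lines from the course, but to show how "black box" it can be, here's a minimal sketch using the ultralytics package (one possible off-the-shelf option):

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")            # pretrained weights, downloaded automatically
results = model("left_image.png")     # hypothetical left camera frame

for box in results[0].boxes.xyxy:     # one (x1, y1, x2, y2) box per detected object
    print(box.tolist())
```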
Wait, if you combine the two last bullet points, you actually get something cool:
(2+1)D Object Detection
It's like, 2D Object Detection with Depth... And it looks like this:
This means that once you've been through this project, you'll be able to get an accurate depth for the objects in any 2D scene! It's not quite "3D" boxes yet, but in the course, I'll also tell you:
What you need to add to find complete 3D Bounding Boxes
An under-used technique to colorize your 3D reconstructed point clouds using image-to-point fusion.
🧑🏻💻 3D PERCEPTION PROJECT: How to build a complete 3D Perception pipeline on real self-driving car images.
The course will contain many more concepts. But this last one is really neat. At the end of the course, you'll be able to build a complete 3D Reconstruction project for a highway scene:
And here is the output you'll produce:
So, this is what's taught in the course, but that's not all...
Ashwin Rajendran, Robotics Software Engineer at BeeX
The cost is one reason which made me think a lot about buying the course. There is often a discount or offer due to which I hesitate to buy the course immediately and wait for the offer.
The course contents are really good, and it helps me to track my progress in my specific field. I was able to identify where I stand in technical knowledge by learning through the course.
I loved the easy learning via Jupyter Notebooks, dividing each section to explain the contents properly. I also loved the flow chart of the methodologies, how Jeremy explained the mathematical concepts before going to the projects, and the linking of one course to another that is helping learn how to solve real-world problems.
I recommend this product, as it helps to gain the necessary skills to be a robotics or computer vision engineer.
A year after completing the first version of the Stereo Vision course, I started to receive requests from a small group of students who wanted to go even further.
After all, there was so much about 3D Computer Vision we could still discover, such as working with real cameras or moving to Multi-View Reconstruction. These elements felt like they could help the students who really wanted to build the best 3D Computer Vision profile possible, and this is why I stole a well-known idea from the video games industry: the DLCs!
In a video game, you have the standard game, and then you have additional chapters if you want to explore specific topics. These chapters are not essential to everybody, but they can tremendously enrich the experience of those who want to take them.
So, meet my 2 additional Stereo Vision DLCs:
DLC #1
Here's what's included:
DLC #2
What's covered:
How 2-View Reconstruction Works, and how to extend it to 3+ View Reconstruction (Warning: it's not as simple as doing multiple projections, because a lot of overlap might happen)
A 3-Step Technique to reconstruct 3D objects from multiple images using disparity estimation and image rectification.
5 Examples of 3D Reconstruction Software (free and paid) on the market (and how they're built)
The subtle difference between Multi-View Stereo, 3D Reconstruction, Structure From Motion, and Visual SLAM...
Where to find multi-view stereo vision datasets
7 different types of topologies you should consider before running any 3D Reconstruction project
How to combine several reconstructed point clouds into a single one, and "polish" your 3D reconstruction output!
The 2 Types of Structure From Motion Algorithms (and which one every company is going after)
A special look into the 8-Point Algorithm; helping recover the relative positions of 2 images from feature matching
Why it's so important to calculate the Essential & Fundamental Matrices, and how to use them to retrieve camera positions from just images (you'll actually be able to look at 2 images and know exactly the relative position of the cameras that took them; a short sketch follows this list)
A dive into the detailed "from scratch" implementation of a Structure From Motion Algorithm doing 3D Reconstruction (and how to master the essential steps of Structure From Motion in any scenario)
The subtle art of Point Cloud Alignment, and how to use it to remove reprojection errors in your outputs.
And a whole lot more...
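And to make the Essential Matrix bullet above a little more concrete, here's a minimal OpenCV sketch of recovering the relative pose of two cameras, assuming you already have matched keypoints (e.g. from a feature matcher) and the intrinsic matrix K from calibration:

```python
import cv2

def relative_pose(pts_left, pts_right, K):
    """Estimate the rotation and (unit-scale) translation between two views.

    pts_left, pts_right: Nx2 arrays of matched pixel coordinates (hypothetical inputs)
    K: 3x3 intrinsic matrix from calibration
    """
    E, inliers = cv2.findEssentialMat(pts_left, pts_right, K,
                                      method=cv2.RANSAC, threshold=1.0)
    # Decompose E into R and t (translation is only known up to scale)
    _, R, t, _ = cv2.recoverPose(E, pts_left, pts_right, K, mask=inliers)
    return R, t
```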
These DLCs are optional, but can significantly push the limits of what you can do.
"The title of the course was very advanced and it did not disappoint. I have learned computer vision but was looking to upgrade my skills to the next level in CV.
I love the workshops! It only asks you to code the advanced topics, fills all the boring boilerplate, and takes about a day (or a weekend). I feel it's a good mix for people who don't want to spend weeks figuring out concepts in a project.
This course exposed me to an unknown world of 3D computer vision. It forced me to read topics that I would normally resist, and I will say it helps you differentiate yourself in the current job market."
Wai-Ting Li, Stereo Vision Edgeneer
But I was wrong.
The course indeed dives very deep into the maths behind stereo vision. The maths content exceeded my expectations so much that I even skipped a few parts so I could focus only on what I'm ready to apply in my work.
I liked the written recap under each video. I could easily revisit the main takeaways from each topic, and especially revisit the equations for the stereo vision-related projects in my work.
3 things I liked?
The workshop and project really help me to consolidate the knowledge.
The mindmap recap helps me to understand where I'm currently at in the big picture.
The step-by-step derivation helps me to understand the derivation of equations very well.
I would recommend it to any computer vision engineer who has no prior stereo vision experience but needs the knowledge to get started on stereo vision quickly, since the course includes both the principles/theory and hands-on workshops for engineers to understand the basics of stereo vision and prepare for more advanced applications.
I believe this product is a perfect introduction to stereo vision and to the exploration of deep learning in stereo vision; there's not much I could add.
Not only do you have lifetime access, but you also get frequent updates inside the course. Take this course, for example: it has already been updated 3 times since launch, and a lot of content (such as Deep Stereo Vision) keeps being added as the course gets updated.
So yes, I have a strong commitment to lifetime access.
Danish Faraaz, new Computer Vision Engineer at Ford (after taking the Stereo Vision course)
I joined the course because Stereo Vision is a very niche concept that very few people teach; so I would have definitely purchased the course no matter what. The only thing that might have delayed my purchase would have been the price. I am sure this would be the case for students and other low income people. But considering the long term benefits to my portfolio, I would have definitely bought the course.
Like you mentioned, I realized that even with all my computer vision skills, my understanding of how cameras actually work was not that good. Your explanation of the theoretical concepts and coding was really good.
While working on the final project, I still was not sure what I was coding. But then when I got to the end of the notebook and had to code the pipeline, everything fell into place and I was able to clearly understand what each function was doing. So I really liked how you had us work on each function first and then code the pipeline to connect everything.
Some other benefits would be my updated understanding of how cameras work and the need for stereo vision. I also got a little taste of the 3d reconstruction which I have been wanting to learn about for quite some time. Finally, this course has made me want to delve more into other aspects of computer vision such as point clouds, sensor fusion etc. I am sure I would be coming back to purchase more of your courses after I am done completing the PLC masterclass and the OAK-D playbook.
I would definitely recommend this course to any of my juniors if they want to dive deeper into autonomous systems. The fundamentals that I learnt in this course can not be received elsewhere."
PICK YOUR ADVENTURE
* for all courses, 2-time payment plans are available at no extra cost
CORE
✔️ STEREO VISION
—
299€
BUILDER
✔️ STEREO VISION (299€)
✔️ OAK-D PLAYBOOK (59€)
—
329€
ADVANCED
✔️ STEREO VISION (299€)
✔️ OAK-D PLAYBOOK (59€)
✔️ 3D EXPLORATION (159€)
—
379€