a job offer for a Tesla Autopilot Engineer position
Did you see it? Yes! While most of the skills are just related to coding, maths, or Linux, the only Computer Vision skills they're interested in are 3D Computer Vision skills!
And the list of skills can be terrifying: Intrinsic and Extrinsic Calibration; Homogeneous Coordinates; Projection Matrices; Epipolar Geometry; 3D Reconstruction; Structure From Motion; Visual Odometry…
These are the skills that make the difference between the 2D Engineers in the left image from before, and the 3D Engineers in the right one.
While 2D Engineers don't even know these skills exist and focus on YOLO, YOLO, YOLO, 3D Engineers build very advanced projects inside the coolest companies.
I decided to send this job offer to Matt, and highlight the part showing 3D. He seemed uninterested at first, but came back a few days later telling me he'd decided to put that on his list.
Is Matt going to learn 3D Computer Vision and immediately be hired at Tesla? I doubt it; landing a job there is unlikely for anyone. Yet if Matt built 3D Computer Vision skills, he could work in TONS of startups building 3D Computer Vision projects...
Like:
Introducing...
For 2D Engineers who want to take their Computer Vision skills to the next level and master the art of 3D.
MODULE 1 🆕
Nearly all Computer Vision Engineers know about image processing, but what about cameras? None of the 3D Computer Vision applications are possible if you don't have a SOLID background in cameras. Hence, our module 1 will focus on cameras, stereo geometry, and calibration.
What you'll learn:
Why ‘2D Engineer’ is becoming a career death sentence (And the 7 skills that got my students hired in Autonomous Vehicle Startups)
Why AV Companies use 5+ cameras for Perception (And How to Design Your Own Multi-Cam System)
Why 3D Reconstruction Fails When You Treat Cameras as Black Boxes (And the key camera principles you should know)
Master the 7 camera parameters (focal length, distortion coefficients, etc.) to feel in control of your camera knowledge
Why you should never set your camera in "rolling shutter" mode if you're working in high-speed environments
A short analysis of the OAK-D camera's datasheet and the exact parameters you need to pay attention to when purchasing a camera for a company
Understand exactly how to fix image distortion (like the GoPro effect) and dive into camera distortion parameters.
How to convert any point in the world into an image pixel, and the geometrical process behind intrinsic and extrinsic camera calibration (a short sketch follows this list).
Homogeneous Coordinates Simply Explained: A deep dive into Homogeneous Coordinates and Projection Matrices (some heavy maths are involved, but we'll go step by step without ever skipping a line)
How a wrong focal length can crash an autonomous rover (demo included), and the 20% of parameters to know to fix 80% of mistakes
Why some companies sometimes use 10+ calibration chessboards to calibrate the same camera (and why you should too if your project gets serious!)
🧑🏻💻 Monocular and Stereo Camera Calibration Workshop: Implement Camera Calibration with OpenCV, up to 3D Reconstruction
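To give you a taste of that world-to-pixel conversion already, here's a minimal NumPy sketch of the projection pipeline. The intrinsics, rotation, and translation below are made-up values just for illustration; in the course, they come from real calibration:

```python
import numpy as np

# Hypothetical intrinsics: focal length 1000 px, principal point at (640, 360)
K = np.array([[1000.,    0., 640.],
              [   0., 1000., 360.],
              [   0.,    0.,   1.]])

# Hypothetical extrinsics: no rotation, camera shifted 1.5 m along Y
R = np.eye(3)
t = np.array([[0.0], [-1.5], [0.0]])

P = K @ np.hstack([R, t])                   # 3x4 projection matrix

X_world = np.array([2.0, 0.0, 10.0, 1.0])   # homogeneous 3D point (metres)
x = P @ X_world                             # homogeneous pixel coordinates
u, v = x[0] / x[2], x[1] / x[2]             # perspective division
print(u, v)                                 # -> 840.0 210.0
```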
Okay, let's take a break, because what comes next is important.
Most engineers think calibrating a camera is straightforward and has little impact. Yet once you're actually in the field, you realize how important it is to have the right board and the right calibration in place.
If your calibration is bad, your 3D Reconstruction (and other advanced code) will be impacted:
In the NEW calibration workshop, revamped from scratch in 2025, you can now visualize the impact of your calibration on 3D Reconstruction.
This is very different from the usual approach: calibration is now explained intuitively.
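If you've never calibrated a camera before, here is roughly what the classic OpenCV flow looks like. It's a minimal sketch assuming a 9x6 chessboard and a hypothetical folder of calibration images; the workshop goes much further, up to stereo calibration and 3D Reconstruction:

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)                     # assumed 9x6 inner corners, 25 mm squares
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 0.025

obj_points, img_points = [], []
for path in glob.glob("calib_images/*.png"):          # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K = intrinsic matrix, dist = distortion coefficients, ret = RMS reprojection error
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error:", ret)
```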
After this, we're gonna have some fun with 3D Computer Vision...
MODULE 2
How to implement Depth Estimation with Stereo Cameras?
What you'll learn:
How to get the 3D coordinates of any pixel using just two cameras (this course is about 3D more than anything else)
The dead-simple high school geometry behind Stereo Vision (and why similar triangles can be more useful than you thought)
The only known way to do Monocular Depth Estimation (and 3 examples of algorithms implementing it really well)
What mathematically happens when you add a second camera to your setup (you'll see a lot of similarities with the triangulation using a compass you did in high school)
What is the disparity of a stereo setup and how to convert it into a depth measurement
Let's pause for a second here and explain what I mean by that:
Because you're using 2 cameras, you want to retrieve the position of the same point in the two respective images. And for that, you're going to look at the same scene from two different angles (cameras).
Take, for example, the point in the side mirror of the left and right image below:
Notice how it's visible in both images, but isn't located at the same position. In the left image, it's at x = 300, but in the right image, it's at x = 250.
This difference of 50 pixels is called the disparity, and if you can find it super accurately, you're going to get incredible Depth Map results.
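And that's the whole magic: once you have the disparity, depth is one line of maths, depth = focal length x baseline / disparity. A tiny sketch with made-up numbers (your real focal length and baseline come from calibration):

```python
focal_px = 1000.0    # hypothetical focal length from calibration, in pixels
baseline_m = 0.12    # hypothetical distance between the two cameras, in metres
disparity_px = 50.0  # the 300 - 250 difference measured on the side mirror

depth_m = focal_px * baseline_m / disparity_px
print(depth_m)       # -> 2.4, so the mirror would be about 2.4 metres away
```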
Speaking about disparity, you'll also learn:
A stereo calibration trick to accelerate your disparity estimation speed by 90%
How to use Block Matching to compute fast and smooth depth maps.
Local Stereo Matching: How the StereoBM algorithm works, and what's behind most local stereo matching algorithms
The subtle difference between local, semi-global, and global stereo matching (and how to know which algorithm to use based on the problem you're solving)
How to implement the StereoSGBM algorithm and choose the best parameters (a short sketch appears at the end of this module).
Finally understand the principles behind Epipolar Lines (and how they can be used for 3D Reconstruction)
Deep Stereo Vision: A special look at 5 Deep Learning techniques for Stereo Matching (Warning ⚠️: This is super advanced and requires a good understanding of Deep Learning)
The surprising reason why Face Recognition can be used to build Depth Maps (and an in-depth analysis of a Deep Stereo Vision Algorithm doing it)
PSM-Net: How to use Pyramid Neural Networks to estimate the depth of any stereo pair.
What is a Stacked Hourglass network, and the reason why everybody suddenly started using it in Neural Networks.
Why researchers use 3D CNNs in disparity estimation (3D CNNs are known to be super slow, but they still perform well when included in specific architectures).
The actual algorithm a client of the v1 of this course implemented to run Stereo Vision ultra-fast on mobile devices.
What is a multi-stage Deep Learning architecture and the hidden benefits of using it for Disparity Estimation
RAFT-Stereo: A subtle change you can make to convert the RAFT algorithm, doing optical flow estimation, into a Stereo Vision algorithm.
Let's pause here for a second. The RAFT algorithm is something I teach in detail in my Optical Flow course, and it allows you to find the "Flow Map" from a pair of two consecutive images.
But if you adapt this algorithm, you can find the "Disparity Map" from a pair of two parallel images.
We're going to take a closer look at the adjustments you need to make to turn it into a state-of-the-art algorithm for Deep Disparity Estimation.
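To give you the intuition already: on a rectified stereo pair, every match sits on the same image row, so the vertical part of the flow is (ideally) zero and the disparity is just the negative horizontal flow. Here is a minimal sketch of that idea, assuming you already have a dense left-to-right flow map from a RAFT-style network:

```python
import numpy as np

# Hypothetical dense flow map of shape (H, W, 2): per-pixel (dx, dy) displacement
flow = np.zeros((720, 1280, 2), dtype=np.float32)

# On a rectified pair, a left-image pixel moves LEFT in the right image,
# so dx is negative and the disparity is simply -dx (clipped to stay positive)
disparity = np.clip(-flow[..., 0], 0.0, None)
```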
Finally:
CRE-Stereo: How to run the CRE-Stereo Algorithm on your computer and build ultra-accurate depth maps (bonus: a ready-to-run notebook I implemented to help you run the CRE-Stereo algorithm on any pair of images)
And many more...
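Before we jump into Module 3, here's the short StereoSGBM sketch promised in the bullet list above. The parameter values are just illustrative defaults for an assumed rectified pair; choosing the right ones is exactly what the module teaches:

```python
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical rectified pair
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# numDisparities must be a multiple of 16; blockSize is the matching window size
matcher = cv2.StereoSGBM_create(minDisparity=0,
                                numDisparities=128,
                                blockSize=5)

# OpenCV returns disparities as 16x fixed-point values, hence the division
disparity = matcher.compute(left, right).astype("float32") / 16.0
```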
MODULE 3
The final part will be about 3D Object Detection, Depth Estimation, and 3D Reconstruction.
What you'll learn:
How to engineer a Point Cloud reconstruction using just 2 cameras (this technique is used by Tesla to drive without any LiDAR at all!)
5 Steps you need to implement in any 3D Computer Vision project.
The best disparity estimation algorithm to implement for a self-driving car scenario, and how to tweak the parameters.
How to read a vehicle calibration file and how to map your car schematics with actual code.
A simple formula to convert a disparity map into a depth map and get the 3D coordinates of any pixel.
A simple way to run a YOLO object detection algorithm in 3 minutes (you'll then be able to use these 27 lines of code in any project as a black box).
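These are obviously not the 27 lines from the course, but to show how "black box" it can be, here's a minimal sketch using the ultralytics package (one possible off-the-shelf option):

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")            # pretrained weights, downloaded automatically
results = model("left_image.png")     # hypothetical left camera frame

for box in results[0].boxes.xyxy:     # one (x1, y1, x2, y2) box per detected object
    print(box.tolist())
```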
Wait, if you combine the two last bullet points, you actually get something cool:
(2+1)D Object Detection
It's like, 2D Object Detection with Depth... And it looks like this:
This means that once you've been through this project, you'll be able to get an accurate depth for the objects in any 2D scene! It's not quite "3D" boxes yet, but in the course, I'll also tell you:
What you need to add to find complete 3D Bounding Boxes
An under-used technique to colorize your 3D reconstructed point clouds using image-to-point fusion.
🧑🏻💻 3D PERCEPTION PROJECT: How to build a complete 3D Perception pipeline on real self-driving car images.
The course will contain many more concepts. But this last one is really neat. At the end of the course, you'll be able to build a complete 3D Reconstruction project for a highway scene:
And here is the output you'll produce:
So, this is what's taught in the course, but that's not all...
Ashwin Rajendran, Robotics Software Engineer at BeeX
The cost is one reason which made me think a lot about buying the course. There is often a discount or offer due to which I hesitate to buy the course immediately and wait for the offer.
The course contents are really good, and it helps me to track my progress in my specific field. I was able to identify where I stand in technical knowledge by learning through the course.
I loved the easy learning via Jupyter Notebooks, dividing each section to explain the contents properly. I also loved the flow chart of the methodologies, how Jeremy explained the mathematical concepts before going to the projects, and the linking of one course to another that is helping learn how to solve real-world problems.
I recommend this product, as it helps to gain the necessary skills to be a robotics or computer vision engineer.
A year after completing the first version of the Stereo Vision course, I started to receive requests from a small group of students who wanted to go even further.
After all, there was so much about 3D Computer Vision we could still discover, such as working with real cameras or moving to Multi-View Reconstruction. These elements felt like they could help the students who really wanted to build the best 3D Computer Vision profile possible, and this is why I stole a well-known idea from the video games industry: the DLCs!
In a video game, you have the standard game, and then you have additional chapters if you want to explore specific topics. These chapters are not essential to everybody, but they can tremendously enrich the experience of those who want to take them.
So, meet my 2 additional Stereo Vision DLCs:
DLC #1
Here's what's included:
DLC #2
What's covered:
How 2-View Reconstruction Works, and how to extend it to 3+ View Reconstruction (Warning: it's not as simple as doing multiple projections, because a lot of overlap might happen)
A 3-Step Technique to reconstruct 3D objects from multiple images using disparity estimation and image rectification.
5 Examples of 3D Reconstruction Software (free and paid) on the market (and how they're built)
The subtle difference between Multi-View Stereo, 3D Reconstruction, Structure From Motion, and Visual SLAM...
Where to find multi-view stereo vision datasets
7 different types of topologies you should consider before running any 3D Reconstruction project
How to combine several reconstructed point clouds into a single one, and "polish" your 3D reconstruction output!
The 2 Types of Structure From Motion Algorithms (and which one every company is going after)
A special look into the 8-Point Algorithm; helping recover the relative positions of 2 images from feature matching
Why it's so important to calculate the Essential & Fundamental Matrices, and how to use them to retrieve camera positions from just images (you'll actually be able to look at 2 images and know exactly the relative position of the cameras that took them; a short sketch follows this list)
A dive into the detailed "from scratch" implementation of a Structure From Motion Algorithm doing 3D Reconstruction (and how to master the essential steps of Structure From Motion in any scenario)
The subtle art of Point Cloud Alignment, and how to use it to remove reprojection errors in your outputs.
And a whole lot more...
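And to make the Essential Matrix bullet above a little more concrete, here's a minimal OpenCV sketch of recovering the relative pose of two cameras, assuming you already have matched keypoints (e.g. from a feature matcher) and the intrinsic matrix K from calibration:

```python
import cv2

def relative_pose(pts_left, pts_right, K):
    """Estimate the rotation and (unit-scale) translation between two views.

    pts_left, pts_right: Nx2 arrays of matched pixel coordinates (hypothetical inputs)
    K: 3x3 intrinsic matrix from calibration
    """
    E, inliers = cv2.findEssentialMat(pts_left, pts_right, K,
                                      method=cv2.RANSAC, threshold=1.0)
    # Decompose E into R and t (translation is only known up to scale)
    _, R, t, _ = cv2.recoverPose(E, pts_left, pts_right, K, mask=inliers)
    return R, t
```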
These DLCs are optional, but can significantly push the limits of what you can do.
"The title of the course was very advanced and it did not disappoint. I have learned computer vision but was looking to upgrade my skills to the next level in CV.
I love the workshops! It only asks you to code the advanced topics, fills all the boring boilerplate, and takes about a day (or a weekend). I feel it's a good mix for people who don't want to spend weeks figuring out concepts in a project.
This course exposed me to an unknown world of 3D computer vision. It forced me to read topics that I would normally resist, and I will say it helps you differentiate yourself in the current job market."
Wai-Ting Li, Stereo Vision Edgeneer
But I was wrong.
The course indeed dives very deep into the maths behind stereo vision. The maths content exceeded my expectations so much that I even skipped a few parts so I could focus only on what I'm ready to apply in my work.
I liked the written recap under each video. I could easily revisit the main takeaways from each topic, and especially revisit the equations for the stereo vision-related projects in my work.
3 things I liked?
The workshop and project really help me to consolidate the knowledge.
The mindmap recap helps me to understand where I'm currently at in the big picture.
The step-by-step derivation helps me to understand the derivation of equations very well.
I would recommend it to any computer vision engineer who has no prior stereo vision experience but needs the knowledge to get started on stereo vision quickly, since the course includes both the principles/theory and hands-on workshops for engineers to understand the basics of stereo vision and prepare for more advanced applications.
I believe this product is a perfect introduction to stereo vision and to the exploration of deep learning in stereo vision; there's not much I could add.
Not only do you have lifetime access, but you also get frequent updates inside the course. Take this course, for example: it has already been updated 3 times since launch, and a lot of content (such as Deep Stereo Vision) keeps being added as the course gets updated.
So yes, I have a strong commitment to lifetime access.
Danish Faraaz, new Computer Vision Engineer at Ford (after taking the Stereo Vision course)
I joined the course because Stereo Vision is a very niche concept that very few people teach; so I would have definitely purchased the course no matter what. The only thing that might have delayed my purchase would have been the price. I am sure this would be the case for students and other low income people. But considering the long term benefits to my portfolio, I would have definitely bought the course.
Like you mentioned, I realized that even with all my computer vision skills, my understanding of how cameras actually work was not that good. Your explanation of the theoretical concepts and coding was really good.
While working on the final project, I still was not sure what I was coding. But then when I got to the end of the notebook and had to code the pipeline, everything fell into place and I was able to clearly understand what each function was doing. So I really liked how you had us work on each function first and then code the pipeline to connect everything.
Some other benefits would be my updated understanding of how cameras work and the need for stereo vision. I also got a little taste of the 3d reconstruction which I have been wanting to learn about for quite some time. Finally, this course has made me want to delve more into other aspects of computer vision such as point clouds, sensor fusion etc. I am sure I would be coming back to purchase more of your courses after I am done completing the PLC masterclass and the OAK-D playbook.
I would definitely recommend this course to any of my juniors if they want to dive deeper into autonomous systems. The fundamentals that I learnt in this course can not be received elsewhere."
PICK YOUR ADVENTURE
* for all courses, 2-time payment plans are available at no extra cost
CORE
✔️ STEREO VISION
—
299€
BUILDER
✔️ STEREO VISION (299€)
✔️ OAK-D PLAYBOOK (59€)
—
329€
ADVANCED
✔️ STEREO VISION (299€)
✔️ OAK-D PLAYBOOK (59€)
✔️ 3D EXPLORATION (159€)
—
379€