LEARN IMAGE SEGMENTATION: Modern Deep Learning Approaches for Self-Driving Cars

Get your hands dirty coding modern deep learning blocks and learn to build state-of-the-art image segmentation models for robotics, healthcare, and more...

How to Build State-of-the-Art Image Segmentation Models and Understand Any New Deep Learning Architecture!

To the attention of all aspiring Computer Vision Experts:
If you feel overwhelmed every time a new Deep Learning architecture gets released, and want to build strong image segmentation skills you can reuse in many different fields, then read on:

The v2 of my course "IMAGE SEGMENTATION" just got released, and it teaches image segmentation in a completely new way that will also greatly benefit your Deep Learning skills.
Image segmentation has an incredible number of applications, such as:
  • Self-Driving Cars and Robotics
  • Healthcare & Medical Imaging
  • Satellite Imaging & Aerial Navigation
  • Background Removal / Blurring
  • and many more...
But this isn't the only reason Computer Vision Engineers should learn it:

If you look at how Computer Vision Engineers go from 0 to expert, you'll find that there are 3 main levels at which they may plateau: Deep Learning, 3D, and Time.

And most engineers stay stuck at the "Deep Learning level".

Not because they don't understand it.
But because they're overwhelmed every time a new architecture gets released.
Where does this overwhelm come from?
That's our first point.

The second is: most Deep Learning knowledge can be acquired by learning how to build, train, and run image segmentation architectures. Many segmentation architectures are now repurposed for other Deep Learning and Computer Vision tasks.

There are ways to understand a Deep Learning architecture without starting from scratch every time.

Yes, when a new architecture comes out, you can watch videos, try tutorials, or spend hours reading the paper until you get it. And when I made the v1 of this course, that's what I was doing too...

But when you think about it, most research papers don't reinvent the wheel every time. They simply combine existing architectures and "blocks" in a better, more efficient way.

For example:

If you look at the DeepLabv3+ architecture, which is a really good architecture for image segmentation that we'll study in the course, you'll find that it's a combination of 3 blocks: Auto-Encoders, Spatial Pyramid Pooling, and Atrous Convolution modules.
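To make the "blocks" idea concrete, here is a minimal sketch of how such a combination can look in PyTorch. This is not the DeepLabv3+ implementation from the course; the module names, channel sizes, and dilation rates below are made up for illustration, and only the composition pattern matters:

```python
import torch
import torch.nn as nn

class AtrousConv(nn.Module):
    """One atrous (dilated) 3x3 convolution: same output size, wider field of view."""
    def __init__(self, in_ch, out_ch, dilation):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3,
                              padding=dilation, dilation=dilation)

    def forward(self, x):
        return torch.relu(self.conv(x))

class SpatialPyramidPooling(nn.Module):
    """Several atrous branches at different rates, concatenated then projected."""
    def __init__(self, in_ch, out_ch, rates=(6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([AtrousConv(in_ch, out_ch, r) for r in rates])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

class TinyDeepLabLike(nn.Module):
    """An Auto-Encoder backbone with a pyramid pooling block in the middle."""
    def __init__(self, num_classes):
        super().__init__()
        self.encoder = nn.Sequential(                      # downsample by 4
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.pyramid = SpatialPyramidPooling(64, 64)
        self.decoder = nn.Sequential(                      # upsample back to input size
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(64, num_classes, kernel_size=1))

    def forward(self, x):
        return self.decoder(self.pyramid(self.encoder(x)))
```

Swap the pyramid block for something else and you get a different paper; that is the whole point of learning blocks rather than architectures.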

And this is how it works for nearly all models!

These 3 "blocks" may scare you right now, but if you start analyzing image segmentation architectures, you'll find that they're oddly common!

For example, the Auto-Encoder is present in nearly all image segmentation architectures.

And this is how most Deep Learning architectures work:

They take several existing blocks, sometimes modify them a bit, and combine them into something new!

Think of it like a game of LEGO.
You have many blocks at first, but if you know how to assemble them, you can build just about any set, like Hogwarts or an AT-AT!
Expert LEGO builders don't learn sets, they learn the game!

Deep Learning is like a game of LEGO. Once you know how to assemble the blocks, you can build just about any set!



This is why, just like with LEGO, rather than trying to learn every new architecture, you should learn how to engineer the most used "blocks" in Deep Learning.

You did it with activation functions.
You learned how the most used activation functions worked, and you now understand the part of any paper that talks about "ReLU" or "Sigmoid".

You did it with Convolutions too.
If you've learned all about Convolutions, you've learned about "strides" and "pooling", maybe even "residual blocks", and you now understand the part of any paper that talks about stacking layers of convolutions with a stride of 2.

It's time to do it at a larger scale, with blocks.

If you understand how to combine the blocks, you can understand any new architecture that uses these blocks much more easily.


Rather than learning the entire paper every time, deconstruct it, and learn what makes up 80% of the paper!

You can be on your way to building perception systems for autonomous robots, designing cancer detection algorithms, assisting the army with aerial segmentation tasks, or even building your own background removal software.

Deep Learning has evolved a lot over the years. In fact, since I released the v1 of this course, an incredible number of new architectures have been published.

But it's always about assembling blocks.

So let's go through the course curriculum!

INTRODUCING...

LEARN IMAGE SEGMENTATION: Modern Deep Learning Techniques for Computer Vision Experts

Learn Image Segmentation by developing an extensive understanding of Deep Learning blocks.





MODULE 1

Intro to Image Segmentation

In the first module, we'll begin by understanding how to use image segmentation in self-driving cars, and we'll then learn about Auto-Encoders, the core architecture used in image segmentation.

You'll learn:

  • How I would explain Convolutions to a new Deep Learning Engineer (this course has LOTS of refreshers on Convolutions, so you're not jumping into deep water)
  • The main types of advanced Convolutions you should know, including 2D Convolutions, Atrous Convolutions, Deconvolutions, Transposed Convolutions, Dilated Convolutions, and more...
  • 5 "Non-Neural Networks" Ways to do image segmentation (a very good way to understand the future is to understand the past)
  • A simple formula to calculate the output shape of a convolutional layer, no matter the stride, padding, or customization added to it (a minimal sketch follows this list)
  • How pixelwise classification creates a segmentation mask for 1, 2, 19, or 80 classes.
  • When to use Residual Connections and when not to use them
  • The only type of layer you should ever use as the final layer of your image segmentation architecture
  • The difference between 1x1 Convolutions and Pooling Modules (often, these 2 ideas are used one after the other, we'll learn how and why)
  • What the "latent space representation" of a network is, and what you can do with it
  • 5 Ways to "Upsample" a compressed representation to the shape of your input image (including the most used nowadays, bilinear upsampling, but also max unpooling and transposed convolutions)
  • The loss and metrics of image segmentation networks (this was a requested update from the v1 of the course, and it's now available: we'll tackle the IoU metric, as well as popular losses such as Cross-Entropy, Focal Loss, and Dice Loss)
  • A 4-Step Technique to drive a car autonomously using only an image segmentation mask as input.
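About the output-shape formula mentioned above: here it is as a tiny, self-contained sketch (the numbers in the example are mine, not taken from the course):

```python
import math

def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output spatial size of a convolutional layer along one dimension."""
    return math.floor((size + 2 * padding - dilation * (kernel - 1) - 1) / stride) + 1

# Example: a 224-pixel-wide image through a 3x3 convolution, stride 2, padding 1:
print(conv_output_size(224, kernel=3, stride=2, padding=1))  # -> 112
```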
Wait a minute.
Let's pause on that last bullet: driving a car using only a segmentation mask.

Probably 100% of the teachers out there teaching image segmentation will give the example of "self-driving cars".  

But does anyone talk about what to do with a segmentation mask?

This question is much harder to answer, because it demands an understanding of the main system you're using image segmentation for. This is why we'll see how to use image segmentation masks in the context of self-driving cars. 

We'll see how to use it in the Sensor Fusion module, but we'll also study novel architectures such as Bird's-Eye-View Segmentation.

With that, the first module also has:
  • 19 types of outputs an Image Segmentation algorithm can give you in the context of autonomous driving
  • 3 examples of Image Segmentation algorithms implemented by cutting-edge startups
  • The difference between an Encoder and a Decoder, and how to repurpose an Encoder for different use cases
  • An exclusive look into an Image Segmentation Engineer job offer (we'll look at a job offer in the medical field, and I'll give you my input on what I would do before applying to it)
  • and many more...

MODULE 2

Modern Deep Learning

The second module is purely about "architecture". Once we understand how autoencoders can recreate images and classify every single pixel, we'll implement our first Auto-Encoder.

What you'll learn:

  • Inside a UNet Architecture — the pillar of Deep Learning for image segmentation networks
  • How UNet, FCN, PSPNet, and DeepLab architectures work (we'll break down the papers of the most common architectures, and see which "block" we can extract from them)
  • How Pyramid Pooling, Pyramid Parsing, and Pyramid Feature Learning work (and how to implement them)
  • What Conditional Random Fields are (an idea from a DeepLab paper), and where to learn about them
  • The difference between DeepLabv3 and DeepLabv3+, and the "block" that makes DeepLabv3+ state-of-the-art.
  • Why most label images from the datasets you collect appear "all black", and how to map them back to good-looking colors.
  • How to implement the Double Convolution Block, the essence of a UNet Architecture (see the sketch after this list)
  • The "All Natural" implementation of an Auto-Encoder (we'll first build the encoder, then the decoder, and finally add "residual connections" between the two)
  • How to train and run a UNet architecture with PyTorch
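To give you a taste of what implementing a "block" looks like, here is one common way to write the Double Convolution Block (a minimal sketch; the course version may differ in details such as normalization or padding):

```python
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two 3x3 convolutions, each followed by BatchNorm and ReLU.
    This pattern is repeated at every level of a UNet."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)
```

Stack a few of these with downsampling on the way in, upsampling on the way out, and skip connections between the two, and you have a UNet.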

Let's take a break.
Although this module explores lots of architectures, the first implementation you'll do is the UNet.

You've probably heard of UNet before. It's a neural network with a U shape, popularized around 2015, when Deep Learning segmentation first took off. If you understand the building blocks of this network, you have a foundation to tackle the rest, and this is what we'll do.

You'll actually build your own UNet Architecture, and train it for a drivable area segmentation project, just like this one:


Module 2 Included Bonus:

The PyTorch Cookbook!

Module 2 is about implementing a UNet with PyTorch. But you may not be used to PyTorch. And yes, it can be a sticking point!

If you'd like to learn the fundamentals of PyTorch...
... you'll get access to this bonus 15-page PDF I created just for the occasion, covering the 3 most important PyTorch concepts to get started:
  • How to clear up the confusion around Datasets & DataLoaders
  • The PyTorch way to define a model, and why it can actually be easier to use than Keras (we'll see how to use the __init__() and forward() functions)
  • What to know before training and running a model with PyTorch.
Throughout the book, you'll get access to "minimal code" snippets of the things you must include at every step, in the spirit of the sketch below.
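To set expectations, here is the kind of thing I mean: a toy model with __init__() and forward(), fed by a Dataset and a DataLoader, trained on dummy data. This is my own illustrative sketch, not an excerpt from the PDF:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# A model is a class with __init__() (declare the layers) and forward() (use them).
class TinySegmenter(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.conv = nn.Conv2d(3, num_classes, kernel_size=1)

    def forward(self, x):
        return self.conv(x)

# Dummy data: 8 RGB images of 64x64 with per-pixel integer labels.
images = torch.randn(8, 3, 64, 64)
masks = torch.randint(0, 2, (8, 64, 64))
loader = DataLoader(TensorDataset(images, masks), batch_size=4, shuffle=True)

model = TinySegmenter()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for x, y in loader:              # one pass over the dummy data
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```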

After module 2 is done, comes the cutting-edge...

MODULE 3

Cutting-Edge Segmentation

This third module tackles entirely new topics and focuses on "blocks".
Once you know how segmentation works, and after you've built your first auto-encoder, you'll implement more advanced modules!

You'll learn:

  • How PSPNet Works, and how to leverage pyramid networks with AutoEncoders
  • The difference between scene parsing and drivable area detection (and how to get rid of "bully researchers" who intimidate engineers with fancy words)
  • What happens if you combine a UNet and a PSPNet architecture
  • How to code a Pyramid Pooling Module and "insert" it into any architecture (a minimal sketch follows below)
  • 2 Deep Learning "Blocks" Playgrounds to experiment with the most popular blocks and shapes (we'll see Atrous Convolutional Blocks, Pyramid Pooling Modules, and more...)
  • Why some networks use different architectures when training than when running, and how to implement your own "auxiliary branches" to make training better.
  • PSPNet Project — Build and Train a PSPNet algorithm from scratch
  • DeepLab Project — Build and Train a DeepLab algorithm from scratch
The power of this module is that you won't build 1 or 2, but 3 complete architectures in total! And by coding 3 different algorithms, you'll see some modules repeated, and develop a good understanding of how to copy and paste blocks from one project to another to build just about any architecture.
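As a preview of the Pyramid Pooling bullet above, here is a minimal sketch of a PSPNet-style Pyramid Pooling Module (the bin sizes are the ones commonly used, but the course implementation may differ in details):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPoolingModule(nn.Module):
    """PSPNet-style pyramid pooling: pool the feature map at several grid sizes,
    project each with a 1x1 conv, upsample back, and concatenate with the input."""
    def __init__(self, in_channels, bin_sizes=(1, 2, 3, 6)):
        super().__init__()
        branch_channels = in_channels // len(bin_sizes)
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(size),
                nn.Conv2d(in_channels, branch_channels, kernel_size=1, bias=False),
                nn.ReLU(inplace=True),
            )
            for size in bin_sizes
        ])

    def forward(self, x):
        h, w = x.shape[2:]
        pooled = [
            F.interpolate(branch(x), size=(h, w), mode="bilinear", align_corners=False)
            for branch in self.branches
        ]
        # Output has roughly twice the input channels (exactly twice when divisible).
        return torch.cat([x] + pooled, dim=1)
```

Because it takes a feature map in and returns a (wider) feature map out, you can drop it between almost any encoder and decoder, which is exactly the "insert it into any architecture" idea.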

And this isn't over yet!
Right now, let's answer some questions about these 3 modules, and right after, you'll see a new "lost" module that is even more cutting-edge...

AN EDGENEER'S TESTIMONIAL

"I was able to apply this to my own work, building autonomous golf cars in Mauritius"

"I am Yudhisteer from Mauritius and I have a background in Industrial engineering. I wanted to know more about Self-Driving cars and was searching for courses. That's when I discovered Jeremy's courses. Someone posted a video of an Image Segmentation algo on roads in Paris and I was completely baffled by it. It was based on a paper dated only a few years back and it was impossible to find a tutorial online to explain how to apply it. I enrolled on that day itself!

The good news is that I am not only taking Jeremy's courses for knowledge but I am also able to apply it in my work. Working in an industry with a lot of golf-cars around, I thought why not automate them.  With Jeremy's guidance, I successfully taught my golf-car its driveable lane and the adjacent one. Truth be told, I planned on building the first autonomous car in Mauritius and perhaps in South Africa. Years ago this idea would have seemed far-fetched but now, I believe it is just some clicks away."

Frequently asked questions

What are the prerequisites to follow the course?

I like to think of this course as the most accessible Computer Vision course on my platform. So if you understand backpropagation, if you're good with Python, and if you've already done other projects such as image classification, you're good to go!

How long is the course?

All my courses are developed to be as short as possible. So is this one! I don't want you hanging in there for 80 hours getting lost in blocks. So we'll cut down to the essentials.
  • If you already know how image segmentation works, but need to learn more about Deep Learning blocks and advanced implementations, it should take you 3-5 hours
  • If you have never done any image segmentation before and have no clue what an autoencoder is, expect 8-10 hours before you're confident with it.

What type of job can I get with this?

In the course, we spend some time studying a job offer for a Deep Learning Researcher position at a healthcare startup in Paris. In this job, you'd build novel architectures to find fractures in X-rays using Deep Learning. Cool job, right?

There are many other fields that use image segmentation, but you can consider that most Computer Vision jobs will be interested in this skill. And as listed above, image segmentation is something you can take to healthcare, construction, the army, or even consulting for big groups.

Who is this course for and not for?

Well, first, it's definitely not for you if you're not already a bit used to Deep Learning. 

❌ If you have never trained an image classification network, or even object detection, you should do that first.

❌ Then, if you absolutely want to learn one architecture, this course won't be for you. It's all about combining multiple blocks to build an infinite number of architectures.

✅ Put differently, this course is for those who want to learn about the game of Deep Learning, rather than one single algorithm.

✅ It's for those who are motivated to build cutting-edge segmentation skills that can be applied in multiple industries.

✅ And it's for those who want to understand how to use segmentation in the self-driving car space. For those who wonder "what's after the segmentation output?"

If you're in any one of these 3 cases, then it is for you.

How far does the course get, technically?

I'm glad you asked.
The course you've just seen is the "core" version of Image Segmentation. But I wanted those who really want to master the latest research papers to have a module on a new type of block named Transformers.

But let's face it. Transformers are such a new and massive topic that I can't explain them as one part of a course. Not even as one big module. They need a course of their own.

This is why I built another course, and I'm plugging it as an exploration (DLC) right after this segmentation course.

It has been built to be followed right after the image segmentation course. Let's take a look.

DLC

THE SEGFORMERS EXPLORATION

A Breakthrough Dive into the world of Attention, Transformer Networks, and Vision Transformers applied to Image Segmentation.

This DLC is the first course that teaches "The Attention Mechanism" to Computer Vision people

In recent years, a new Deep Learning block called the Transformer completely revolutionized the world of NLP and Computer Vision. And that always intrigued me. People even say it could completely replace CNNs!

What's included:
  • An introduction to the Attention Mechanism and Spatial Transformers: 2 of the biggest revolutions in Computer Vision...
  • What an Attention Mask is, and the main difference with Convolutional Feature Maps (this could change Deep Learning forever)
  • How Transformers work, and the 3 main differences between Transformers in NLP and Transformers in Computer Vision
  • A step-by-step walkthrough of Queries, Keys, and Values + a workshop example to visualize them (a minimal sketch follows this list)!
  • 5 Common Questions about Attention and Transformers visually answered (this will explain self-attention, cross-attention, masked-attention, and 2 other questions)
  • How to code a Multi-Head Attention Network and Visualize what each head is focusing on (you got it, we'll be very, very visual in this course, so you will get to see all Attention heads at every stage of the network)
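If Query / Keys / Values still sounds abstract, here is the core computation in a few lines (a toy self-attention sketch, not the course workshop code):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value):
    """Each query scores every key, the scores become weights through a softmax,
    and the output is a weighted sum of the values."""
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / d_k ** 0.5   # (..., n_queries, n_keys)
    weights = F.softmax(scores, dim=-1)                    # the attention weights we visualize
    return weights @ value, weights

# Toy example: 4 tokens, each embedded in 8 dimensions, attending to themselves.
tokens = torch.randn(1, 4, 8)
out, attn = scaled_dot_product_attention(tokens, tokens, tokens)
print(out.shape, attn.shape)  # torch.Size([1, 4, 8]) torch.Size([1, 4, 4])
```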

But it doesn't stop there, we'll also learn:
  • What is a Segformer Network, and a breakdown of the Encoder and Decoder parts (with a research paper breakdown)
  • A detailed explanation of the main blocks used in SegFormers, including Efficient Self-Attention, Overlap Patch Embeddings, Mix FeedForward Networks, and more (see the sketch after this list)...
  • Research Paper Breakdown: We'll go over 3 Research Papers: Attention is All You Need (Transformers), An Image is Worth 16x16 Words (Vision Transformers), and SegFormers.
  • How to stack several Transformer Blocks together (and the difference between blocks and stages in Recurrence Learning)
  • A "from scratch" workshop on Segformers to solve the full-frame image segmentation problem.
  • Drivable Area Segformentation: The SegFormer Workshop has been adapted to the Drivable Area Detection workshop of the Image Segmentation course, and you'll get access to the code to compare the 2 problems.
  • and many more...
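And since the Overlap Patch Embedding is the first block an image meets in a SegFormer, here is a minimal sketch of the idea. The sizes are roughly those of the paper's first stage, but treat this as an illustration rather than the course code:

```python
import torch
import torch.nn as nn

class OverlapPatchEmbedding(nn.Module):
    """SegFormer-style patch embedding: a strided convolution whose kernel is larger
    than its stride, so neighbouring patches overlap (unlike ViT's 16x16 patches)."""
    def __init__(self, in_channels=3, embed_dim=64, patch_size=7, stride=4):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, embed_dim, kernel_size=patch_size,
                              stride=stride, padding=patch_size // 2)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, x):
        x = self.proj(x)                      # (B, embed_dim, H/stride, W/stride)
        b, c, h, w = x.shape
        x = x.flatten(2).transpose(1, 2)      # (B, H*W, embed_dim): a sequence of tokens
        return self.norm(x), (h, w)

# A 512x512 image becomes a sequence of 128*128 = 16384 overlapping-patch tokens.
tokens, (h, w) = OverlapPatchEmbedding()(torch.randn(1, 3, 512, 512))
print(tokens.shape)  # torch.Size([1, 16384, 64])
```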

PICK YOUR ADVENTURE

What will your Image Segmentation Journey look like?

*For all courses, 2-installment payment plans are available at no extra cost

MODERN

IMAGE SEGMENTATION

✔ LEARN IMAGE SEGMENTATION

— 
299€

BLEEDING-EDGE

SEGFORMERS EDITION

✔️ IMAGE SEGMENTATION (299€)
✔️ SEGFORMERS DLC (199€)
— 
399€

Walter Mayor, Image Segmentation Edgeneer

"The price could have stopped me from buying the course, but it didn’t! I got in and found a good content that was well designed."

I like how Jeremy explains some concepts. The courses explain the content well, and it's not easy to find it on other web pages. I also like the pipeline of courses, and that Jeremy creates courses with top industry concepts.

Yes, the price can be high, but if you have the money, the course is well explained and it has good content (taking into account that I am from South America).

Something to add: I like how Jeremy creates the stories and connects them with the concepts.

Jeremy just creates top content all year round!

Kdahan, Image Segmentation Edgeneer

"Better than Udacity, Coursera, and edX."

"I currently engaged in image segmentation and SegTransformer course.  What can I say about your course?  It’s just an awesome coursen, better than udacity, Coursera and edX.

Before taking the course, I had some doubts about whether I could make the career transition from automation to computer vision.

However, after the knowledge from your course, I have gained great confidence in the field of computer vision."

AN EDGENEER'S TESTIMONIAL

"I followed the Segmentation course. It's well structured, and we have every resource necessary to get results and even adapt it to other problems."

Ton Quang Toai, grandfather of an Image Segmentation Edgeneer

"He completed the course and got an internship at Bosch company."

"After graduating from college with a data science degree, he [my grandson] couldn't find a job. So I bought your courses for him to learn to further improve his ability. He completed "IMAGE SEGMENTATION" course and then he got an internship at Bosch company. Thanks to the " IMAGE SEGMENTATION" course, he is doing well on a project about segmenting photos of company equipment."

PICK YOUR ADVENTURE

Enroll in Image Segmentation v2

*2-installment payment plans are available at no extra cost for all plans

MODERN

IMAGE SEGMENTATION

✔️ IMAGE SEGMENTATION



299€

BLEEDING-EDGE

SEGFORMERS EDITION

✔️ IMAGE SEGMENTATION (299€)
✔️ SEGFORMERS DLC (199€)

399€