• €199

SegFormers Exploration

  • Closed

Understand the Attention Mechanism, Transformers, and Vision Transformers applied to Image Segmentation by going through an intuitive approach designed for Computer Vision Engineers.

* This course is an add-on to the segmentation course and cannot be purchased independently. Make sure you go through the image segmentation course first, or pass the quiz on this page's Q&A to get access.

Transformer Networks, Explained to Computer Vision brains! 🧠

Breakthrough Attention!

If you've been looking at all the recent models in Deep Learning, you'll notice one "Deep Learning Block" used everywhere... Attention!

Attention, and especially its "Transformer" variant, has driven breakthroughs in Deep Learning: BERT in Natural Language Processing, DALL-E (the image generation algorithm that created the vignette of this course), and self-driving car models such as Tesla's HydraNets and Occupancy Networks.

It's even in the Tesla Optimus, the Humanoid Robot introduced at Tesla AI Day 2022:


Attention is so important that I listed in an email how it's the 3rd most cited concept in Deep Learning history, with 18k citations, just behind regularization (27k papers) and Activation Functions (24k papers).
It's so cited that Scaled Dot-Product Attention (the basic Attention module) is cited in more papers than the Adam Optimizer, ReLU, or even Sigmoid!

So what is Attention, and what are Transformers?

Attention is the simple idea that we humans don't scan every single pixel in our field of view, like a convolution would, but we focus on specific things.

These things can be objects, moving objects, or anything dangerous coming at us.

Attention means focus. And there is also Multi-Head Attention, which means focusing on several parts at once.

And this is how Neural Networks should learn! Not by analyzing everything, but by learning what to focus on, and what to ignore.
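To make the idea concrete, here is a minimal NumPy sketch of Scaled Dot-Product Attention, the basic Attention module mentioned above. For each query, it computes a weighted "focus" over all keys; the weights form the attention map. This is an illustrative sketch, not the course's implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Scores measure how much each query should "focus" on each key
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # the attention map
    return weights @ V, weights

# Toy example: 4 image patches, each an 8-dimensional embedding
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out, attn = scaled_dot_product_attention(Q, K, V)
# Each row of `attn` sums to 1: a distribution of "focus" over the patches
```

Each row of the attention map is a probability distribution, which is exactly why it can be visualized as a heatmap of where the network is looking.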

For example, take this video from the self-driving car startup Wayve, where they build an "Attention Mask" that shows the vehicle's Attention Map switching with the traffic light:

The network focuses on the traffic light when it's red, and on the street as soon as it's green. This is Attention!

And this is what we'll learn to build in this course.

Attention Maps, but more importantly: Neural Networks that use the Attention Block to implement cutting-edge image segmentation architectures!

But have you ever tried to learn attention?

For example, do you recognize the following scenarios?
  • "I've tried learning attention a few times already, but never really managed to commit".
  • "I've watched a few tutorials, but I still feel like I don't know how to include this as part of my work"
  • "Many of the breakthroughs in Deep Learning mention Transformers and Attention, but I still have no clue how to learn that easily"
Yes, Attention is hard to learn, and there is a good reason...

Most of the explanations available are targeted at NLP people, not us!

All the examples, analogies, and projects are designed for NLP applications like text translation, sentence prediction, and others...

Transformers started in NLP, but they're now taking over the Computer Vision world.
And we're not NLP people, we are Computer Vision people.

We don't want to deal with sentences. Words converted to vectors. LSTMs and RNNs. We want images! Yes, images!

And it's even deeper than this:

We don't just want images.
If, like me, you started to learn Deep Learning with CNNs, your brain made its connections to learn.
For years.
Which means... You are hardwired to learn by looking at images!

Which is why, after months spent learning Attention and Transformers, and even hiring experts to understand how to build cutting-edge architectures that use Transformers for problems like object detection or image segmentation...
... I have compiled the first ever course on Transformers, but explained to Computer Vision people!

Which means that this course won't teach attention with sentences and LSTMs, but with images!

And with images and colors, your brain will make the connections easier.

So let's see the program:

Part I: Attention & Transformer Networks

Introduction to Attention

An introduction to the Attention Mechanism and Spatial Transformers: 2 of the biggest revolutions in Computer Vision...

Attention: Visually Explained

What is an Attention Mask, and the main difference from Convolutional Feature Maps (this can change Deep Learning forever)

Transformers

How Transformers work, and the 3 main differences between Transformers in NLP and Transformers in Computer Vision

Query, Keys, and Values

A step-by-step walkthrough of Query, Keys, and Values + a Workshop Example to visualize the query, keys, and values!

The Transformers Q&A

5 Common Questions about Attention and Transformers visually answered (this will explain self-attention, cross-attention, masked-attention, and 2 other questions)

And many more, including:

How to visualize the attention maps of any image without any code; 8 Common Types of Attention used in the field; Why CNNs may survive the Attention revolution; and more...

Part II: Implementing a SegFormer

SegFormers

What is a SegFormer Network, and a breakdown of the Encoder and Decoder parts (with a walkthrough of the research paper)

SegFormer Blocks

A detailed explanation of the main blocks used in SegFormers, including Efficient Self-Attention, Overlap Patch Embeddings, Mix FeedForward Networks, and more...

Multi-Head Visualization

How to code a Multi-Head Attention Network and Visualize what each head is focusing on (you got it, we'll be very, very visual in this course, so you will get to see all Attention heads at every stage of the network)
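As a small taste of what this module builds up to, here is a minimal NumPy sketch of Multi-Head Attention: the embedding dimension is split across heads, each head computes its own attention map (which is what gets visualized), and the head outputs are concatenated back together. This is an illustrative sketch under simplified assumptions (no learned projection matrices), not the course's code:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(Q, K, V, num_heads):
    # Split the embedding dimension across heads, attend per head,
    # then concatenate the head outputs back together.
    n, d = Q.shape
    assert d % num_heads == 0
    d_h = d // num_heads
    outputs, maps = [], []
    for h in range(num_heads):
        sl = slice(h * d_h, (h + 1) * d_h)
        scores = Q[:, sl] @ K[:, sl].T / np.sqrt(d_h)
        w = softmax(scores)          # this head's attention map
        maps.append(w)
        outputs.append(w @ V[:, sl])
    return np.concatenate(outputs, axis=-1), maps

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 16))     # 6 patches, 16-dim embeddings
out, maps = multi_head_attention(x, x, x, num_heads=4)
# `maps` holds one 6x6 attention map per head: each can be plotted
# as a heatmap to see what that head is focusing on
```

Plotting each matrix in `maps` as a heatmap is exactly the kind of per-head visualization this module walks through.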

✨ Research Paper Breakdown

Bonus: We'll go over 3 Research Papers: Attention is All You Need (Transformers), An Image is Worth 16x16 Words (Vision Transformers), and SegFormers.

Block Assembling & Stages

How to stack several Transformer Blocks together (and the difference between blocks and stages in Recurrence Learning)

SegFormers From Scratch

A "from scratch" workshop on SegFormers to solve the full-frame image segmentation problem.

Drivable Area Segformentation

The SegFormer Workshop has been adapted to the Drivable Area Detection workshop of the Image Segmentation course, and you'll get access to the code to compare the 2 problems.

And many more, including:

How Concatenated Decoders work in SegFormers; the exact shape you need to have after a transformer network; a surprising library to reshape transformer blocks without NumPy; and more...

Pushkar Raj Singh, Computer Vision Researcher at Samsung

"I'm just about to finish the Segformers course. It has been amazing! "

Great content, to the point and very easy to understand.

I work as a Computer Vision Researcher at Samsung. Transformers have recently started becoming really important from an innovation perspective. Honestly, I was finding it so difficult to learn about transformers from all the NLP examples online.

Very grateful that you released the course! I've gotten a much deeper understanding since starting the course and I'm able to understand the recently published research papers concerning vision transformers much much better now!

Akshata Pore, MS in Robotics Engineering at Worcester Polytechnic Institute

"The design of videos or course content was very elegant"

"I wasn’t sure whether to enroll first because of the cost of the course. But I joined anyway, and I was able to get a brief understanding of transformers in a very simple language. It built up my base to learn more other state of art transformers architecture in future.

The videos were short and concise and informative. Each aspect of transformers was explained in very simple words such as keys, query and values.

The design of videos or course content was very elegant, it followed a structured flow of the content.

I would recommend this course because it paves a way to learn state of art models in computer vision"

Alessandro Lamberti, Machine Learning Engineer @ NTT DATA Italia

"Very very well done"

The segmentation + segformers courses are very very well done, I'm actually finishing up the latter! 

The pricing was originally an obstacle, but despite that, when the quality is high I don't mind paying more.

Now, as a result of buying the course, I got an actual understanding of the Attention mechanism and an overview of the Transformers world, including Segformers.

What I liked the most out of the course was the workshops! And then the drawings, you going over the different operations inside of the architectures.

I believe it's necessary for an actual understanding.

I highly recommend it if someone can afford it, and already has hands-on and theoretical knowledge/experience in Deep Learning.

Frequently asked questions

How long is the exploration and what do I get after it?

This exploration is the DLC of the image segmentation course. But make no mistake, it isn't short.

In fact, it's going to be as long as, if not longer than, the image segmentation course, which is estimated at between 4 and 8 hours.

The topics are all new, the implementation may be difficult; and even though everything has been made super easy and intuitive, it can take time for you to "get it".

After completing this course, you'll be a pioneer in the transformer world, and know more about Transformers than 90% of the Deep Learning population.

Can I take the course if I already have Image Segmentation skills, but not the course?

No.

Unless you're able to... solve a quiz!

We have developed a quiz, and if you score 80% or more, it will prove you have the skills required.
 
💯 Take the quiz here: https://segmentationquiz.thinkautonomous.ai
If you manage to solve it, the SegFormer Exploration doors will open just for you, and you'll have an express ticket to join it!

Do I have lifetime access?

Yes, all courses come with lifetime access, and we make sure to always keep our courses up to date.

Omar Abubakr, Mathematician | Machine Learning Research Scientist

"I'd really like to thank Mr. Jeremy for that amazing course and for making me understand attention and transformers."

I've really been trying to understand it for a long time, but it's hard and complex. Thanks to Mr. Jeremy and his clear, end-to-end explanation of the architecture and the paper, which I couldn't find anywhere else online.
