199€ or 2 × 99€

LEARN TRANSFORMERS: Foundation Skills For Intermediate Computer Vision Engineers

Understand the Attention Mechanism, Transformers, and Vision Transformers applied to Image Segmentation by going through an intuitive approach designed for Computer Vision Engineers.

Transformer Networks, Explained to Computer Vision brains! 🧠

Breakthrough Attention!

If you've been looking at all the recent models in Deep Learning, you'll notice one "Deep Learning Block" used everywhere... Attention!

Attention, and especially its "Transformer" variant, has driven breakthroughs across Deep Learning: BERT in Natural Language Processing, DALL-E, the image generation algorithm that created the vignette of this course, and self-driving car models such as Tesla's HydraNets and Occupancy Networks.

It's even in the Tesla Optimus, the Humanoid Robot introduced at Tesla AI Day 2022:


Attention is so important that, as I once detailed in an email, it's the 3rd most cited concept in Deep Learning history, with 18k citations, just behind regularization (27k papers) and Activation Functions (24k papers).
It's so cited that Scaled Dot-Product Attention (the basic Attention module) appears in more papers than the Adam Optimizer, ReLU, or even Sigmoid!

So what is Attention, and what are Transformers?

Attention is the simple idea that we humans don't scan every single pixel in our field of view, the way a convolution would; instead, we focus on specific things.

These things can be objects, moving objects, or anything dangerous coming at us.

Attention means focus. And Multi-Head Attention means focusing on several parts at once.

And this is how Neural Networks should learn! Not by analyzing everything, but by learning what to focus on, and what to ignore.
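To make "focus" concrete, here is a minimal sketch of Scaled Dot-Product Attention in NumPy. The function names, shapes, and toy data are illustrative assumptions for this page, not code from the course itself:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: weights along `axis` sum to 1
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
    Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # the "attention map": where to focus
    return weights @ V, weights          # weighted sum of values + the map

# Toy example: 2 queries attending over 3 image-patch embeddings
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out, attn = scaled_dot_product_attention(Q, K, V)
print(out.shape, attn.shape)  # (2, 4) (2, 3)
```

Each row of `attn` is exactly the "Attention Map" idea above: a set of weights, summing to 1, that says how much each query focuses on each part of the input.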

For example, take this video from self-driving car startup Wayve, where they build an "Attention Mask" that shows the vehicle's Attention Map switching with the traffic light:

The network focuses on the traffic light when it's red, and on the street as soon as it's green. This is Attention!

And this is what we'll learn to build in this course.

Attention Maps, but more importantly: Neural Networks that use the Attention Block to implement cutting-edge image segmentation architectures!

But have you ever tried to learn attention?

For example, do you recognize the following scenes?
  • "I've tried learning attention a few times already, but never really managed to commit."
  • "I've watched a few tutorials, but I still feel like I don't know how to include this as part of my work."
  • "Many of the breakthroughs in Deep Learning mention Transformers and Attention, but I still have no clue how to learn them easily."
Yes, Attention is hard to learn, and there is a good reason...

Most of the explanations available are targeted at NLP people, not us!

All the examples, analogies, and projects are designed for NLP applications like text translation, sentence prediction, and so on...

Transformers started with NLP, but they're now taking over the Computer Vision world.
And we're not NLP people, we are Computer Vision people.

We don't want to deal with sentences. Words converted to vectors. LSTMs and RNNs. We want images! Yes, images!

And it's even deeper than this:

We don't just want images.
If, like me, you started to learn Deep Learning with CNNs, your brain made its connections to learn.
For years.
Which means... you are hardwired to learn by looking at images!

Which is why, after months spent learning Attention and Transformers, and even hiring experts to understand how to build cutting-edge architectures that use Transformers for problems like object detection or image segmentation...
... I have compiled the first ever course on Transformers, but explained to Computer Vision people!

Which means that this course won't teach attention with sentences and LSTMs, but with images!

And with images and colors, your brain will make the connections easier.

So let's see the program:

What you'll learn:

Introduction to Attention

An introduction to the Attention Mechanism and Spatial Transformers: 2 of the biggest revolutions in Computer Vision...

Attention: Visually Explained

What is an Attention Mask, and the main difference with Convolutional Feature Maps (this can change Deep Learning forever)

Transformers

How Transformers work, and the 3 main differences between Transformers in NLP and Transformers in Computer Vision

Query, Keys, and Values

A step-by-step walkthrough of Queries, Keys, and Values + a Workshop Example to visualize them!

The Transformers Q&A

5 Common Questions about Attention and Transformers visually answered (this will explain self-attention, cross-attention, masked-attention, and 2 other questions)

Vision Transformers From Scratch

Get a 101 Introduction to Vision Transformers, implement your first Transformer network, and visualize Attention Maps using Queries

Deformable Attention

An intro to Deformable Attention, and a workshop to get you to implement your first advanced Transformer using Deformable Attention mechanisms

And many more, including:

How to visualize the attention maps of any image without any code; 8 Common Types of Attention used in the field; Why CNNs may survive the Attention revolution; and more...

Frequently asked questions

How long is the course and what do I get after it?

This course is a spinoff of the Segformer DLC from the image segmentation course. It's a short course teaching you the fundamentals of Transformer Networks.

If Transformers have always been difficult for you, if you naturally "don't get it", it can take some time to grasp the concepts.

After completing this course, you'll have caught up with the Transformer world, and you'll know more about Transformers than 90% of the Deep Learning population.

Should I take this course if I already have the Segformers DLC?

The Segformers course has 2 core modules:

  1. Intro to Attention & Transformers

  2. Segformers

In this course, we reused module 1 (the introduction) and added a project on general Vision Transformers. We then created a 2nd module on Efficient Attention and the implementation of Deformable Attention Transformers from scratch.

Therefore, since ~50% of the content is the same, you will find a 55% discount in the final section of the Segformers DLC.

Do I have lifetime access?

Yes, all courses come with lifetime access, and we make sure to always keep our courses up to date.

How is this course different from any other Transformer Course I can take?

This course is unique because it teaches Transformer Networks for Computer Vision brains. All our examples are Computer Vision based, and we don't have any NLP workshop.

Most other courses, for some reason, explain Transformers from an NLP perspective. They use Wikipedia corpora, text translations, and English-language examples, but rarely Computer Vision examples with images, where Queries, Keys, and Values are easily understandable.

Using a Computer Vision-only approach is important because this is how we learn best, especially when we started with Computer Vision. It makes for a more efficient understanding and quicker learning.

Pushkar Raj Singh, Computer Vision Researcher at Samsung

"I'm just about to finish the Segformers course. It has been amazing! "

Great content, to the point and very easy to understand.

I work as a Computer Vision Researcher at Samsung. Transformers have recently started becoming really important from an innovation perspective. Honestly, I was finding it so difficult to learn about transformers from all the NLP examples online.

Very grateful that you released the course! I've gotten a much deeper understanding since starting the course and I'm able to understand the recently published research papers concerning vision transformers much much better now!

Alessandro Lamberti, Machine Learning Engineer @ NTT DATA Italia

"Very very well done"

The segmentation + segformers courses are very very well done, I'm actually finishing up the latter! 

The pricing was originally an obstacle, but despite that, when the quality is high I don't mind paying more.

Now, as a result of buying the course, I got an actual understanding of the Attention mechanism and an overview of the Transformers world, including Segformers.

What I liked the most out of the course was the workshops! And then the drawings, you going over the different operations inside of the architectures.

I believe it's necessary for an actual understanding.

I highly recommend it if someone can afford it, and already has hands-on and theoretical knowledge/experience in Deep Learning.

Omar Abubakr, Mathematician | Machine Learning Research Scientist

"I'd really like to thank Mr. Jeremy for that amazing course and for making me understand attention and transformers."

I've really been trying to understand it for a long time, but it's hard and complex to understand. Thanks to Mr. Jeremy for his clear end-to-end explanation of the architecture and the paper, which I couldn't find anywhere online.

199€ or 2 × 99€
