Skip to 0 minutes and 15 seconds Researchers found that deeper architectures usually achieve better performance, but if the architecture becomes too deep, the error rate increases. This is mostly due to the “vanishing gradient problem”, in which the gradients become too small to update the weights of the deep layers. In 2015, Microsoft researchers proposed the Residual Network, known as ResNet, to solve this problem. ResNet uses skip connections to avoid the vanishing gradient problem, and it won the ImageNet Challenge with a 152-layer architecture. However, the error rate of the 34-layer ResNet is close to that of the 152-layer version, and although researchers have pushed the depth limit to 1,000 layers, they did not see significant improvement.
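
To make the skip connection concrete, here is a minimal sketch of a basic residual block, assuming PyTorch (the framework and the class name ResidualBlock are illustrative choices, not part of the lecture). The block computes a small stack of convolutions F(x) and adds the input x back through the identity path, so gradients can flow around the convolutional layers during backpropagation.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: output = ReLU(F(x) + x), where x rides the skip connection."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                              # skip connection: keep the input unchanged
        out = self.relu(self.bn1(self.conv1(x)))  # first conv + batch norm + ReLU
        out = self.bn2(self.conv2(out))           # second conv + batch norm
        out = out + identity                      # add the input back before the final ReLU
        return self.relu(out)

# Quick shape check on a dummy feature map
x = torch.randn(1, 64, 56, 56)
print(ResidualBlock(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```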

Skip to 1 minute and 18 seconds Here is a side-by-side comparison of major CNN architectures, from the 7-layer LeNet to the 152-layer ResNet. Note that the 152-layer version is too long to be shown here, so the author selected the 34-layer version instead. Here is a summary table of the major CNN architectures, compiled by Vivienne Sze at MIT. We can see that the total number of parameters increased significantly from about 60k to around 138M, and then started to decrease. Recently, compact models such as MobileNet and SqueezeNet were proposed, which have far fewer parameters but maintain high accuracy.
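
To give a rough sense of the parameter trend described above, the following sketch counts the parameters of a few of these architectures using torchvision (assumed here purely for illustration; LeNet is omitted because torchvision does not ship it, and exact counts vary slightly by version).

```python
import torchvision.models as models

# Rough parameter counts for some of the architectures mentioned in the lecture.
architectures = [
    ("AlexNet", models.alexnet),
    ("VGG-16", models.vgg16),
    ("ResNet-34", models.resnet34),
    ("MobileNetV2", models.mobilenet_v2),
    ("SqueezeNet 1.1", models.squeezenet1_1),
]

for name, builder in architectures:
    model = builder()  # randomly initialized weights; only the shapes matter here
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name:>15}: {n_params / 1e6:.1f}M parameters")
```

On a recent torchvision release this prints roughly 61M for AlexNet, 138M for VGG-16, 22M for ResNet-34, 3.5M for MobileNetV2, and 1.2M for SqueezeNet 1.1, matching the rise-then-fall trend in the summary table.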

Recurrent Neural Networks (RNN)

Continuing from the previous step, Prof. Lai explains the concept of the Residual Neural Network (ResNet).

In 2015, Microsoft researchers proposed the Residual Network, known as ResNet, to solve a well-known issue, the “vanishing gradient problem”, in which the gradients become too small to update the weights of the deep layers.
