Convolutional Neural Networks (CNNs) are widely used in Machine Learning (ML) systems. While deep CNNs (utilizing many "neuron layers") are very effective at extracting features from raw pixel values and have achieved impressive performance on classification and segmentation tasks in computer vision, they suffer from problems that limit their accuracy and speed.
Deep CNNs are subject to a number of problems, including loss of learned features between layers and the well-known "Vanishing Gradient Problem", which makes training the ML system difficult, if not impossible.
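The vanishing-gradient problem mentioned above can be illustrated with a short, self-contained sketch. It is not part of the invention; it simply shows why gradients shrink with depth: the derivative of the sigmoid activation never exceeds 0.25, and the chain rule multiplies one such factor per layer, so the gradient reaching early layers decays geometrically.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x = 0

def gradient_magnitude(depth, pre_activation=0.0):
    """Product of per-layer sigmoid derivatives, as given by the chain rule."""
    grad = 1.0
    for _ in range(depth):
        grad *= sigmoid_derivative(pre_activation)
    return grad

for depth in (2, 10, 30):
    print(depth, gradient_magnitude(depth))
```

Even in this best case (pre-activation of 0, where the derivative is largest), a 30-layer stack attenuates the gradient by a factor of roughly 0.25^30, which is far below machine-meaningful magnitudes for learning.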
Description of the invention
Deploying a novel architecture, researchers at the University of Waterloo have designed a unique CNN that retains rich information fused in its hidden layers, greatly alleviating the vanishing-gradient problem while avoiding the risk of overfitting. Due to its unique construction, the new CNN's activation function can approximate very complex functions. The new architecture, together with its uniquely designed activation function, enables the network to outperform the state of the art while requiring far fewer parameters and a "shallower" structure (fewer layers).
Due to its unique architecture and activation function, the new deep CNN requires an order of magnitude fewer training parameters and can be implemented with a shallower structure. As a result, the CNN is fast and requires much less computational power while delivering the same performance as the state of the art.
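The document does not disclose how the new CNN retains information across its hidden layers, so the following is only an illustrative sketch of one generic mechanism with a similar effect: a skip (residual) connection, y = x + f(x). Because the identity path contributes a derivative of 1, the gradient through a deep stack no longer collapses toward zero. This is a hypothetical example, not the claimed architecture.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def plain_layer_grad(x):
    # dy/dx for y = sigmoid(x): at most 0.25
    s = sigmoid(x)
    return s * (1.0 - s)

def skip_layer_grad(x):
    # dy/dx for y = x + sigmoid(x): the identity path adds 1
    return 1.0 + plain_layer_grad(x)

def grad_through_depth(grad_fn, depth, x=0.0):
    """Chain-rule product of per-layer derivatives across `depth` layers."""
    g = 1.0
    for _ in range(depth):
        g *= grad_fn(x)
    return g

print(grad_through_depth(plain_layer_grad, 30))  # vanishes toward zero
print(grad_through_depth(skip_layer_grad, 30))   # stays at or above 1
```

The contrast between the two printed values shows why architectures that preserve a direct information path between layers can be trained at depths where a plain stack cannot.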
Because the new CNN requires far fewer parameters and is lightweight, it can be used in:
- Real time video/image classification
- Autonomous driving