# AI News, BOOK REVIEW: How to train your Deep Neural Network

- On 4 June 2018

## How to train your Deep Neural Network

Certain practices are highly recommended in Deep Learning for training Deep Neural Networks efficiently.

In this post, I will cover a few of the most commonly used practices, ranging from the importance of quality training data and the choice of hyperparameters to more general tips for faster prototyping of DNNs.

Most of these practices are validated by research in academia and industry, and are presented with mathematical and experimental proofs in research papers like Efficient BackProp (Yann LeCun et al.) and Practical Recommendations for Gradient-Based Training of Deep Architectures (Yoshua Bengio).

For a more in-depth understanding, I highly recommend going through the research papers mentioned above and the references provided at the end.

But it’s not completely old school to say that “given the right type of data, a fairly simple model will provide better and faster results than a complex DNN” (although this might have exceptions).

A better alternative to the sigmoid is the tanh function: mathematically, tanh is just a rescaled and shifted sigmoid, tanh(x) = 2*sigmoid(2x) - 1.
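The relationship between tanh and the sigmoid can be checked numerically. The sketch below uses only the standard library; note the factor of 2 inside the sigmoid, tanh(x) = 2*sigmoid(2x) - 1. The helper names `sigmoid` and `tanh_via_sigmoid` are illustrative and not from the original post.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_via_sigmoid(x: float) -> float:
    """tanh written as a rescaled, shifted sigmoid: 2 * sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    # The two values should agree up to floating-point rounding.
    print(f"x={x:+.1f}  math.tanh={math.tanh(x):+.6f}  "
          f"via sigmoid={tanh_via_sigmoid(x):+.6f}")
```

Unlike the sigmoid, whose outputs lie in (0, 1), tanh outputs lie in (-1, 1) and are centered on zero.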

On the other hand, keeping the number of hidden units smaller than the optimal number raises the chances of underfitting the model.

Also, when employing unsupervised pre-trained representations (described in later sections), the optimal number of hidden units is generally kept even larger.

With more hidden units, the model has the flexibility it needs to filter the most appropriate information out of these pre-trained representations.

Furthermore, when using sigmoid activation functions, if weights are initialized to very large numbers, the sigmoid will saturate (in its tail regions), resulting in dead neurons.
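To see why saturation matters, it helps to look at the sigmoid's derivative, s(z) * (1 - s(z)), for growing pre-activations z. This is a minimal sketch with illustrative helper names; large weights are assumed to push z into the tail regions.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid; fine for the moderate z values used here."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z: float) -> float:
    """Derivative of the sigmoid with respect to its input."""
    s = sigmoid(z)
    return s * (1.0 - s)


for z in (0.0, 2.0, 5.0, 10.0, 20.0):
    # The gradient peaks at z = 0 and shrinks rapidly in the tails.
    print(f"z={z:5.1f}  sigmoid={sigmoid(z):.6f}  gradient={sigmoid_grad(z):.3e}")
```

Once the gradient is this close to zero, backpropagation passes almost no signal through the unit, so its incoming weights barely change.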

- weights drawn from Uniform(-r, r), where r = sqrt(6/(fan_in+fan_out)) for tanh activations and r = 4*sqrt(6/(fan_in+fan_out)) for sigmoid activations; fan_in is the size of the previous layer and fan_out is the size of the next layer.
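The uniform initialization rule can be sketched in a few lines of plain Python, with r = sqrt(6/(fan_in+fan_out)) for tanh and four times that for sigmoid. The function name `init_uniform`, its signature, and the list-of-lists weight layout are assumptions made for illustration; a real project would normally use its framework's initializer.

```python
import math
import random


def init_uniform(fan_in, fan_out, activation="tanh", rng=None):
    """Return (weights, r) with weights drawn from Uniform(-r, r).

    weights is a fan_in x fan_out list of lists (hypothetical layout).
    """
    rng = rng or random.Random()
    r = math.sqrt(6.0 / (fan_in + fan_out))
    if activation == "sigmoid":
        r *= 4.0  # sigmoid units use a 4x wider range
    elif activation != "tanh":
        raise ValueError(f"unsupported activation: {activation}")
    weights = [[rng.uniform(-r, r) for _ in range(fan_out)]
               for _ in range(fan_in)]
    return weights, r


weights, r = init_uniform(300, 100, "tanh", random.Random(0))
print(f"r = {r:.4f}, shape = {len(weights)} x {len(weights[0])}")
```

Wider layers get a smaller r, which keeps the initial pre-activations away from the saturated tails.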

Set the learning rate too small and your model might take ages to converge; make it too large and, within the first few training examples, your loss might shoot up sky-high.
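The effect is easy to reproduce on a toy problem. The sketch below runs plain gradient descent on the one-dimensional loss f(w) = w^2, which is a stand-in chosen for illustration and not a model from the post; the three learning rates are likewise arbitrary examples.

```python
def final_loss(lr: float, steps: int = 50, w: float = 1.0) -> float:
    """Run gradient descent on f(w) = w**2 and return the final loss."""
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w -= lr * grad      # plain gradient descent update
    return w * w


for lr in (0.001, 0.1, 1.1):
    print(f"lr={lr:<6} loss after 50 steps = {final_loss(lr):.3e}")
```

With the smallest rate the loss has barely moved after 50 steps, with the middle rate it is essentially zero, and with the largest rate each update overshoots the minimum and the loss grows without bound.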

In contrast to a fixed learning rate, gradually decreasing the learning rate after each epoch or after a few thousand examples is another option.

For example, the learning rate can be halved after each epoch; strategies of this kind were quite common a few years back.
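A halving schedule is a one-liner. In this sketch the function name, the zero-based epoch index, and the initial rate of 0.1 are all assumptions for illustration.

```python
def halved_learning_rate(initial_lr: float, epoch: int) -> float:
    """Learning rate for a zero-based epoch when the rate halves every epoch."""
    return initial_lr * 0.5 ** epoch


for epoch in range(5):
    print(f"epoch {epoch}: lr = {halved_learning_rate(0.1, epoch):.5f}")
```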

Fortunately, we now have better momentum-based methods that change the learning rate based on the curvature of the error function.

Methods like Adagrad or Adam effectively save us from manually choosing an initial learning rate, and given enough time the model will start to converge quite smoothly (of course, selecting a good initial rate will still help further).

Good old Stochastic Gradient Descent might not be as efficient for DNNs (again, not a stringent rule), and lately there has been a lot of research into more flexible optimization algorithms.

In addition to providing adaptive learning rates, these sophisticated methods also use different rates for different model parameters, which generally results in smoother convergence.
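Per-parameter rates can be illustrated with Adagrad, which divides each parameter's step by the square root of its accumulated squared gradients. The sketch below shows one common form of the update; the function name, the placement of `eps`, and the toy loss 10*w0^2 + 0.1*w1^2 are assumptions for illustration.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """Apply one in-place Adagrad update to params.

    accum holds the running sum of squared gradients per parameter.
    """
    for i, g in enumerate(grads):
        accum[i] += g * g
        # Effective rate lr / sqrt(accum[i]) differs for every parameter.
        params[i] -= lr * g / (math.sqrt(accum[i]) + eps)


# Toy loss 10*w0**2 + 0.1*w1**2: the two gradients differ by a factor of 100.
params = [1.0, 1.0]
accum = [0.0, 0.0]
grads = [20.0 * params[0], 0.2 * params[1]]
adagrad_step(params, grads, accum)
print(params)
```

Although the two gradients differ by a factor of 100, both parameters move by roughly `lr` on the first step, because each gradient is normalized by its own history.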

The major objective of training a model is to learn appropriate parameters that result in an optimal mapping from inputs to outputs.

With a stochastic learning approach, the weights are updated after each training sample, which introduces noise into the gradients (hence the word ‘stochastic’).

Usually, the batch size is selected once you have already found the more important hyperparameters (by manual search or random search).
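The batch size simply controls how many samples contribute to each update. The generator below is a minimal sketch with a hypothetical name and signature: `batch_size=1` corresponds to the purely stochastic case, and `batch_size=len(samples)` to full-batch training.

```python
import random


def minibatches(samples, batch_size, rng=None):
    """Yield shuffled mini-batches covering every sample exactly once."""
    rng = rng or random.Random()
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [samples[i] for i in indices[start:start + batch_size]]


data = list(range(10))
for batch in minibatches(data, batch_size=4, rng=random.Random(0)):
    # One weight update would be computed from each batch.
    print(batch)
```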

There is a simple strategy for this: just keep training your model for a fixed number of examples or epochs, say 20,000 examples or 1 epoch.

After each such set of examples, compare the test error with the train error; if the gap is decreasing, keep on training.
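This stopping rule can be sketched as a small check run after every chunk of training. The function names and the error values in the usage example are hypothetical; only the rule itself (continue while the gap between test and train error shrinks) comes from the text.

```python
def error_gap(train_error: float, test_error: float) -> float:
    """Absolute gap between test error and train error."""
    return abs(test_error - train_error)


def should_keep_training(history) -> bool:
    """history: list of (train_error, test_error), one pair per chunk.

    Returns True while the latest gap is smaller than the previous one.
    """
    if len(history) < 2:
        return True  # not enough measurements to compare yet
    previous = error_gap(*history[-2])
    latest = error_gap(*history[-1])
    return latest < previous


# Made-up error measurements, one pair per 20,000 examples.
history = []
for train_err, test_err in [(0.50, 0.70), (0.40, 0.50), (0.30, 0.50)]:
    history.append((train_err, test_err))
    print(len(history), should_keep_training(history))
```

In this made-up run the gap shrinks after the second chunk and widens after the third, so the check returns False at that point.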

To save yourself from bouts of hysteria in such situations (which might be quite justified ;)), always visualize the training process.

If you think you are as patient as a stone, you might try running a DNN on your laptop (which can’t even open 10 tabs in your Chrome browser) and wait for ages to get your results.

So, instead of taking weeks on a normal machine, these parallelization techniques will bring the training time down to days, if not hours.

- On 24 September 2021

**Lecture 6 | Training Neural Networks I**

In Lecture 6 we discuss many practical issues for training modern neural networks. We discuss different activation functions, the importance of data ...

**Image Detection with YOLO-v2 (pt.8) Custom Object Detection (Train our Model!)**

In this series we will explore the capabilities of YOLO for image detection in python! This video will look at - how to modify our tiny-yolo-voc.cfg model file ...

**How to Predict Stock Prices Easily - Intro to Deep Learning #7**

We're going to predict the closing price of the S&P 500 using a special type of recurrent neural network called an LSTM network. I'll explain why we use ...
