How To Debug Losses In Deep Learning

DRAFT

I was recently faced with the question of how to test whether a loss function is working correctly. The answer seems easy at first: just run it and see. But there are reasons why that won't work. First: if your model costs a lot of $$ and a LONG TIME to train, you can't afford to wait and spend until you are sure. Second: even if your data is small enough, you probably won't be able to run fast enough debugging iterations on any meaningful data. Third: this only tells you that your loss does not work, not WHY it does not work.

So, below is what I currently know about how to debug losses for deep learning; please feel free to email me with your thoughts and additions.

1. Correct numerical values

Typically, for losses, you have some external implementation of the loss available. If you don't have one, write one, and compare its output values to the values produced by your loss. This will not only show you whether you have implemented the loss as you expected, but also whether you might be connecting your inputs to your outputs in the wrong way.
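
For example, here is a minimal sketch of this check, assuming a PyTorch setup and using torch.nn.functional.cross_entropy as the external reference; my_cross_entropy is a hypothetical stand-in for the loss under test:

```python
import torch
import torch.nn.functional as F

def my_cross_entropy(logits, targets):
    # hypothetical loss under test: naive log-softmax + negative log-likelihood
    log_probs = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    return -log_probs[torch.arange(len(targets)), targets].mean()

torch.manual_seed(0)
logits = torch.randn(8, 5)            # batch of 8 samples, 5 classes
targets = torch.randint(0, 5, (8,))

mine = my_cross_entropy(logits, targets)
reference = F.cross_entropy(logits, targets)  # the external implementation

# the two values should agree up to floating-point tolerance
assert torch.allclose(mine, reference, atol=1e-6), (mine.item(), reference.item())
print(mine.item(), reference.item())
```

Running this on a few random batches, plus a couple of hand-computed edge cases, catches most implementation slips.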

2. Connectivity and graph
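
One concrete form of this check, assuming a PyTorch autograd setup: call backward() on the loss and verify that every parameter that is supposed to influence it actually receives a gradient; a parameter whose grad stays None was never connected to the loss in the graph. A minimal sketch with a toy model standing in for yours:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
x = torch.randn(16, 4)
y = torch.randn(16, 1)

loss = ((model(x) - y) ** 2).mean()   # stand-in for the loss under test
assert loss.requires_grad, "loss is detached from the computation graph"

loss.backward()
for name, p in model.named_parameters():
    # a None gradient means this parameter never reached the loss
    if p.grad is None:
        print(f"{name}: NOT connected to the loss")
    else:
        print(f"{name}: grad norm = {p.grad.norm().item():.4f}")
```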

3. The ultimate test: the one-sample overfitting test (suggested to me by Ahmed Saleh)
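
The idea: if the loss, model, and training loop are wired correctly, the model should be able to drive the loss to (near) zero on a single fixed sample. A minimal sketch, assuming a PyTorch setup, with a toy model and MSE standing in for the loss under test:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()                 # substitute the loss you are debugging

# one fixed sample: the model has more than enough capacity to memorize it
x = torch.randn(1, 4)
y = torch.randn(1, 1)

for step in range(500):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

# should be near zero; if it plateaus, the loss (or its gradients) is suspect
print(f"final loss: {loss.item():.6f}")
```

If the model cannot overfit even a single sample, more data or training time will not help; the problem is in the loss or the plumbing, not the data.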

Bonus: Gradient Vanishing
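
One way to make this visible, assuming a PyTorch setup: print per-layer gradient norms after a backward pass. The sketch below uses a deliberately deep stack of sigmoid layers, a classic vanishing-gradient setup, so the norms shrink sharply toward the early layers:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# 20 Linear+Sigmoid blocks: sigmoid's derivative is at most 0.25, so
# gradients shrink multiplicatively on the way back to the first layer
layers = []
for _ in range(20):
    layers += [nn.Linear(32, 32), nn.Sigmoid()]
model = nn.Sequential(*layers, nn.Linear(32, 1))

x = torch.randn(8, 32)
loss = model(x).pow(2).mean()
loss.backward()

# gradient norms should drop by orders of magnitude toward layer 0
for name, p in model.named_parameters():
    if "weight" in name:
        print(f"{name}: grad norm = {p.grad.norm().item():.2e}")
```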

Written on February 25, 2022