Papers
Recommended papers
- Learning rate - Largely solves the problem of adapting the learning rate during training
- Bag of tricks - An interesting paper about CNNs, which are the building blocks of ResNets and other architectures used in vision
- Attention Is All You Need - A very important paper about the self-attention mechanism, which is the basis of the Transformer architecture
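
For readers who want a feel for self-attention before opening the last paper, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not the paper's full model: the function names `softmax` and `self_attention` are hypothetical, the queries, keys, and values are passed in directly rather than produced by learned projections, and there is no masking or multi-head splitting.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs

# Toy usage: three 2-dimensional tokens attending to each other.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
```

Each row of `attended` is a mixture of all three tokens, weighted by how strongly that token's query matches each key. In a real Transformer layer the three arguments would come from separate learned linear maps of the same input.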