Week #19

Learning rate tuning

This week I tuned the learning rate to find a suitable decay strategy for slice 5.
I tried three learning rate strategies. The first two use torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma=0.1, last_epoch=-1) in PyTorch, where gamma is the multiplicative factor of the learning rate decay and step_size is the period (in epochs) of the decay; the third uses no scheduler at all. The resulting schedules are listed below, followed by a minimal usage sketch.

  • Strategy 1: initial learning rate 0.002, gamma = 0.5, step_size = 150, which gives
      • epochs 0-150: lr = 0.002
      • epochs 150-300: lr = 0.001
      • epochs 300-450: lr = 0.0005
      • epochs 450-600: lr = 0.00025
      • epochs 600-750: lr = 0.000125
  • Strategy 2: initial learning rate 0.001, gamma = 0.8, step_size = 100, which gives
      • epochs 0-100: lr = 0.001
      • epochs 100-200: lr = 0.0008
      • epochs 200-300: lr = 0.00064
      • epochs 300-400: lr = 0.000512
      • epochs 400-500: lr = 0.0004096
      • epochs 500-600: lr = 0.00032768
  • Strategy 3: no scheduler, constant lr = 0.001
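
For reference, here is a minimal sketch of how the first StepLR strategy is wired up. The model and loss are toy stand-ins for the actual segmentation network and training loop; only the scheduler wiring is meant literally:

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import StepLR

# Toy stand-in for the segmentation network; only the scheduler setup matters here.
model = nn.Linear(10, 2)
optimizer = optim.Adam(model.parameters(), lr=0.002)     # strategy 1: initial lr 0.002
scheduler = StepLR(optimizer, step_size=150, gamma=0.5)  # halve the lr every 150 epochs

for epoch in range(750):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 10)).pow(2).mean()       # dummy loss in place of real training
    loss.backward()
    optimizer.step()
    scheduler.step()                                     # advance the decay schedule once per epoch
    if epoch % 150 == 0:
        print(epoch, optimizer.param_groups[0]["lr"])    # 0.002, 0.001, 0.0005, ...
```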

Since I observed that the loss does not decrease efficiently after 300 epochs, I tried another strategy using torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma=0.1, last_epoch=-1), which multiplies the learning rate by gamma once the epoch count reaches each milestone, so the learning rate stays constant between milestones. The configuration is listed below, with a sketch after the list.

  • learning_rate’s initial value: 0.001, gamma = 0.1, milestones = [200, 400], which means
      • lr = 0.001 if epoch < 200
      • lr = 0.0001 if 200 <= epoch < 400
      • lr = 0.00001 if epoch >= 400
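
The same toy setup shows the MultiStepLR variant; as above, the model and loss are placeholders:

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import MultiStepLR

model = nn.Linear(10, 2)                                  # toy stand-in for the real network
optimizer = optim.Adam(model.parameters(), lr=0.001)
scheduler = MultiStepLR(optimizer, milestones=[200, 400], gamma=0.1)

for epoch in range(600):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 10)).pow(2).mean()        # dummy loss
    loss.backward()
    optimizer.step()
    scheduler.step()                                      # lr drops by 10x after epochs 200 and 400
```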

Another approach

In DeepNAT [1], the authors separate the foreground from the background before segmentation. In our case the background accounts for a very large share of the pixels, so it is necessary to remove it first. There is a simple approach in the QuickNAT implementation on GitHub: it lumps together all the pixels of every class whose frequency is below a threshold. The same strategy will be used here.
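
I have not traced the exact QuickNAT code, so the following is only one reading of that step: compute each class's pixel frequency over the label maps and remap every class below the threshold to background. The helper name and threshold value are illustrative, not taken from QuickNAT:

```python
import numpy as np

def merge_rare_classes(labels, threshold=0.001, background=0):
    """Remap every class whose pixel frequency is below `threshold`
    to the background label (illustrative reading of the QuickNAT step)."""
    values, counts = np.unique(labels, return_counts=True)
    rare = values[counts / labels.size < threshold]       # classes below the frequency cutoff
    out = labels.copy()
    out[np.isin(out, rare)] = background                  # fold rare classes into background
    return out

# Toy example: class 3 covers a single pixel and gets folded into background.
labels = np.zeros((64, 64), dtype=np.int64)
labels[10:30, 10:30] = 1
labels[0, 0] = 3
print(np.unique(merge_rare_classes(labels)))              # -> [0 1]
```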

References

[1] C. Wachinger, M. Reuter, and T. Klein, “DeepNAT: Deep Convolutional Neural Network for Segmenting Neuroanatomy,” NeuroImage, vol. 170, pp. 434–445, Apr. 2018.