Difference between revisions of "2017 Winter Project Week/DeepLearningMethodology"
From NAMIC Wiki
Latest revision as of 06:14, 10 January 2017
This is a 3-hour introductory course on Deep Learning Methodology for Project Week #24.
Instructor: Mohsen Ghafoorian
Basic concepts: (60-75 min)
- loss function (categorical cross-entropy, MSE)
- stochastic gradient descent
- update rules (issues with plain SGD, Momentum, Nesterov, Adadelta, RMSProp, Adam)
- learning rate
- activation functions
- why non-linearities?
- sigmoid (vanishing gradient problem, non-zero-centered outputs), tanh
- ReLU (dying ReLU issue), leaky ReLU, PReLU
- weight initialization
- regularization
- augmentation
- L1/L2
- dropout
- batch norm
- network babysitting (bad learning rate, bad initialization, overfitting)
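As a minimal illustration of the update rules listed above, here is a pure-NumPy sketch of plain SGD, Momentum, and Adam applied to a single parameter vector. This is a toy sketch, not any particular library's implementation; the hyperparameter values are just common defaults.

```python
import numpy as np

def sgd(w, grad, lr=0.01):
    """Plain SGD: step directly down the gradient."""
    return w - lr * grad

def momentum(w, grad, v, lr=0.01, mu=0.9):
    """Momentum: accumulate a velocity to damp oscillation across steps."""
    v = mu * v - lr * grad
    return w + v, v

def adam(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: per-parameter step sizes from running first/second moments."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias correction for the zero-initialized moments
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```

For example, minimizing f(w) = w² (gradient 2w) with any of these rules drives w toward zero; the differences show up in how they handle curvature and noisy gradients.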
State-of-the-art CNN methods: (60 min)
- AlexNet
- VGG net
- GoogLeNet
- ResNet
- highway networks
- DenseNets
- GANs
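The key idea behind ResNet (and, in spirit, highway and dense networks) is the shortcut connection: layers learn a residual F(x) that is added back onto the identity, so a block can fall back to passing its input through unchanged. A hedged NumPy sketch, with dense layers standing in for the convolutions of a real ResNet:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """Toy residual block: two transforms plus an identity shortcut."""
    out = relu(x @ w1)   # first transform (a conv layer in a real ResNet)
    out = out @ w2       # second transform, no activation yet
    return relu(out + x)  # add the identity shortcut, then activate
```

Note that with all-zero weights the block reduces to relu(x), i.e. near-identity behavior; this is part of why very deep residual networks remain trainable.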
Biomedical segmentation
- sliding window
- fully convolutional nets
- U-Net
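The sliding-window approach above can be sketched in a few lines: every pixel is labeled by classifying the patch centered on it, which is simple but recomputes overlapping features for each pixel (the inefficiency that fully convolutional nets and U-Net remove). A toy NumPy sketch; `classify` is a hypothetical stand-in for a trained patch classifier.

```python
import numpy as np

def sliding_window_segment(image, classify, patch=3):
    """Label each pixel by classifying the patch centered on it."""
    h = patch // 2
    padded = np.pad(image, h, mode="reflect")  # pad so border pixels get full patches
    out = np.zeros(image.shape, dtype=int)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = classify(padded[i:i + patch, j:j + patch])
    return out
```

For example, with `classify = lambda p: int(p.mean() > 0.5)` a bright region in the image is segmented out pixel by pixel.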