Freezing layers at the beginning of training works fine. However, unfreezing them in on_epoch_start() during training causes the gradients to explode. Without the unfreezing step (or without freezing at all), the model trains with no gradient issues.
I'm using DDP + Apex O2, and the loss scale keeps shrinking until it reaches 0, at which point training crashes with a division by zero.
Is unfreezing during training not possible in PyTorch Lightning, or am I missing a snippet?
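For context, here is a minimal sketch of the pattern described above, assuming a backbone/head split, an SGD optimizer, and an unfreeze_at_epoch threshold that are placeholders rather than details from the original setup: parameters are frozen in __init__ and unfrozen in the on_epoch_start() hook.

```python
import torch
from torch import nn
import pytorch_lightning as pl


class FinetuneModel(pl.LightningModule):
    def __init__(self, backbone: nn.Module, unfreeze_at_epoch: int = 10):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(512, 10)
        self.unfreeze_at_epoch = unfreeze_at_epoch
        # Freeze the backbone before training starts.
        for p in self.backbone.parameters():
            p.requires_grad = False

    def on_epoch_start(self):
        # Unfreeze the backbone once the target epoch is reached.
        if self.current_epoch == self.unfreeze_at_epoch:
            for p in self.backbone.parameters():
                p.requires_grad = True

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self.head(self.backbone(x))
        return nn.functional.cross_entropy(logits, y)

    def configure_optimizers(self):
        # Pass all parameters so the optimizer already tracks the frozen
        # ones when they are later unfrozen in on_epoch_start().
        return torch.optim.SGD(self.parameters(), lr=1e-3)
```

The gradient explosion only shows up when combined with DDP + Apex O2 loss scaling, as described above; the sketch itself does not reproduce that configuration.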