Align jepa forecast finetuning #2149
  latent_noise_deterministic_latents: True
- freeze_modules: ".*encoder.*|.*latent_pre_norm.*|.*latent_heads.*"
+ freeze_modules: ""
We should also keep these modules frozen by default :)
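To make the effect of the two `freeze_modules` values concrete, here is a minimal sketch of how such a pattern could select modules. It assumes the pattern is matched against each module name in full and that an empty string means "freeze nothing"; the module names and the `select_frozen` helper are hypothetical, and the project's actual matching logic may differ.

```python
import re

# Pattern copied from the config line under review.
FREEZE_PATTERN = r".*encoder.*|.*latent_pre_norm.*|.*latent_heads.*"

# Hypothetical module names, invented for illustration only.
module_names = [
    "encoder.blocks.0.attn",
    "latent_pre_norm",
    "latent_heads.0.proj",
    "forecast_engine.blocks.3.mlp",
]


def select_frozen(names, pattern):
    """Return the names that the pattern matches in full (assumed semantics)."""
    if not pattern:
        # Assumption: an empty pattern is treated as "freeze nothing".
        return []
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name)]


frozen = select_frozen(module_names, FREEZE_PATTERN)
unfrozen_default = select_frozen(module_names, "")

print(frozen)
print(unfrozen_default)
```

Under these assumptions the non-empty pattern keeps the encoder, latent pre-norm, and latent heads out of training, while the empty string leaves every module trainable. Note that an empty pattern passed to `re.search` would instead match every name, so the empty case is worth handling explicitly.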
- fe_layer_norm_after_blocks: [] # Index starts at 0. Thus, [3] adds a LayerNorm after the fourth layer
- fe_impute_latent_noise_std: 0.0 # 1e-4
+ fe_layer_norm_after_blocks: [7] # Index starts at 0. Thus, [3] adds a LayerNorm after the fourth layer
+ fe_impute_latent_noise_std: 1e-4
Sorry, could we actually leave the latent noise at 0 for now?
  #####################################

- streams_directory: "./config/streams/era5_1deg/"
+ streams_directory: "./config/streams/era5_1deg_forecasting/"
  lr_start: 1e-6
  lr_max: 5e-5
- lr_final_decay: 1e-6
+ lr_final_decay: 2e-6
  training_mode: ["masking"]

- num_mini_epochs: 32
+ num_mini_epochs: 64
@sophie-xhonneux we can probably leave this one at 32 epochs?
shmh40 left a comment
Great, thanks! Just a few changes needed and then let's wait for @sophie-xhonneux and @MatKbauer to double check too.
shmh40 left a comment
Also can you create an issue and link to it in the PR so that the checks pass :)
Thanks, will create one now!
with_mixed_precision: True
with_flash_attention: True
compile_model: False
with_fsdp: False
@shmh40 @sophie-xhonneux
Should we set with_fsdp: True, as in config_forecasting.yml?
Let's leave it as False for now, thanks!
Description
Align config_jepa_forecasting_finetuning.yml with config_forecasting.yml for fair comparison in the future.
Issue Number
Fixes #2150
Is this PR a draft? Mark it as draft.
Checklist before asking for review
./scripts/actions.sh lint
./scripts/actions.sh unit-test
./scripts/actions.sh integration-test
launch-slurm.py --time 60
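The alignment goal stated in the description can also be checked mechanically. The following is a minimal sketch, assuming both configs have already been loaded into nested dictionaries; `flatten`, `config_diff`, and the two sample dictionaries are hypothetical and are not the actual contents of either file.

```python
MISSING = "<missing>"  # marker for a key present in only one config


def flatten(cfg, prefix=""):
    """Flatten a nested dict into {"section.key": value} form."""
    flat = {}
    for key, value in cfg.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def config_diff(left, right):
    """Map each differing key to its (left, right) pair of values."""
    a, b = flatten(left), flatten(right)
    diff = {}
    for key in sorted(a.keys() | b.keys()):
        pair = (a.get(key, MISSING), b.get(key, MISSING))
        if pair[0] != pair[1]:
            diff[key] = pair
    return diff


# Illustrative excerpts only; not the real file contents.
finetuning_cfg = {"with_mixed_precision": True, "with_fsdp": False}
forecasting_cfg = {"with_mixed_precision": True, "with_fsdp": True}

for key, (left, right) in config_diff(finetuning_cfg, forecasting_cfg).items():
    print(f"{key}: finetuning={left!r} forecasting={right!r}")
```

A listing like this makes deliberate differences, such as the `with_fsdp` setting discussed in the review, easy to tell apart from accidental drift between the two files.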