Introduce repeat flag #1413
base: develop
@clessig does this make sense?
clessig
left a comment
Yes, looks good. Just two minor comments.
config/default_config.yml
Outdated
ae_adapter_with_residual: True
ae_adapter_dropout_rate: 0.1

repeat_data: False
We need a more descriptive name here. The future structure with nested dicts will also help here.
    self.perms = np.tile(self.perms, self.samples_per_mini_epoch // len(self.perms))
else:
    self.perms = np.tile(self.perms, self.samples_per_mini_epoch // len(self.perms))
    random_filler = self.rng.choice(
Could we get rid of the branch and have a random_filler of len=0 when it divides?
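A branch-free variant along these lines might look like the sketch below. It is only an illustration under stated assumptions: `fill_perms` is a hypothetical standalone helper (not a name from this PR), and it draws the remainder from the untiled permutation, which differs slightly from the diff under review. NumPy's `Generator.choice` returns an empty array for `size=0`, so the divisible case needs no separate branch.

```python
import numpy as np


def fill_perms(
    perms: np.ndarray, samples_per_mini_epoch: int, rng: np.random.Generator
) -> np.ndarray:
    """Repeat perms until it holds samples_per_mini_epoch entries (hypothetical helper)."""
    n = len(perms)
    if n == 0 or n >= samples_per_mini_epoch:
        # nothing to repeat, or the dataset is already large enough
        return perms
    # as many whole copies of the dataset as fit
    tiled = np.tile(perms, samples_per_mini_epoch // n)
    # remainder is 0 when n divides samples_per_mini_epoch, and always < n
    remainder = samples_per_mini_epoch - len(tiled)
    # size=0 yields an empty array, so no extra branch is needed
    filler = rng.choice(perms, size=remainder, replace=False)
    return np.concatenate([tiled, filler])


rng = np.random.default_rng(0)
divisible = fill_perms(np.arange(5), 10, rng)  # two whole tiles, empty filler
ragged = fill_perms(np.arange(3), 10, rng)  # three whole tiles plus one random index
```

Because the remainder is always smaller than the dataset size, sampling without replacement cannot fail here.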
Thanks for the feedback @clessig, should all be addressed now.
# check repeat_data flag and fill up perms accordingly
if self.repeat_data and len(self.perms) < self.samples_per_mini_epoch:
    self.perms = np.tile(self.perms, self.samples_per_mini_epoch // len(self.perms))
    random_filler = self.rng.choice(
        self.perms, size=self.samples_per_mini_epoch - len(self.perms), replace=False
    )
    self.perms = np.concatenate([self.perms, random_filler])
Since you are already touching this logic, can you make sure the behavior described in #1085 does not persist?
I will have a look – but I think I know where the issue comes from roughly and I should have fixed it already on the diffusion branch. So should be easy to do.
Maybe I was a bit overconfident there. I am not exactly sure where the issue comes from. You mentioned that you have a test suite that checks this – could you share this? Otherwise, in my code self.len and len(self.perms) should always be the same, unless self.len is set to self.chunk in this line here.
I will check. Don't worry, I will take care of this issue.
Okay, thanks @grassesi.
fsm = self.forecast_steps[0]
if len(ds) > 0:
    self.len = min(self.len, len(ds) - (self.len_hrs * (fsm + 1)) // self.step_hrs)
Yes, it definitely gets overwritten, but the question is if this is intentional or not. Maybe at some point the idea was to take the minimum between this quantity and int(index_range.end - index_range.start)? Be it intentional or not I would still remove it: If it is unintentional behavior, we should treat it in a separate PR/Issue since any changes in the sampling behavior are quite wide reaching and require thorough testing.
Good, thanks for double checking.
Can we remove it from the code then?
clessig
left a comment
I just tried with
repeat_data_in_mini_epoch
start_date: 202012300000
end_date: 202012310000
and get:
forecast_steps at mini_epoch=0 : 2
Traceback (most recent call last):
File "/users/lessig/santis/WeatherGenerator/src/weathergen/run_train.py", line 176, in train_with_args
trainer.run(cf, devices)
File "/users/lessig/santis/WeatherGenerator/src/weathergen/train/trainer.py", line 342, in run
self.train(mini_epoch)
File "/users/lessig/santis/WeatherGenerator/src/weathergen/train/trainer.py", line 517, in train
for bidx, batch in enumerate(dataset_iter):
^^^^^^^^^^^^^^^^^^^^^^^
File "/users/lessig/santis/WeatherGenerator/.venv/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 708, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "/users/lessig/santis/WeatherGenerator/.venv/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 1480, in _next_data
return self._process_data(data)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/users/lessig/santis/WeatherGenerator/.venv/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 1505, in _process_data
data.reraise()
File "/users/lessig/santis/WeatherGenerator/.venv/lib/python3.12/site-packages/torch/_utils.py", line 733, in reraise
raise exception
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/users/lessig/santis/WeatherGenerator/.venv/lib/python3.12/site-packages/torch/utils/data/_utils/worker.py", line 349, in _worker_loop
data = fetcher.fetch(index) # type: ignore[possibly-undefined]
^^^^^^^^^^^^^^^^^^^^
File "/users/lessig/santis/WeatherGenerator/.venv/lib/python3.12/site-packages/torch/utils/data/_utils/fetch.py", line 42, in fetch
data = next(self.dataset_iter)
^^^^^^^^^^^^^^^^^^^^^^^
File "/users/lessig/santis/WeatherGenerator/src/weathergen/datasets/multi_stream_data_sampler.py", line 720, in __iter__
self.reset()
File "/users/lessig/santis/WeatherGenerator/src/weathergen/datasets/multi_stream_data_sampler.py", line 288, in reset
assert idx_end > 0, "dataset size too small for forecast range"
^^^^^^^^^^^
AssertionError: dataset size too small for forecast range
[6] > /users/lessig/santis/WeatherGenerator/.venv/lib/python3.12/site-packages/torch/_utils.py(733)reraise()
-> raise exception
How did you test?
config/default_config.yml
Outdated
ae_adapter_with_residual: True
ae_adapter_dropout_rate: 0.1

repeat_data_in_mini_epoch: False
Can we move this down to shuffle?
Yes (thought I already did) and Yes. Will make another push today.
I haven't had this error – I always tested with 18 hrs time window, which is the minimum. What did you put for the validation start/end dates @clessig? I have had this error coming from not making the right adjustment for the validation time window before.
@clessig I think what happened is that you have forecast_steps and forecast_offset set differently than me. If you try with the attached config, it should work; explanation below.
In the config for diffusion overfitting, we set forecast_steps=1 and forecast_offset=0 (compatible with a date range of 3=2+1+0 time steps, yielding a single source and single target sample). If you want forecast_steps=2 and forecast_offset=1 (as in the default config), then I think you need a date range of at least 5=2+2+1 time steps. It works, for example, with start_date: 202012300000 and end_date: 202012310600. This minimum is not affected by my changes (it was already there), but evidently the general rule is:
minimum time steps = 2 + forecast_steps + forecast_offset
Hope this makes sense.
@moritzhauschulz thanks for the explanation. Can we encode it as a check in the config with an informative message? Setting the correct combination of dates, steps and offsets is becoming more and more subtle.
I encoded this in the new version, with a brief message. Since the corresponding code check has changed slightly again, I suggest to not print a long explanation, and instead let the user refer to the code checks (also to avoid duplication).
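For readers following the thread, the rule discussed above could be encoded roughly as in the sketch below. This is not the check that was added in the PR. All function names are hypothetical, and the 6-hour step and the end-exclusive counting of time steps are assumptions chosen only because they agree with the date ranges quoted in this conversation.

```python
from datetime import datetime, timedelta


def min_time_steps(forecast_steps: int, forecast_offset: int) -> int:
    """Rule quoted in the thread: 2 + forecast_steps + forecast_offset."""
    return 2 + forecast_steps + forecast_offset


def available_time_steps(start_date: str, end_date: str, step_hrs: int = 6) -> int:
    """Whole steps between start and end (assumed: end-exclusive, 6-hourly data)."""
    fmt = "%Y%m%d%H%M"
    span = datetime.strptime(end_date, fmt) - datetime.strptime(start_date, fmt)
    return span // timedelta(hours=step_hrs)


def check_date_range(
    start_date: str,
    end_date: str,
    forecast_steps: int,
    forecast_offset: int,
    step_hrs: int = 6,
) -> None:
    """Raise a ValueError with a short message if the date range is too small."""
    have = available_time_steps(start_date, end_date, step_hrs)
    need = min_time_steps(forecast_steps, forecast_offset)
    if have < need:
        raise ValueError(
            f"date range covers {have} time steps, but forecast_steps={forecast_steps} "
            f"and forecast_offset={forecast_offset} need at least {need}"
        )


# the range reported as working in the thread passes silently
check_date_range("202012300000", "202012310600", forecast_steps=2, forecast_offset=1)
```

Under these assumptions the 24-hour range from the failing run yields 4 steps and is rejected, while the 18-hour window with forecast_steps=1 and forecast_offset=0 yields exactly the required 3.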
@moritzhauschulz: can you point it at develop please? How big are the conflicts then?
commit 9336fe1 Author: moritzhauschulz <[email protected]> Date: Fri Dec 12 20:10:50 2025 +0100 requested changes commit dadde23 Author: moritzhauschulz <[email protected]> Date: Mon Dec 8 18:54:44 2025 +0100 remove 1 line commit c871f9c Author: moritzhauschulz <[email protected]> Date: Mon Dec 8 18:16:50 2025 +0100 remove unnecessary statement commit e3e46eb Author: moritzhauschulz <[email protected]> Date: Mon Dec 8 12:49:03 2025 +0100 lint commit 559add7 Author: moritzhauschulz <[email protected]> Date: Mon Dec 8 12:47:35 2025 +0100 rename flag and simplify cases commit f6e1c39 Author: moritzhauschulz <[email protected]> Date: Thu Dec 4 21:07:42 2025 +0100 reset config and lint commit 27cb0c8 Author: moritzhauschulz <[email protected]> Date: Thu Dec 4 20:57:14 2025 +0100 repeat flag commit bf17bfe Author: Christian Lessig <[email protected]> Date: Thu Dec 4 16:53:51 2025 +0100 Updated config commit 7745e47 Author: Christian Lessig <[email protected]> Date: Thu Dec 4 16:35:19 2025 +0100 Switched to lists of model / target stratgies commit 12bae15 Author: Christian Lessig <[email protected]> Date: Thu Dec 4 15:01:07 2025 +0100 Fixes for diffusion commit 9065219 Author: Christian Lessig <[email protected]> Date: Thu Dec 4 13:33:42 2025 +0100 Changed that model takes sample as input commit 3f52a8d Author: Christian Lessig <[email protected]> Date: Thu Dec 4 13:32:14 2025 +0100 Changed core functions to take sample as arg commit d36367a Author: Christian Lessig <[email protected]> Date: Thu Dec 4 13:31:55 2025 +0100 Changed args to embedding commit b69b743 Author: Christian Lessig <[email protected]> Date: Thu Dec 4 13:30:41 2025 +0100 Cleaned up comments and return values a bit commit 59510dd Author: Christian Lessig <[email protected]> Date: Thu Dec 4 00:01:50 2025 +0100 Fixed problem with non_blocking=True commit 69b53a6 Author: Christian Lessig <[email protected]> Date: Thu Dec 4 00:00:42 2025 +0100 Removed old comments commit 51754fa Author: Christian Lessig 
<[email protected]> Date: Thu Dec 4 00:00:20 2025 +0100 Fixed missing non_blocking=True in to_device() commit 2cd3971 Author: Christian Lessig <[email protected]> Date: Wed Dec 3 23:56:41 2025 +0100 Completed migration to new batch class by removing reference to old list of lists commit 402b8de Author: Julian Kuehnert <[email protected]> Date: Wed Dec 3 17:11:15 2025 +0100 1390 - Adapt forward pass of new batch object (ecmwf#1391) * Add to device to ModelBatch, etc & adapt model TODO adapt validate and inference TODO test forecasting and multiple stream because predict changed substantially * Rename view to sample and fix validate * Revert predict function and fix inference * Fix invalid access with mask * Linting * Fixed handling of target_idxs and other minor issues --------- Co-authored-by: sophiex <[email protected]> Co-authored-by: Christian Lessig <[email protected]> commit 9a1a6a9 Author: Christian Lessig <[email protected]> Date: Wed Dec 3 13:12:52 2025 +0100 Re-enabled multi-source training commit 3641e1f Author: Christian Lessig <[email protected]> Date: Wed Dec 3 00:20:42 2025 +0100 Fix for integration test commit 9f5e49c Author: Christian Lessig <[email protected]> Date: Wed Dec 3 00:20:25 2025 +0100 Fixed uv.lock commit 33d9d8d Merge: 23e0267 c8a2aad Author: Christian Lessig <[email protected]> Date: Wed Dec 3 00:13:05 2025 +0100 Merge branch 'shmh40/dev/1270-idx-global-local' of github.com:ecmwf/WeatherGenerator into shmh40/dev/1270-idx-global-local commit 23e0267 Author: Christian Lessig <[email protected]> Date: Wed Dec 3 00:11:48 2025 +0100 Update commit c8a26d7 Author: Christian Lessig <[email protected]> Date: Wed Dec 3 00:11:37 2025 +0100 Commit commit 2599ec2 Author: Christian Lessig <[email protected]> Date: Wed Dec 3 00:10:13 2025 +0100 Restructured code so that mask generation and application is cleanly separated commit c8a2aad Author: Tim Hunter <[email protected]> Date: Tue Dec 2 17:06:56 2025 +0100 commenting tests commit 2b2c977 Author: 
Tim Hunter <[email protected]> Date: Tue Dec 2 17:03:41 2025 +0100 linter warnings commit dc736e5 Merge: 6fe8561 7ff6e0b Author: Tim Hunter <[email protected]> Date: Tue Dec 2 16:48:24 2025 +0100 merge with dev commit 6fe8561 Merge: 15b46e9 f136d60 Author: Christian Lessig <[email protected]> Date: Fri Nov 28 14:16:41 2025 +0100 Merge branch 'develop' of github.com:ecmwf/WeatherGenerator into shmh40/dev/1270-idx-global-local commit 15b46e9 Author: Sebastian Hickman <[email protected]> Date: Fri Nov 28 13:30:54 2025 +0100 fix indentation of else: assert False in _get_sample msds commit 4281aff Author: Sebastian Hickman <[email protected]> Date: Fri Nov 28 12:40:24 2025 +0100 restore loader_num_workers to 8 commit 6ea07e7 Author: Seb Hickman <[email protected]> Date: Fri Nov 28 11:34:41 2025 +0000 restore masking_strategy to random Had placeholder for testing, now back to "random" for masking strategy in the base level of default_config commit 1a37dd1 Author: Sebastian Hickman <[email protected]> Date: Fri Nov 28 10:31:43 2025 +0100 remove unused mask generation in diffusion_forecast commit 657094a Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:59:39 2025 +0100 Fixed problem in engines introduced in recent commits merging develop. This fixes masking training commit d526dfc Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:37:02 2025 +0100 Restored masking as training mode. 
Not working due to NaN in prediction commit 6289959 Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:36:38 2025 +0100 Removed duplicate lines due to mergeing commit bc8d23e Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:18:01 2025 +0100 More linting commit 47750a5 Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:10:09 2025 +0100 Restoring masking as training_mode in default_config commit 0db8b62 Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:09:41 2025 +0100 Linting commit e41a575 Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:09:28 2025 +0100 Linting commit 03166a2 Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:09:10 2025 +0100 Linting commit 652500a Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:08:53 2025 +0100 Linting commit d8998a9 Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:08:38 2025 +0100 Linting commit 8ef3a4c Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:08:04 2025 +0100 Simplified and clarified handling of default target_aux_calcualtor commit 3e4de7a Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:07:51 2025 +0100 Linting commit 5f803e5 Merge: b47b0fa 0e2801b Author: Christian Lessig <[email protected]> Date: Fri Nov 28 08:03:02 2025 +0100 Merge branch 'develop' of github.com:ecmwf/WeatherGenerator into shmh40/dev/1270-idx-global-local commit b47b0fa Merge: 9b702c5 26f7b5b Author: Christian Lessig <[email protected]> Date: Fri Nov 28 07:09:19 2025 +0100 Merge branch 'shmh40/dev/1270-idx-global-local' of github.com:ecmwf/WeatherGenerator into shmh40/dev/1270-idx-global-local commit 26f7b5b Author: Sebastian Hickman <[email protected]> Date: Thu Nov 27 15:33:22 2025 +0100 add diffusion forecast option for the data sampling, and with noise_level_rn in the metadata. 
The Trainer needs to be copied from Sophies branch, currently we only get so far commit 6d909d6 Author: Sebastian Hickman <[email protected]> Date: Thu Nov 27 11:32:32 2025 +0100 add mask to SampleMetaData and add forecast_dt to Sample so it is accessible. Can specify the loss in the default config with student-teacher views commit e0d7346 Author: Sebastian Hickman <[email protected]> Date: Wed Nov 26 14:31:52 2025 +0100 remove prints, pdb commit c27156c Author: Sebastian Hickman <[email protected]> Date: Wed Nov 26 12:35:03 2025 +0100 add SampleMetaData integration and functionality, and update masker to use SampleMetadata. Pass through source_cell_lens and target_coords_idx to student_teacher_batch in iter, and hence pass through to trainer. source_cell_lens and target_coords_idx are now part of Sample, which is itself the components of ModelBatch. To tidy commit 4f8f62b Author: Sebastian Hickman <[email protected]> Date: Tue Nov 25 18:56:56 2025 +0100 instructions for sophie commit fa24fc1 Author: Sebastian Hickman <[email protected]> Date: Tue Nov 25 16:36:52 2025 +0100 very hacky first pass of full masking_strategy_config for the student and teacher views. Much to fix up commit b193a50 Author: Sebastian Hickman <[email protected]> Date: Mon Nov 24 17:13:37 2025 +0100 updated configs so code runs. Note default config to be overhauled still commit af9a3c1 Merge: 2905cb0 b452bd2 Author: Sebastian Hickman <[email protected]> Date: Mon Nov 24 16:37:55 2025 +0100 merge with develop, include trainer idx_inv_rt, merged default_config, rm tokenizer_forecast commit 2905cb0 Author: Sebastian Hickman <[email protected]> Date: Sat Nov 22 13:59:37 2025 +0000 fix masking for NPP-ATMS by correctly selecting final timestep mask and aligning between source and target. 
working for num_input_steps = 1, broken for > 1, compute_offsets_scatter_embed not working commit b9a60f3 Author: Sebastian Hickman <[email protected]> Date: Fri Nov 21 18:38:40 2025 +0000 tidy up, remove unused arguments, types commit ece1dd0 Author: Sebastian Hickman <[email protected]> Date: Fri Nov 21 16:22:27 2025 +0000 move build_views_for_stream into masker commit 1a418bf Author: Sebastian Hickman <[email protected]> Date: Fri Nov 21 12:54:33 2025 +0000 add max_num_samples functionality to tokenizer_masking and pass through in multi_stream_data_sampler. coords_per_cell is a bit nasty commit 91c3d7a Author: Sebastian Hickman <[email protected]> Date: Fri Nov 21 12:53:31 2025 +0000 add max_num_targets to era5 commit 647e4b2 Author: Sebastian Hickman <[email protected]> Date: Thu Nov 20 18:31:45 2025 +0000 multiple idxs for each teacher, need to confirm for not student case, and updated ModelBatch for this commit 1806ae5 Author: Sebastian Hickman <[email protected]> Date: Thu Nov 20 16:28:30 2025 +0000 tidy up, remove unused build_stream_views in tokenizer_masking commit 9b702c5 Author: Christian Lessig <[email protected]> Date: Thu Nov 20 14:34:34 2025 +0100 Re-enabling inversion of targert ordering. 
commit 87ad45f Author: Sebastian Hickman <[email protected]> Date: Thu Nov 20 13:10:34 2025 +0000 add teacher num_views parameter to config commit b34b6da Author: Sebastian Hickman <[email protected]> Date: Thu Nov 20 13:09:19 2025 +0000 collect num_source_samples and num_target_samples, add loop over teacher masks hence allowing multiple teacher views, and add source_target_idx to keep track of which student belongs to which teacher commit b2be982 Author: Sebastian Hickman <[email protected]> Date: Thu Nov 20 13:07:47 2025 +0000 fix typo in ModelBatch commit d18cf86 Author: Christian Lessig <[email protected]> Date: Thu Nov 20 08:26:40 2025 +0100 Added todo commit e8ccb8d Author: Christian Lessig <[email protected]> Date: Thu Nov 20 08:22:26 2025 +0100 Added required reflexivity between source and target samples to Batch commit 5d5e999 Author: Christian Lessig <[email protected]> Date: Thu Nov 20 08:21:31 2025 +0100 Linting problems but removed unused ViewMetaData dependence commit 3bca490 Author: Christian Lessig <[email protected]> Date: Thu Nov 20 08:21:13 2025 +0100 linting commit 6a96065 Author: Christian Lessig <[email protected]> Date: Thu Nov 20 08:20:42 2025 +0100 Linting commit c1d32fb Author: Christian Lessig <[email protected]> Date: Thu Nov 20 08:20:21 2025 +0100 linting commit 1b1654c Author: Christian Lessig <[email protected]> Date: Wed Nov 19 22:32:05 2025 +0100 Added basic support for use of ModelBatch class to define rough structure and interface. commit 848880b Author: Christian Lessig <[email protected]> Date: Wed Nov 19 20:06:41 2025 +0100 Renaming and minor clean up. commit 6d685c0 Author: Christian Lessig <[email protected]> Date: Wed Nov 19 19:57:46 2025 +0100 Moved _get_student_teacher_masks() so that masks are generated for all streams first. 
commit ed26c02 Author: Christian Lessig <[email protected]> Date: Wed Nov 19 19:57:23 2025 +0100 Changes to have spoofing on a per data reader sample commit 9fe94f5 Author: Christian Lessig <[email protected]> Date: Wed Nov 19 19:30:48 2025 +0100 Changes necessary for spoofing flag per IOReaderData commit 4613f7a Author: Christian Lessig <[email protected]> Date: Wed Nov 19 17:58:10 2025 +0100 Cleaned up parametrization commit 1235aab Author: Christian Lessig <[email protected]> Date: Wed Nov 19 17:47:40 2025 +0100 More refactoring. Code working again. commit 1e70f5c Author: Christian Lessig <[email protected]> Date: Wed Nov 19 17:09:20 2025 +0100 More refactoring and cleanup commit 46147d4 Author: Christian Lessig <[email protected]> Date: Wed Nov 19 17:01:29 2025 +0100 More refactoring commit 81cf929 Author: Christian Lessig <[email protected]> Date: Wed Nov 19 15:58:57 2025 +0100 Changes for better student teacher structure commit dfc03f2 Merge: a824bfc 31dc658 Author: Christian Lessig <[email protected]> Date: Wed Nov 19 15:58:37 2025 +0100 Merge branch 'shmh40/dev/1270-idx-global-local' of github.com:ecmwf/WeatherGenerator into shmh40/dev/1270-idx-global-local commit a824bfc Author: Christian Lessig <[email protected]> Date: Wed Nov 19 12:23:47 2025 +0100 Not working draft for restructuring commit 31dc658 Author: Sebastian Hickman <[email protected]> Date: Wed Nov 19 11:04:29 2025 +0000 created function for _get_student_teacher_sample_data which returns the streams_data of the teacher and multiple streams_datas for the student views. 
commit 2536cec Author: Sebastian Hickman <[email protected]> Date: Tue Nov 18 17:40:26 2025 +0000 correct imports with new batch.py commit b3dfa2f Merge: 11ad4e6 c1580c4 Author: Sebastian Hickman <[email protected]> Date: Tue Nov 18 17:36:15 2025 +0000 merge changes commit 11ad4e6 Author: Sebastian Hickman <[email protected]> Date: Tue Nov 18 17:34:19 2025 +0000 basic if statement to yield the student and teacher views commit 36ea287 Author: Sebastian Hickman <[email protected]> Date: Tue Nov 18 17:33:53 2025 +0000 slight restructure of ViewMetadata commit 66cf9cd Author: Sebastian Hickman <[email protected]> Date: Tue Nov 18 17:33:08 2025 +0000 added stream id to era5 config commit 3c26ddc Author: Sebastian Hickman <[email protected]> Date: Tue Nov 18 17:32:00 2025 +0000 updated default config training_config to allow student-teacher commit c1580c4 Author: Christian Lessig <[email protected]> Date: Tue Nov 18 16:30:44 2025 +0100 Renaming commit 85fa139 Author: Christian Lessig <[email protected]> Date: Tue Nov 18 16:28:46 2025 +0100 Comments commit dd6f85a Author: Christian Lessig <[email protected]> Date: Tue Nov 18 15:30:22 2025 +0100 Added mode and refactored get_sample_data into separate function. commit 668912d Author: Christian Lessig <[email protected]> Date: Tue Nov 18 13:47:40 2025 +0100 Partially enabled correct handling of multiple input steps. commit c3b5c3b Author: Christian Lessig <[email protected]> Date: Tue Nov 18 12:02:17 2025 +0100 Added basic support for multi-step sources. 
commit ab9eecc Merge: a934f97 c733280 Author: Christian Lessig <[email protected]> Date: Tue Nov 18 10:00:37 2025 +0100 Merge branch 'shmh40/dev/1270-idx-global-local' of github.com:ecmwf/WeatherGenerator into shmh40/dev/1270-idx-global-local commit a934f97 Author: Christian Lessig <[email protected]> Date: Tue Nov 18 09:58:19 2025 +0100 NOT WORKING: updating class to handle multiple input steps and improving overall structure commit c733280 Author: Sebastian Hickman <[email protected]> Date: Mon Nov 17 18:32:40 2025 +0000 change view_metadata to dict in ModelInput commit 7d5c300 Author: Sebastian Hickman <[email protected]> Date: Mon Nov 17 18:22:33 2025 +0000 draft of training_config in default_config commit 047b299 Author: Sebastian Hickman <[email protected]> Date: Mon Nov 17 18:19:56 2025 +0000 draft changes to allow global local view generation in masker and tokenizer_masking. generate the mask, otherwise using batchify_source and batchify_target as before, with the capacity to remember what mask we have now when it comes to generating the targets. Update to inputs_metadata structure but not put in to practice commit 761e263 Author: Sebastian Hickman <[email protected]> Date: Mon Nov 17 18:13:57 2025 +0000 update ViewMetadata spec commit 7f3c718 Author: Christian Lessig <[email protected]> Date: Mon Nov 17 14:51:01 2025 +0100 Updating config to working version commit ae5a2e6 Author: Sebastian Hickman <[email protected]> Date: Mon Nov 17 11:54:18 2025 +0000 added file with ModelBatch and SampleMetadata dataclasses commit debbb8f Author: Christian Lessig <[email protected]> Date: Mon Nov 17 12:28:07 2025 +0100 Changes to prepare_logging to apply index inversion commit 5d127bf Author: Christian Lessig <[email protected]> Date: Sun Nov 16 17:01:08 2025 +0100 Inversion of target output ordering to match input one in forcast mode. 
Unclear how to deal with it with MTM commit 8fa544d Author: Christian Lessig <[email protected]> Date: Fri Nov 14 20:43:57 2025 +0100 Removed unused parameters commit ce6c735 Author: Christian Lessig <[email protected]> Date: Fri Nov 14 16:56:51 2025 +0100 Removing centroids options for embedding that was unused and should not be used. commit 0634105 Author: Christian Lessig <[email protected]> Date: Fri Nov 14 09:59:13 2025 +0100 Enabled support for forecast. Cleaned up some bits and pieces. commit ec38123 Author: Christian Lessig <[email protected]> Date: Fri Nov 14 08:27:21 2025 +0100 Fixed remaining problems that occured for NPP-ATMS and SYNOP. TODO: - Forecast still needs to be adapted - Some more cleanup of variable naming, return values etc commit db6f285 Author: Christian Lessig <[email protected]> Date: Thu Nov 13 23:26:31 2025 +0100 Fixed linting commit 9229e48 Author: Christian Lessig <[email protected]> Date: Thu Nov 13 23:19:21 2025 +0100 Minor cleanup commit a581405 Author: Christian Lessig <[email protected]> Date: Thu Nov 13 23:17:29 2025 +0100 Working version for ERA5, NPP-ATMS. Problems with SYNOP with empty cell handling commit e4a9cc0 Author: Christian Lessig <[email protected]> Date: Thu Nov 13 18:58:28 2025 +0100 Masking target is working in principle but errors when feeding data to the model. commit 51f437f Author: Christian Lessig <[email protected]> Date: Thu Nov 13 07:04:23 2025 +0100 NOT WORKING: Finished src, target still to be done. commit 81bd6eb Author: Christian Lessig <[email protected]> Date: Wed Nov 12 09:38:53 2025 +0100 NOT WORKING: initial draft for index-based masking. Implemented for random and healpix masking. Open issues with _coords_local, centroids and probably other things.
9336fe1 to 535b5d5
Fixes #1379
Description
Introduces a repeat flag that fills the mini epoch up to samples_per_mini_epoch when the dataset has fewer elements. This is done by tiling the dataset; if samples_per_mini_epoch is not divisible by the dataset size, the final 'remainder' tile is sampled without replacement from the dataset. Pretty simple, see code.
Issue Number
Closes #1379
Checklist before asking for review
./scripts/actions.sh lint
./scripts/actions.sh unit-test
./scripts/actions.sh integration-test
launch-slurm.py --time 60