Models/evaluate year by tommylees112 · Pull Request #124 · ECMWFCode4Earth/ml_drought

tommylees112 · 2019-09-04T11:19:23Z

This PR uses the TRAINED model to evaluate specific months.

The major NEW function is:
src/models/base.py: evaluate_train_timesteps

With changes to:
src/models/neural_networks/base.py: predict

adding the ability to choose specific years/months to run the predictions on.

NOTE: all of the models still need to be updated (non-neural networks too)

gabrieltseng · 2019-09-05T09:42:51Z

@tommylees112 , do you want me to take a look at this now, or once the tests are passing?

tommylees112 · 2019-09-06T15:12:59Z

when tests are passing please buddy

tommylees112 · 2019-09-06T15:54:16Z

This currently isn't working for me even though it is passing the tests. Will need to have a play!

gabrieltseng

Hey Tommy - I know things aren't working yet, but I took a quick look and have some initial comments.

I'll also take a look once everything is working!

gabrieltseng · 2019-09-07T03:48:01Z

src/models/data.py

        Whether to load testing or training data. This also affects the way the data is
        returned; for train, it is a concatenated array, but for test it is a dict with dates
        so that the netcdf file can easily be reconstructed
+    >>>>>>>>>CHANGE


typo: to remove

gabrieltseng · 2019-09-07T03:49:05Z

src/models/data.py

+                       test_month: Optional[int] = None) -> List[Path]:

-        data_folder = data_path / f'features/{experiment}/{mode}'
+        if (test_year is None) and (test_month is None):


This API is confusing - there is a year and month associated with the test data too (i.e. if I pass test_year=2018 in our current setup, it wouldn't work).

Perhaps something that makes sense would be a function which determines which years are in the test folder, and which years are in the train folder (which shouldn't be too hard, since that information is in the folder names).

Also, if test_year is passed, but test_month is not (which would be useful to predict all of 2018), things wouldn't work.

Maybe limiting the granularity to years? And allowing a range of years to be passed?

(I see now that this is handled in the base training class, but since the DataLoader is something we would want to expose to users, I think it makes sense to clean up here)
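One way to read the suggestion above, as a minimal sketch rather than the project's actual code: scan the `features/{experiment}/{mode}` folders and collect the years found in the folder names. The function name `years_by_split` is hypothetical, and the `<year>_<month>` folder naming (e.g. `2018_3`) is an assumption, since the thread only says the information is in the folder names.

```python
from pathlib import Path
from typing import Dict, Set


def years_by_split(data_path: Path, experiment: str) -> Dict[str, Set[int]]:
    """Map each split ('train', 'test') to the years found in its folder names.

    Hypothetical helper. Assumes timestep folders are named '<year>_<month>'.
    """
    years: Dict[str, Set[int]] = {}
    for mode in ("train", "test"):
        split_dir = data_path / "features" / experiment / mode
        found: Set[int] = set()
        if split_dir.exists():
            for folder in split_dir.iterdir():
                # Take the part before the first underscore as the year
                year_str = folder.name.split("_")[0]
                if folder.is_dir() and year_str.isdigit():
                    found.add(int(year_str))
        years[mode] = found
    return years
```

With something like this, a request for `test_year=2018` could be checked against both splits before any data is loaded, and a clear error raised if the year is in neither.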

Can we chat through how to improve this API? So far I have these ideas:

1. Have a function to check whether the file exists in the train data folder

2. Expose this to the user from the DataLoader, not the base neural_network class

3. Require making predictions of whole years (don't expose months to users)

Did I miss any?

For 3., would it be possible to expose both, but if test_month is None, then do the whole year?
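A minimal sketch of the "expose both" option discussed in this thread, under stated assumptions: the function name `select_timestep_folders` and its parameters are hypothetical, and timestep folders are assumed to be named `<year>_<month>`. Leaving `month` as `None` selects every month of the requested years, and `years` accepts any sequence, so a range of years works too.

```python
from pathlib import Path
from typing import List, Optional, Sequence, Tuple


def select_timestep_folders(split_dir: Path,
                            years: Optional[Sequence[int]] = None,
                            month: Optional[int] = None) -> List[Path]:
    """Return timestep folders matching the requested years and month.

    Hypothetical helper. Assumes folders are named '<year>_<month>'.
    years=None keeps every year; month=None keeps every month.
    """
    matches: List[Tuple[int, int, Path]] = []
    for folder in split_dir.iterdir():
        parts = folder.name.split("_")
        # Skip anything that does not look like '<year>_<month>'
        if not folder.is_dir() or len(parts) != 2:
            continue
        if not all(part.isdigit() for part in parts):
            continue
        folder_year, folder_month = int(parts[0]), int(parts[1])
        if years is not None and folder_year not in years:
            continue
        if month is not None and folder_month != month:
            continue
        matches.append((folder_year, folder_month, folder))
    # Sort chronologically, so '2018_2' comes before '2018_10'
    matches.sort(key=lambda item: (item[0], item[1]))
    return [folder for _, _, folder in matches]
```

For example, `select_timestep_folders(split_dir, years=[2018])` would cover all of 2018, while `select_timestep_folders(split_dir, years=range(2015, 2019), month=6)` would pick June of each year in the range.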

gabrieltseng · 2019-09-07T03:56:08Z

src/models/neural_networks/base.py

-    def predict(self) -> Tuple[Dict[str, Dict[str, np.ndarray]], Dict[str, np.ndarray]]:
-
-        test_arrays_loader = DataLoader(data_path=self.data_path, batch_file_size=self.batch_size,
+    def predict(self, test_year: Optional[int] = None,


Same comment as above regarding the confusing interface

Also, this functionality needs to be added for the regression and parsimonious model

gabrieltseng · 2019-09-07T04:01:47Z

src/models/base.py

+            else:
+                preds_xr.to_netcdf(self.model_dir / f'preds_{key}.nc')
+
+    def evaluate_train_timesteps(self, years: List[int],


See comments in the DataLoader class

tommylees112 added 12 commits August 28, 2019 10:56

add to_dataframe function for getting padnas dataframe

5c515d0

clean innit file

5744ca6

update notebooks 1

56bf326

change default plot arg to VCI

62645ae

update analysis.ipynb

11d5870

update the notebook

b5d1889

merge from master

9d906cf

update pytest

4cb95f8

add read_true_data function to evaluation

e8d95b6

implement the evaluate_train_timesteps function

96a2c93

fix flake errors

ad01bf9

notebook exploration

3f24f2e

tommylees112 requested a review from gabrieltseng September 4, 2019 11:19

tommylees112 added 3 commits September 6, 2019 16:21

update tests

a503f15

update mypy

b02b94a

fix pytest

078d4cc

tommylees112 added model validation modelling wip Work in progress - not ready for merging labels Sep 6, 2019

gabrieltseng requested changes Sep 7, 2019

2 participants