As a form of regression testing, we should track how simulation results change over time. As we update the model and change the connector code, we should be able to see how each change affects the output.
To start with, we can look at the Imperial model. See #40 for how we can generate the comparisons. This issue is about automating the process for one or more sets of runs and doing this comparison daily.
For each of these runs, choose a region, a set of interventions, and a calibration date (along with the death count). Perform the run in a GitHub Action. After the run completes, compare its output with the previous day's run and store the results as an artifact. If the results indicate a large enough difference, fail the workflow run.
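The "fail if the difference is large enough" step could look roughly like the sketch below. This is only an illustration, not the comparison from #40: it assumes each run's output has already been reduced to a list of numbers (for example, predicted daily deaths), and the function names and the 5% default threshold are hypothetical placeholders.

```python
def max_relative_difference(previous, current):
    """Return the largest relative change between two equal-length series.

    Assumption: both runs produce one number per day over the same dates.
    """
    if len(previous) != len(current):
        raise ValueError("series must have the same length")
    worst = 0.0
    for old, new in zip(previous, current):
        # Treat magnitudes below 1 as 1 so a zero baseline does not divide by zero.
        scale = max(abs(old), 1.0)
        worst = max(worst, abs(new - old) / scale)
    return worst


def exit_code(previous, current, threshold=0.05):
    """Return 1 if the runs differ by more than the threshold, else 0.

    The threshold value is a placeholder; the real cutoff still needs to be chosen.
    """
    return 1 if max_relative_difference(previous, current) > threshold else 0
```

A script in the workflow could load yesterday's and today's series and pass the result to `sys.exit(...)`; a non-zero exit code fails the step, and with it the workflow run.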