Skip to content

Fix indexing of FieldTimeSeries with OnDisk backend and specified times#5515

Open
matt-graham wants to merge 2 commits intoCliMA:mainfrom
matt-graham:fix-ondisk-fts-time-indexing
Open

Fix indexing of FieldTimeSeries with OnDisk backend and specified times#5515
matt-graham wants to merge 2 commits intoCliMA:mainfrom
matt-graham:fix-ondisk-fts-time-indexing

Conversation

@matt-graham
Copy link
Copy Markdown
Contributor

Resolves #5505

Updates indexing logic in getindex method for OnDiskFTS to check if time in file timeseries/t group at specified index n matches time at index n in times attribute and if not try to find matching index / iteration key corresponding to this time. If an exact match for the time cannot be found an error is raised. Also adds tests to check indexing of FieldTimeSeries with different backends and specified times works as expected.

Copy link
Copy Markdown
Member

@glwagner glwagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, thank you!

@glwagner
Copy link
Copy Markdown
Member

There's a conflict --- let me know if you need help resolving. Also, we can bump the patch version since this usefully resolve a bug

Comment thread src/OutputReaders/field_time_series_indexing.jl Outdated
@matt-graham
Copy link
Copy Markdown
Contributor Author

So it looks like the changes merged in #5492 will make the logic a bit more complicated here so will need to do a bit of a think to have these changes work with field time series potentially split across files.

I could do something in spirit to the current design and initially assume fts.times and times in file(s) are aligned compute relevant file and local index on this basis, check if the time recorded in file["timeseries/t"] at the corresponding iteration key matches and if not branch in to a separate chain of logic which replicates the current approach of searching for match in times (but accounting for now needing to search across multiple files).

Alternatively I could have SplitFilePath record edges of time intervals contained in each split file (rather than just cumulative_length=cumsum(iterations_per_part)), as this would allow identifying the relevant file to look for time in without linearly searching through all times in each file in sequence. We could also always try to compute (local) index by matching on time rather than initially assuming aligned times and then falling back to keep logic a bit simpler, but this would probably incur a performance hit for common case of aligned times to allow for what is probably a much rarer case of non-aligned times.

Another alternative would be to avoid the branching by dispatching to different getindex methods based on whether times is aligned with file times or not by somehow recording this in FieldTimeSeries struct in a way this can be dispatched on?

@glwagner
Copy link
Copy Markdown
Member

@simone-silvestri might need your input here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Loading FieldTimeSeries with OnDisk backend and specified times gives incorrect time indexing

3 participants