Replies: 1 comment
Hi @mj-gomes. I had never heard of Pooch, but it looks very interesting! I have a few questions:
Hi, I discovered Pooch and thought it could be an interesting way to simplify downloading and using the IMo JSON files by fetching them directly from the web, at least as a provisional alternative to the full capabilities of libinsdb and instrumentdb. I have only just started thinking about this, so I may be missing something that makes it totally unusable, but the discussion might be interesting.
Basically, I am thinking of something like this:
- We would have a `pooch_links.py` file where we keep up to date the direct download link for each IMo version (right now it would be the one where I am putting the JSON files in sdrive), together with its hash. Pooch lets us pass the expected hash when downloading a data file, so we know we are using the file we want: if the file changes at the source, the hash no longer matches and we get an error. This would, however, mean that every time a new IMo version becomes available, lbs would have to release a new patch. That is not ideal, but new IMo versions are in general less frequent than lbs patch releases, especially as the instrument configuration approaches its definitive version, so this would work like a dependency update (how bad is this?).
- We would have a class with some auxiliary methods to interact seamlessly with the data files through Pooch, resulting in high-level methods such as:
  - `get_IMo_version_list()`, which would list every available IMo version;
  - `retrieve_IMo_file(IMo_version)`, which would download the tarball into the cache;
  - `use_IMo_file(IMo_version)`, which would download the tarball or get it from the cache, decompress it (Pooch handles all common compression formats very straightforwardly), and return the IMo object ready to be used when creating the simulation.

With Pooch, the files are saved in a cache, so in principle there would be no problem with clusters that don't allow downloads inside jobs: we could download the files first on the login node and then use the cached copies inside the job.
This could also be used for other things (e.g. input skies if there is some case where it is more useful than having them stored in a cluster).
I believe this would save everyone some time, as we would no longer need to go to the IMo wiki page -> drive -> download drive -> select the correct path (and deal with relative path problems etc.). More importantly, it would ensure that we never get the wrong IMo version by mistake, and we could see exactly which version was used in a given script, just by looking at the argument to e.g. `retrieve_IMo_file(IMo_version)`. This would also be very easy to implement: I tried playing around with Pooch and it looks very simple.

What do you think about this? Is this something we could consider, or is it not worth it?