Conversation
# Stick rigorously with raw_records time range
- return self.chunk(start=start, end=end, data=data, data_type="geant4_interactions")
+ return self.chunk(start=start, end=end, data=data)#, data_type="geant4_interactions")
I think this is where things look suspicious. What do the output records chunks look like? Are the chunks extremely large? @jjakob03
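One way to answer the chunk-size question is to summarize per-chunk sizes from the data's metadata. The sketch below is a minimal illustration, not the project's tooling: it assumes the metadata is available as a plain dict with a `chunks` list whose entries carry an `nbytes` field, and the helper name `summarize_chunks` and the example values are hypothetical.

```python
def summarize_chunks(metadata):
    """Return (n_chunks, total_bytes, largest_bytes) for a metadata dict.

    Assumes (hypothetically) that metadata["chunks"] is a list of dicts,
    each with an "nbytes" entry giving the size of that chunk in bytes.
    """
    sizes = [chunk["nbytes"] for chunk in metadata["chunks"]]
    # default=0 keeps the helper usable when there are no chunks at all
    return len(sizes), sum(sizes), max(sizes, default=0)


# Made-up metadata for illustration only; not taken from a real run
example = {"chunks": [{"nbytes": 300_000_000}, {"nbytes": 1_200_000_000}]}

n_chunks, total, largest = summarize_chunks(example)
print(f"{n_chunks} chunks, {total / 1e9:.1f} GB total, largest {largest / 1e9:.1f} GB")
```

Running the same summary on both records and raw_records_simu would show whether any single chunk is unusually large compared with the rest.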
It would be nice if you could point me to where your records/raw_records_simu files are.
I changed this because fuse dropped it as well in this PR. Do you think we need it here?
Honestly, I am not sure whether it is necessary or whether removing it will break things. I am just a bit nervous since this is chunk-related, and chunk size matters hugely for memory consumption. I see that the memory you requested is already 55GB. I would say 70GB is the maximum we can take. Maybe give it a last try with 70GB?
Also, I would like to see the raw_records_simu and records files. Are they somewhere on midway?
No, sorry, I keep them in my scratch. The records chunks are ~1.2GB each, with 72 chunks in total, and the raw_records_simu chunks are ~300MB each, also with 72 chunks in total. Is that helpful?
Yes, I would already have gone for more memory, but broadwl won't give me more than 58GB, and the xenon1t partition can't see project and midway3/scratch.
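For scale, the quoted chunk sizes can be turned into rough totals. This is only back-of-the-envelope arithmetic that takes the approximate numbers above at face value; the variable names are made up for illustration, and the sizes may not match the in-memory footprint during processing.

```python
# Approximate figures quoted in the comment above
N_CHUNKS = 72
RECORDS_CHUNK_GB = 1.2            # ~1.2GB per records chunk
RAW_RECORDS_SIMU_CHUNK_GB = 0.3   # ~300MB per raw_records_simu chunk
BROADWL_LIMIT_GB = 58             # most memory broadwl would grant

# Total data volume per data type
records_total_gb = N_CHUNKS * RECORDS_CHUNK_GB
raw_records_simu_total_gb = N_CHUNKS * RAW_RECORDS_SIMU_CHUNK_GB

# How many records chunks would fit in the broadwl memory limit
chunks_per_limit = BROADWL_LIMIT_GB / RECORDS_CHUNK_GB

print(f"records total: ~{records_total_gb:.1f} GB")
print(f"raw_records_simu total: ~{raw_records_simu_total_gb:.1f} GB")
print(f"records chunks fitting in {BROADWL_LIMIT_GB} GB: ~{chunks_per_limit:.0f}")
```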
This is helpful. Maybe try lgrandi on midway3. Note that you will have to reinstall things... which is going to be a painful 5 minutes.
From the info you gave, I don't think this is where things break. Maybe it is just the super-heavy nature of AmBe itself plus some memory leakage. We never figured out why peak building is so expensive in memory.
By the way, it worked with 65GB on lgrandi. Also, I wasn't aware that the one run I tested with is a 20-minute AmBe run from the topCW5d9m position, while the topCW7d8m ones are only 10 minutes. So this is probably really just super-heavy AmBe 😐
Thanks, that makes a lot of sense to me! Yeah, this is how difficult things are for AmBe computing.
Draft PR for debugging purposes