Data Product Translation Properties #535
Replies: 2 comments 6 replies
Why is this to be expected? This seems to be an important assumption for what you wrote following this, but I do not understand why this should be expected.
From the point of view of integrating WCT, I see two main issues touching on "translators".
First is a Phlex problem. Second is mostly a DUNE problem, but if Phlex can provide things to help with it, that's certainly welcome. Here's what I mean:

An example for eager dropping is ADC waveforms, produced either from DAQ file input or from WCT sim. ADC waveforms are a particular source of memory pressure while also having very specific consumers, so they are particularly ripe for some "eager drop" feature. Once ADCs are consumed, having Phlex keep them alive is an unwanted memory burden. There are perhaps data model factoring issues here: ADC waveform data would have fewer consumers than the metadata that comes with the waveforms.

Combinatoric scaling will occur if we must pair every relevant type of producer with every relevant type of consumer for a given conceptual data type. In the ADC waveform example there are two producers: DAQ HDF5 files and WCT sim. Human nature being as it is, each will produce a unique data type (if we do not force otherwise). There is almost just one consumer, which is WCT sigproc. However, special-purpose consumers will exist, e.g. trigger studies algs.

Assuming DAQ HDF5 files produce a data product called The type of a

The picture explodes as we add more types of consumers (or producers). It is largely up to the mythical DUNE data model to simplify this complexity by defining a reduced set of "data products" in the way LArSoft did. At best, this is the netpbm/pandoc solution of decomposing M-to-N into M-to-1-to-N translators. Implementing these as well-factored, loosely coupled packages will mean many packages and carefully drawn dependency lines.
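To make the M-to-1-to-N decomposition concrete, here is a minimal sketch. It is only an illustration under stated assumptions: every type and function name (`DaqWaveforms`, `SimWaveforms`, `CanonicalWaveforms`, `translate`, and so on) is hypothetical and does not come from Phlex, WCT, or LArSoft.

```python
from dataclasses import dataclass

# All names below are hypothetical; none come from Phlex, WCT, or LArSoft.

@dataclass
class DaqWaveforms:            # producer-specific type (e.g. read from DAQ HDF5)
    samples: list[list[int]]   # one waveform per channel, indexed by position

@dataclass
class SimWaveforms:            # producer-specific type (e.g. made by a simulation)
    channels: dict[int, list[int]]

@dataclass
class CanonicalWaveforms:      # the single agreed "hub" type
    by_channel: dict[int, list[int]]

# M producer-side translators: producer type -> hub type.
TO_HUB = {
    DaqWaveforms: lambda p: CanonicalWaveforms(dict(enumerate(p.samples))),
    SimWaveforms: lambda p: CanonicalWaveforms(dict(p.channels)),
}

# N consumer-side translators: hub type -> what each consumer wants.
FROM_HUB = {
    "sigproc": lambda h: h.by_channel,
    "trigger_study": lambda h: {ch: max(w) for ch, w in h.by_channel.items()},
}

def translate(product, consumer: str):
    """Route any producer type to any consumer through the hub type."""
    hub = TO_HUB[type(product)](product)
    return FROM_HUB[consumer](hub)

def translator_counts(m: int, n: int) -> tuple[int, int]:
    """Translators needed for direct pairing (M*N) versus via a hub (M+N)."""
    return m * n, m + n
```

With 3 producer types and 4 consumer types, `translator_counts(3, 4)` gives 12 direct translators against 7 through the hub. The cost is that everyone has to agree on the hub type, which is the data model question raised above.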
Introduction
This discussion started on ZOOM with the following participants:
Brief
We will be continuing discussion and planning of the Data Product Translation subsystem. Specifically, Kyle has asked that we develop a list of properties, and provided some proposals as a seed:
As a way of firing up our thinking in this direction, please—if you have time:
Notes
2026-04-17
Discussion continuation 2026-04-24
Marc presented some slides for discussion, laying out some of his thoughts on data product concepts, translators, and graph construction.
Discussion continuation 2026-05-01
Continuing with point 3 from above:
Translators may need to propagate (a reference to) the original data product for object-ownership purposes
Translators should not be duplicated if the same translation is required by multiple computational nodes
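The two properties above can be sketched together. This is only an illustration, not a proposed Phlex interface: `Translated` and `TranslationCache` are hypothetical names, and keying the cache on object identity is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical names throughout; this is not the Phlex interface.

@dataclass
class Translated:
    value: Any    # the translated representation (possibly a view)
    origin: Any   # reference to the original product, kept for ownership

class TranslationCache:
    """Runs each (product, target) translation at most once."""

    def __init__(self) -> None:
        self._results: dict[tuple[int, str], Translated] = {}
        self.calls = 0  # number of translator invocations, for illustration

    def get(self, product: Any, target: str,
            translator: Callable[[Any], Any]) -> Translated:
        # id() is a safe key here because each cached Translated holds a
        # reference to its origin, so the id cannot be reused while cached.
        key = (id(product), target)
        if key not in self._results:
            self.calls += 1
            self._results[key] = Translated(translator(product), product)
        return self._results[key]
```

Two computational nodes asking for the same translation of the same product receive the same `Translated` object, and the original stays alive for as long as that object does. How this interacts with eager dropping of the original is left open here.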
More translation subsystem properties:
Discovery and selection of translation plugins. Use case: a developer relying mostly on production-built translators, but possibly overriding/adding a small number of translators for their own needs.
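One way to picture the override use case is a two-layer lookup, sketched below. The `TranslatorRegistry` class, its string type keys, and the rule that a user-registered translator always wins are all assumptions made for illustration; actual plugin discovery (for example, scanning libraries) is not shown.

```python
from typing import Any, Callable

# Hypothetical sketch; the registry and its layering rule are assumptions.

Translator = Callable[[Any], Any]

class TranslatorRegistry:
    """Looks up translators by (source, target), preferring user overrides."""

    def __init__(self) -> None:
        self._production: dict[tuple[str, str], Translator] = {}
        self._user: dict[tuple[str, str], Translator] = {}

    def register(self, source: str, target: str, fn: Translator,
                 *, user: bool = False) -> None:
        # Production-built and developer-supplied translators live in
        # separate layers so that an override never destroys the default.
        layer = self._user if user else self._production
        layer[(source, target)] = fn

    def select(self, source: str, target: str) -> Translator:
        key = (source, target)
        if key in self._user:
            return self._user[key]
        return self._production[key]  # raises KeyError if none is known
```

A developer would register only the few translators they want to change with `user=True`; every other lookup falls through to the production set.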
Further properties to be suggested/discussed in the comments below by end of business Friday 2026-05-08, and collected/summarized by @greenc-FNAL for discussion early the following week.