read, join, and write MS #272

@caseyjlaw

  • dask-ms version: 0.2.6
  • Python version: 3.8
  • Operating System: RHEL 8.6

Description

I'd like to read multiple MS files and write a single MS that joins them along spectral window. Each MS file holds a single integration for a different spectral window, and the data have an identical structure in every MS.

What I Did

I tried to follow the suggestion in the docs for reading the SPECTRAL_WINDOW subtable, but the call fails, so I may have an issue with the MS.

> ds = xds_from_ms('20221129_034113_15MHz.ms:SPECTRAL_WINDOW', group_cols="__row__")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-75-b6c8fc404ff8> in <module>
----> 1 ds = xds_from_ms('20221129_034113_15MHz.ms:SPECTRAL_WINDOW', group_cols="__row__")

~/.conda/envs/development/lib/python3.6/site-packages/daskms/dask_ms.py in xds_from_ms(ms, columns, index_cols, group_cols, **kwargs)
    397                           index_cols=index_cols,
    398                           group_cols=group_cols,
--> 399                           **kwargs)
    400 
    401 

~/.conda/envs/development/lib/python3.6/site-packages/daskms/dask_ms.py in xds_from_table(table_name, columns, index_cols, group_cols, **kwargs)
    324     dask_datasets = DatasetFactory(table_name, columns,
    325                                    group_cols, index_cols,
--> 326                                    **kwargs).datasets()
    327 
    328     # Return dask datasets if xarray is not available

~/.conda/envs/development/lib/python3.6/site-packages/daskms/reads.py in __init__(self, table, select_cols, group_cols, index_cols, **kwargs)
    279     def __init__(self, table, select_cols, group_cols, index_cols, **kwargs):
    280         if not table_exists(table):
--> 281             raise ValueError("'%s' does not appear to be a CASA Table" % table)
    282 
    283         chunks = kwargs.pop('chunks', [{'row': _DEFAULT_ROW_CHUNKS}])

ValueError: '20221129_034113_15MHz.ms:SPECTRAL_WINDOW' does not appear to be a CASA Table

However, the main table of the same MS can be read:

> ds = xds_from_ms('20221129_034113_15MHz.ms', group_cols=["FIELD_ID", "DATA_DESC_ID"])
> ds
[<xarray.Dataset>
 Dimensions:         (chan: 192, corr: 4, flagcat: 1, row: 62128, uvw: 3)
 Coordinates:
     ROWID           (row) int32 dask.array<chunksize=(10000,), meta=np.ndarray>
 Dimensions without coordinates: chan, corr, flagcat, row, uvw
 Data variables:
     ANTENNA1        (row) int32 dask.array<chunksize=(10000,), meta=np.ndarray>
     ANTENNA2        (row) int32 dask.array<chunksize=(10000,), meta=np.ndarray>
     ARRAY_ID        (row) int32 dask.array<chunksize=(10000,), meta=np.ndarray>
     DATA            (row, chan, corr) complex64 dask.array<chunksize=(10000, 192, 4), meta=np.ndarray>
     EXPOSURE        (row) float64 dask.array<chunksize=(10000,), meta=np.ndarray>
     FEED1           (row) int32 dask.array<chunksize=(10000,), meta=np.ndarray>
     FEED2           (row) int32 dask.array<chunksize=(10000,), meta=np.ndarray>
     FLAG            (row, chan, corr) bool dask.array<chunksize=(10000, 192, 4), meta=np.ndarray>
     FLAG_CATEGORY   (row, flagcat, chan, corr) bool dask.array<chunksize=(10000, 1, 192, 4), meta=np.ndarray>
     FLAG_ROW        (row) bool dask.array<chunksize=(10000,), meta=np.ndarray>
     INTERVAL        (row) float64 dask.array<chunksize=(10000,), meta=np.ndarray>
     OBSERVATION_ID  (row) int32 dask.array<chunksize=(10000,), meta=np.ndarray>
     PROCESSOR_ID    (row) int32 dask.array<chunksize=(10000,), meta=np.ndarray>
     SCAN_NUMBER     (row) int32 dask.array<chunksize=(10000,), meta=np.ndarray>
     SIGMA           (row, corr) float32 dask.array<chunksize=(10000, 4), meta=np.ndarray>
     STATE_ID        (row) int32 dask.array<chunksize=(10000,), meta=np.ndarray>
     TIME            (row) float64 dask.array<chunksize=(10000,), meta=np.ndarray>
     TIME_CENTROID   (row) float64 dask.array<chunksize=(10000,), meta=np.ndarray>
     UVW             (row, uvw) float64 dask.array<chunksize=(10000, 3), meta=np.ndarray>
     WEIGHT          (row, corr) float32 dask.array<chunksize=(10000, 4), meta=np.ndarray>
 Attributes:
     FIELD_ID:      0
     DATA_DESC_ID:  0]
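
One possible explanation for the ValueError above, offered as an assumption rather than a confirmed diagnosis: the subtable was addressed with a single colon (`.ms:SPECTRAL_WINDOW`), whereas the dask-ms docs address subtables with a double colon (`.ms::SPECTRAL_WINDOW`) and read them with `xds_from_table` rather than `xds_from_ms`. The sketch below shows that form. `subtable_path` and `read_spectral_windows` are hypothetical helper names introduced only for illustration; they are not part of dask-ms.

```python
def subtable_path(ms_path: str, subtable: str) -> str:
    """Build a subtable path such as 'obs.ms::SPECTRAL_WINDOW'.

    Hypothetical helper (not part of dask-ms): it only joins the MS path
    and the subtable name with a double colon.
    """
    return f"{ms_path.rstrip('/')}::{subtable}"


def read_spectral_windows(ms_path: str):
    """Return the SPECTRAL_WINDOW subtable as datasets, one per row.

    Assumes dask-ms and python-casacore are installed and that the
    subtable exists inside the MS.
    """
    # Imported lazily so subtable_path() can be used without dask-ms.
    from daskms import xds_from_table

    # group_cols="__row__" yields one dataset per subtable row, which
    # accommodates rows whose array columns have different shapes.
    return xds_from_table(
        subtable_path(ms_path, "SPECTRAL_WINDOW"),
        group_cols="__row__",
    )


if __name__ == "__main__":
    # Only builds and prints the path; no MS is opened here.
    print(subtable_path("20221129_034113_15MHz.ms", "SPECTRAL_WINDOW"))
```

If the double-colon path still raises the same error, the SPECTRAL_WINDOW subtable itself may be missing or unreadable, which would point back to a problem with the MS.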
