Recombining means converting legacy VCS data from one format to another. Both formats are binary, with each byte representing a (4+4)-bit complex signed sample. The main differences between the formats are the ordering of the bytes and how they are distributed across multiple files. For the purposes of this document, the "from" format will be called the PFB or unrecombined format, and the "to" format will be called the VCS or recombined format. The format conversion is necessary because the beamformer currently supports only the VCS format as input.
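
As a rough illustration of the sample encoding described above, the following Python sketch unpacks one byte into a complex sample. The nibble order (real part in the high four bits) and the two's-complement interpretation are assumptions made for this example and are not confirmed by this page, so check them against the recombine source before relying on them.

```python
from typing import List


def unpack_sample(byte: int) -> complex:
    """Decode one byte into a (4+4)-bit complex signed sample.

    ASSUMPTION: the high nibble is the real part, the low nibble is the
    imaginary part, and both are 4-bit two's-complement integers.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")

    def to_signed4(nibble: int) -> int:
        # 0..7 are kept as-is; 8..15 map to -8..-1 (two's complement).
        return nibble - 16 if nibble >= 8 else nibble

    real = to_signed4(byte >> 4)    # assumed: high nibble = real part
    imag = to_signed4(byte & 0x0F)  # assumed: low nibble = imaginary part
    return complex(real, imag)


def unpack_samples(raw: bytes) -> List[complex]:
    """Decode a run of bytes, one complex sample per byte."""
    return [unpack_sample(b) for b in raw]
```

Whatever the true nibble layout turns out to be, the point of the sketch is that each sample occupies exactly one byte, so recombining only reorders and redistributes bytes and never needs to change their values.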

Software

The primary software for recombining is found in the mwa-voltage repository. This has recently been forked to its own dedicated recombine repo in order to promote further development and maintenance, but as of this writing (2022-06-01) the fork is identical in functionality and usage to mwa-voltage. In either case, the name of the executable is recombine. On Garrawarla, it is provided by the mwa-voltage and recombine modules. Future developments will be made available through the recombine module, but the mwa-voltage module will always remain available for compatibility with historical pipelines.

Usage

(See mwa-voltage and recombine for the most up-to-date usage documentation)

recombine -o <obsid> -t <secondid> -m <meta-data fits> -i <output dir> 
          -c <skip coarse chan> -s <skip ICS> -f <file list> or -g <input file list>

<obsid>: observation ID of the data being processed
<secondid>: the second being processed
<meta-data fits>: meta-data FITS file containing tile flag information and various other useful 
information regarding the observation. To obtain the meta-data FITS file for a particular observation, 
use the following: 
     wget -O <obsid>.metafits http://ws.mwatelescope.org/metadata/fits?obs_id=<obsid>
<output dir>: output product directory
<skip coarse chan>: 1 will skip the generation of the recombined coarse channel data
<skip ICS>: 1 will skip the generation of the incoherent sum
<file list>: locations of the 32 raw uncombined input files for a single second's worth of data (separated by spaces)
<input file list>: a file containing the locations of the 32 raw uncombined input files (separated by newlines)
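
The usage above can be scripted. The following Python sketch assembles a recombine command line for one second of data using the -g form (a file listing the 32 input paths). The helper names are hypothetical and exist only for this example. It also assumes that omitting -c and -s leaves the corresponding outputs enabled, which the usage text does not state explicitly.

```python
from pathlib import Path
from typing import List, Sequence

FILES_PER_SECOND = 32  # the usage text expects 32 raw input files per second


def metafits_url(obsid: int) -> str:
    """URL of the meta-data FITS file, as given in the usage text."""
    return f"http://ws.mwatelescope.org/metadata/fits?obs_id={obsid}"


def write_file_list(paths: Sequence[str], list_path: Path) -> Path:
    """Write the newline-separated input file list expected by -g."""
    if len(paths) != FILES_PER_SECOND:
        raise ValueError(
            f"expected {FILES_PER_SECOND} input files, got {len(paths)}"
        )
    Path(list_path).write_text("\n".join(paths) + "\n")
    return Path(list_path)


def recombine_command(
    obsid: int,
    second: int,
    metafits: str,
    output_dir: str,
    file_list: str,
    skip_coarse: bool = False,
    skip_ics: bool = False,
) -> List[str]:
    """Build the argument list for a single-second recombine call."""
    cmd = [
        "recombine",
        "-o", str(obsid),
        "-t", str(second),
        "-m", str(metafits),
        "-i", str(output_dir),  # -i is the output directory in the usage text
    ]
    # ASSUMPTION: leaving a skip flag off means "do not skip".
    if skip_coarse:
        cmd += ["-c", "1"]
    if skip_ics:
        cmd += ["-s", "1"]
    cmd += ["-g", str(file_list)]
    return cmd
```

The resulting list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting problems with paths. The upstream documentation for mwa-voltage and recombine remains the authority on the accepted options.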

Processing on Garrawarla

Examples of using recombine to process both single-second and multiple-second jobs on Garrawarla are now provided as part of the documentation included in the recombine repository.

Other (wrapper) scripts

vcs_download.nf

vcs_download.nf is a Nextflow script provided by the mwa_search repo. Its use is described on the main Documentation page. As a quick reference, however, the following template can be followed on Garrawarla:

module load vcstools/devel
module load mwa_search/devel
vcs_download.nf --obsid <obsid> --begin <begin gps> --end <end gps> --download_dir /astro/mwavcs/asvo/<ASVO ID> --vcstools_version devel -resume

Note that vcs_download.nf will remove the PFB files once they have been successfully recombined.

process_vcs.py (deprecated)

process_vcs.py and checks.py are provided as part of VCSTools (the vcstools module on Garrawarla). Among many other things, process_vcs.py is a wrapper for running recombine on the GPU cluster ("gpuq") on Galaxy.

To recombine all of the data, use

process_vcs.py -m recombine -o <obs ID> -a

or, for only a subset of data, use

process_vcs.py -m recombine -o <obs ID> -b <starting GPS second> -e <end GPS second>

If you want to see the progress, then use:

squeue -p gpuq -u $USER

Generally, this processing should not take too long, typically a few hours.

Checking the recombined data

It is a good idea to check at this stage to make sure that all of the data were recombined properly. To do this, use:

checks.py -m recombine -o <obs ID>

This will check that all of the recombined files are present and of the correct size. If any raw files are missing, the recombining process will make zero-padded files and leave gaps in your data. If you would like to do a more robust check, beamform and splice the data (using the following steps) and then run:
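
The presence-and-size check described above can be sketched in a few lines of Python. This is not the implementation used by checks.py. The function name is hypothetical, and the expected file names and file size are left as parameters because this page does not specify them.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def check_files(
    directory: str,
    expected_names: Iterable[str],
    expected_size: int,
) -> Dict[str, List[str]]:
    """Report which expected files are missing or have the wrong size.

    expected_names and expected_size (in bytes) must be supplied by the
    caller; they are placeholders, not values taken from recombine.
    """
    missing: List[str] = []
    wrong_size: List[str] = []
    for name in expected_names:
        path = Path(directory) / name
        if not path.is_file():
            missing.append(name)
        elif path.stat().st_size != expected_size:
            wrong_size.append(name)
    return {"missing": missing, "wrong_size": wrong_size}
```

Note that a check of this kind cannot detect zero-padded files, since they have the correct size, which is why the more robust check is worth running.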

prepdata -o recombine_test -nobary -dm 0 <fits files>

Then you can look through the produced .dat file for gaps using:

exploredat <.dat file>
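
For long observations, a scripted scan can complement the visual inspection. The sketch below reads a .dat time series as headerless 32-bit floats (the format written by prepdata) and reports long runs of identical consecutive values. It assumes that gaps from zero-padded input show up as such constant runs in the DM 0 time series and that the file is in native byte order. The function names and the min_length threshold are hypothetical and should be tuned to the data.

```python
from array import array
from typing import List, Sequence, Tuple


def read_dat(path: str) -> array:
    """Read a .dat time series as 32-bit floats (native byte order assumed)."""
    samples = array("f")
    with open(path, "rb") as f:
        samples.frombytes(f.read())
    return samples


def find_constant_runs(
    samples: Sequence[float], min_length: int = 1000
) -> List[Tuple[int, int]]:
    """Return (start_index, length) for each run of identical consecutive
    values that is at least min_length samples long."""
    runs: List[Tuple[int, int]] = []
    n = len(samples)
    start = 0
    for i in range(1, n + 1):
        # A run ends at the end of the data or when the value changes.
        if i == n or samples[i] != samples[start]:
            if i - start >= min_length:
                runs.append((start, i - start))
            start = i
    return runs
```

A typical call would be find_constant_runs(read_dat("recombine_test.dat")), where the file name follows from the -o option given to prepdata above. Dividing the reported indices by the sampling rate gives the approximate time of each suspected gap.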

Once you are satisfied that the data have been recombined correctly, you should delete the raw voltages, as they are no longer used in the pipeline and are a massive drain on storage resources.

Planned future developments for recombine

  • Add GPU support
  • Improve the CLI

Description of PFB format

TO DO...
