API

State API

class rfpipe.state.State(config=None, sdmfile=None, sdmscan=None, bdfdir=None, inprefs=None, inmeta=None, preffile=None, name=None, showsummary=True, lock=None, validate=True)[source]

Defines initial search from preferences and methods. State properties are used to calculate quantities for the search.

Approach: 1) initial preferences can overload state properties 2) read metadata from observation for given scan 3) set state (and a few cached values) 4) state can optionally include attributes for later convenience 5) run search on a segment.

ants
beamsize_deg

Takes gridding spec to estimate beam size in degrees. TODO: check for accuracy.

blarr
blarr_names
static calclm(npixx, npixy, uvres, peakx, peaky)[source]
candsfile

File name to write candidates into

chans

List of channel indices to use. Drawn from preferences, with backup to take all defined in metadata.

chunksize
clearcache()[source]
datashape
datashape_orig
datasize
datasize_orig
defined
dmarr
dmshifts

Calculate max DM delay in units of integrations for each dm trial. Gets cached.
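To give a feel for what a per-trial delay in integrations looks like, the sketch below applies the standard cold-plasma dispersion relation (roughly 4.1488 ms per pc/cm3 at 1 GHz, scaling as frequency to the power -2). The helper name, the rounding choice, and the example values are assumptions made for illustration; rfpipe's own calculation may differ.

```python
# Standard dispersion constant: delay in seconds for DM = 1 pc/cm3 at 1 GHz.
K_DM = 4.1488e-3


def max_dm_shift(dm, freq_ghz, inttime):
    """Hypothetical helper: largest delay across the band, in integrations."""
    f_lo, f_hi = min(freq_ghz), max(freq_ghz)
    delay_s = K_DM * dm * (f_lo ** -2 - f_hi ** -2)
    # Rounding to the nearest integration is an assumption of this sketch.
    return int(round(delay_s / inttime))


freq = [1.0, 1.5, 2.0]   # channel frequencies in GHz (made up)
dmarr = [0, 100, 500]    # DM trials in pc/cm3 (made up)
shifts = [max_dm_shift(dm, freq, inttime=0.005) for dm in dmarr]
```

Here the delay grows linearly with DM, so the largest DM trial sets how many integrations are lost at the end of a segment.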

dtarr
features

Given searchtype, return features to be extracted in initial analysis

fftmode

Should the FFT be done with fftw or cuda? Could overload here based on local configuration

fieldsize_deg

Takes gridding spec to estimate field of view in degrees

fileroot
freq

Frequencies for each channel in increasing order. Metadata may be out of order, but state/data reading order is sorted.

fringetime

Same as fringetime_orig, but rounded to integer multiple of integration time.

fringetime_orig

Estimate largest time span of a “segment”. A segment is the maximal time span that can have a single bg fringe subtracted and a single uv grid definition. Max fringe window is estimated for 5% amplitude loss at first null, averaged over all baselines. Assumes dec=+90, which is conservative. Can also be optimized for memory/compute limits. Returns time in seconds that defines a good window.

gainfile

Calibration file (telcal) from preferences or found from “.GN” suffix

get_search_ints(segment, dmind, dtind)[source]

Helper function to get list of integrations to be searched after correcting for DM and resampling.

get_segmenttime_string(segment)[source]
immem

Memory required to create all images in a chunk of read integrations

immem_limit

Memory required to create all images in a chunk of read integrations. Limit defined for all threads.

inttime

Integration time

memory_total

Total memory (in GB) required to read and process. Depends on data source and search algorithm.

memory_total_limit

Minimum memory (in GB) required to read and process. Depends on data source and search algorithm.

mockfile

File name to write mocks into

nants
nbl
nchan
nfalse(sigma)[source]

Number of thermal-noise false positives per scan at given sigma
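One common way to estimate such a count is to multiply the number of independent trials by the one-sided Gaussian tail probability at the given sigma. The sketch below does that with the standard library; treating the trials as independent and Gaussian, and passing ntrials as an argument, are assumptions of this example rather than a description of rfpipe's code.

```python
import math


def nfalse(sigma, ntrials):
    """Expected one-sided Gaussian false positives in ntrials independent trials."""
    tail = 0.5 * math.erfc(sigma / math.sqrt(2))
    return ntrials * tail


# Made-up trial count, chosen so that a 7-sigma threshold gives about one event.
expected = nfalse(7.0, ntrials=1e12)
```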

nints
noisefile

File name to write noises into

npixx

Number of x pixels in uv/image grid. First defined by input preference set with default to npixx_full

npixx_full

Finds optimal uv/image pixel extent in powers of 2 and 3

npixy

Number of y pixels in uv/image grid. First defined by input preference set with default to npixy_full

npixy_full

Finds optimal uv/image pixel extent in powers of 2 and 3
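Sizes whose only prime factors are 2 and 3 are attractive because FFTs on them are fast. A minimal sketch of picking such a size is shown below; the helper name is hypothetical, and choosing the smallest size at or above the requested extent (rather than, say, the nearest one) is an assumption.

```python
def next_2_3_size(n):
    """Smallest integer >= n of the form 2**a * 3**b (hypothetical helper)."""
    best = None
    p2 = 1
    while p2 < 2 * n:
        # For this power of two, multiply by 3 until the size reaches n.
        val = p2
        while val < n:
            val *= 3
        if best is None or val < best:
            best = val
        p2 *= 2
    return best


npix = next_2_3_size(1025)
```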

npol

Number of polarization products selected. Cached.

nsegment
nspw
ntrials

Number of search trials per scan

pixtolm(pix)[source]

Helper function to calculate (l,m) coords of given pixel. Example: st.pixtolm(np.where(im == im.max()))
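The pixel-to-(l,m) conversion can be sketched as follows, assuming an FFT image whose phase center sits at pixel 0 and whose far half wraps around to negative offsets. The function name, the sign convention, and the mapping of x to l and y to m are assumptions of this sketch, not taken from rfpipe.

```python
def pix_to_lm(npixx, npixy, uvres, peakx, peaky):
    """Hypothetical sketch: (l, m) offsets in radians for an image pixel."""
    # Pixels past the grid midpoint are treated as negative (wrapped) offsets.
    dx = peakx - npixx if peakx > npixx // 2 else peakx
    dy = peaky - npixy if peaky > npixy // 2 else peaky
    # A uv grid with cell size uvres (in wavelengths) images a field of
    # 1/uvres radians, so each pixel spans 1/(npix * uvres) radians.
    return dx / (npixx * uvres), dy / (npixy * uvres)


l1, m1 = pix_to_lm(1024, 1024, 100, peakx=1023, peaky=1)
```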

pols

Polarizations to use based on preference in prefs.selectpol

readints

Number of integrations read per segment. Defines shape of numpy array for visibilities.

search_dimensions

Define dimensions searched for a given piece of data. Actual algorithm defined in pipeline iteration.

searchints

Number of integrations searched per scan

segmenttimes

Array of float pairs containing MJD times defining segment start and stop. Calculated from prefs.nsegment first. Alternately, best times found based on fringe time and memory limit
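For the case where a fixed number of segments is requested, the start/stop pairs could be laid out as in the sketch below. The equal-length split and the way each segment is extended by an overlap time are assumptions made for illustration; the helper and its arguments are hypothetical.

```python
SEC_PER_DAY = 86400.0


def segment_times(start_mjd, end_mjd, nsegment, t_overlap=0.0):
    """Hypothetical sketch: nsegment (start, stop) MJD pairs covering a scan."""
    overlap_d = t_overlap / SEC_PER_DAY
    # Each segment advances by `step` days and reads `overlap_d` days extra,
    # so that the last segment ends exactly at end_mjd.
    step = (end_mjd - start_mjd - overlap_d) / nsegment
    return [(start_mjd + i * step, start_mjd + (i + 1) * step + overlap_d)
            for i in range(nsegment)]


times = segment_times(58000.0, 58000.01, nsegment=4, t_overlap=1.0)
```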

sigma_image1

Use either sigma_image1 or nfalse

spw

List of spectral windows used. TODO: update for proper naming “baseband”+“swindex”.

summarize()[source]

Print summary of pipeline state

t_overlap

Max DM delay in seconds that is fixed to int mult of integration time. Gets cached.

t_segment

Time read per segment in seconds. Assumes first segment is same size as all others.

thresholdlevel(nfalse)[source]

Sigma threshold for a given number of false positives per scan
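The inverse problem, finding the sigma that yields a target number of false positives, can be sketched by bisection on the Gaussian tail probability. As before, independent Gaussian trials and an explicit ntrials argument are assumptions of this example, and the search bounds are arbitrary.

```python
import math


def gaussian_tail(sigma):
    """One-sided Gaussian tail probability above sigma."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))


def thresholdlevel(nfalse, ntrials, lo=0.0, hi=40.0):
    """Hypothetical inverse: sigma at which ntrials * tail(sigma) == nfalse."""
    target = nfalse / ntrials
    for _ in range(200):          # bisection; the tail decreases with sigma
        mid = 0.5 * (lo + hi)
        if gaussian_tail(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


sigma = thresholdlevel(nfalse=1.0, ntrials=1e12)
```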

uvres
uvres_full
validate()[source]

Test validity of state (metadata + preferences) with a few assertions

version
vismem

Memory required to store read data (in GB)

vismem_limit

Memory required to store read data (in GB). Limit defined for time range equal to the overlap time between segments.

Preferences API

class rfpipe.preferences.Preferences(rfpipe_version='0.9.6', chans=None, spw=None, excludeants=(), selectpol=u'auto', fileroot=None, read_tdownsample=1, read_fdownsample=1, l0=0.0, m0=0.0, timesub=None, flaglist=[(u'badchtslide', 4.0, 10), (u'blstd', 3.0, 0.05)], flagantsol=True, badspwpol=2.0, applyonlineflags=True, gainfile=None, simulated_transient=None, nthread=1, segmenttimes=None, memory_limit=16, maximmem=16, fftmode=u'fftw', dmarr=None, dtarr=None, dm_maxloss=0.05, mindm=0, maxdm=0, dm_pulsewidth=3000, searchtype=u'image1', sigma_image1=7, sigma_image2=None, nfalse=None, uvres=0, npixx=0, npixy=0, npix_max=0, uvoversample=1.0, savenoise=False, savecands=False, candsfile=None, workdir='/Users/caseyjlaw/code/rfpipe', timewindow=30, loglevel=u'INFO')[source]

Preferences express half of info needed to define state. Using preferences with metadata produces a unique state and pipeline outcome.

json

json string that can be loaded into elasticsearch or hashed.

name

Unique name for an instance of preferences. To be used to look up preference set for a given candidate or data set.

ordered

Get OrderedDict of preferences sorted by key
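The three properties above fit together naturally: sorting by key gives a stable serialization, and hashing that serialization gives a name that depends only on the preference values. A minimal sketch of the idea follows. The helper names, the choice of SHA-256, and the three-key preference set are assumptions; the source does not say how the name is actually computed.

```python
import hashlib
import json
from collections import OrderedDict


def ordered_prefs(prefs):
    """Preferences sorted by key."""
    return OrderedDict(sorted(prefs.items()))


def prefs_json(prefs):
    """Stable JSON string, independent of insertion order."""
    return json.dumps(ordered_prefs(prefs))


def prefs_name(prefs):
    """Hypothetical naming scheme: hex digest of the stable JSON string."""
    return hashlib.sha256(prefs_json(prefs).encode("utf-8")).hexdigest()


a = {"maxdm": 100, "sigma_image1": 7, "dtarr": [1, 2, 4]}
b = {"dtarr": [1, 2, 4], "sigma_image1": 7, "maxdm": 100}
same = prefs_name(a) == prefs_name(b)
```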

Metadata API

class rfpipe.metadata.Metadata(datasource=None, datasetId=None, filename=None, scan=None, subscan=None, bdfdir=None, bdfstr=None, source=None, radec=None, inttime=None, nints_=None, telescope=None, starttime_mjd=None, endtime_mjd_=None, dishdiameter=None, intent=None, antids=None, xyz=None, spworder=None, spw_orig=None, spw_nchan=None, spw_reffreq=None, spw_chansize=None, pols_orig=None)[source]

Metadata we need to translate parameters into a pipeline state. Called from a function defined for a given source (e.g., an sdm file). Built from nominally immutable attributes and properties. To modify metadata, use attr.assoc(inst, key=newval)
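The copy-and-modify pattern described here returns a new instance and leaves the original untouched. The sketch below shows the same idea with the standard library's frozen dataclasses and dataclasses.replace as a stand-in; it does not use the attrs API, and MiniMetadata is a made-up two-field miniature rather than the real class.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class MiniMetadata:
    """Hypothetical stand-in with two fields borrowed from the signature above."""
    scan: int = 1
    inttime: float = 0.005


meta = MiniMetadata()
updated = replace(meta, inttime=0.01)   # new object; meta is unchanged

try:
    meta.inttime = 0.01                 # direct assignment is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```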

antpos
atdefaults()[source]

Is metadata still set at default values?

endtime_mjd

If nints_ is defined (e.g., for SDM data), then endtime_mjd is calculated. Otherwise (e.g., for scan_config/vys data), it looks for endtime_mjd_ attribute

freq_orig

Spacing of channel centers in GHz. Metadata may be out of order; the order is sorted in state/data reading.

nants_orig
nbl_orig
nchan_orig
nints

If endtime_mjd_ is defined (e.g., for scan_config/vys data), then nints is calculated. Otherwise (e.g., for SDM data), it looks for nints_ attribute
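The two properties are mirror images: given a start time and an integration time, either the end time or the number of integrations determines the other. A small sketch of that relationship, with hypothetical helper names and an assumed round-to-nearest convention:

```python
SEC_PER_DAY = 86400.0


def endtime_from_nints(starttime_mjd, nints, inttime):
    """End time in MJD for nints integrations of inttime seconds each."""
    return starttime_mjd + nints * inttime / SEC_PER_DAY


def nints_from_endtime(starttime_mjd, endtime_mjd, inttime):
    """Number of integrations between two MJD times (rounding is assumed)."""
    return int(round((endtime_mjd - starttime_mjd) * SEC_PER_DAY / inttime))


end = endtime_from_nints(58000.0, nints=2000, inttime=0.005)
n = nints_from_endtime(58000.0, end, inttime=0.005)
```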

npol_orig
nspw_orig
scanId
starttime_string
uvrange_orig

Source API

rfpipe.source.data_prep(st, segment, data, flagversion=u'latest')[source]

Applies calibration, flags, and subtracts time mean for data. flagversion can be “latest” or “rtpipe”. Optionally prepares data with antenna flags, fixing out-of-order data, calibration, downsampling, etc.

rfpipe.source.estimate_noiseperbl(data)[source]

Takes large data array and sigma clips it to find noise per bl for input to detect_bispectra. Takes mean across pols and channels for now, as in detect_bispectra.

rfpipe.source.flag_badchtslide(data, sigma, win)[source]

Use data (4d) to calculate (int, chan, pol) to be flagged

rfpipe.source.flag_blstd(data, sigma, convergence)[source]

Use data (4d) to calculate (int, chan, pol) to be flagged.

rfpipe.source.flag_data(st, data)[source]

Identifies bad data and flags it to 0.

rfpipe.source.flag_data_rtpipe(st, data)[source]

Flagging data in single process. Deprecated.

rfpipe.source.generate_transient(st, amp, i0, dm, dt)[source]

Create a dynamic spectrum for given parameters. amp is in system units (post calibration). i0 is a float for integration relative to start of segment. dm/dt are in units of pc/cm3 and seconds, respectively.

rfpipe.source.getsdm(*args, **kwargs)[source]

Wrap sdmpy.SDM to get around schema change error

rfpipe.source.prep_standard(st, segment, data)[source]

Common first data prep stages, incl online flags, resampling, and mock transients.

rfpipe.source.read_bdf(st, nskip=0)[source]

Uses sdmpy to read a given range of integrations from sdm of given scan. readints=0 will read all of bdf (skipping nskip). Returns data in increasing frequency order.

rfpipe.source.read_bdf_segment(st, segment)[source]

Uses sdmpy to read bdf (sdm) format data into a numpy array for the given segment. Each segment has st.readints integrations.

rfpipe.source.read_segment(st, segment, cfile=None, timeout=10)[source]

Read a segment of data. cfile and timeout are specific to vys data. Returns data as defined in metadata (no downselection yet)

rfpipe.source.read_vys_segment(st, seg, cfile=None, timeout=10, offset=4)[source]

Read segment seg defined by state st from vys stream. Uses vysmaw application timefilter to receive multicast messages and pull spectra on the CBE. timeout is a multiple of read time in seconds to wait. offset is extra time in seconds to keep vys reader open.

rfpipe.source.save_noise(st, segment, data, chunk=200)[source]

Calculates noise properties and saves values to pickle. chunk defines window for measurement. At least one measurement is always made.

rfpipe.source.sdm_sources(sdmname)[source]

Use sdmpy to get all sources and ra,dec per scan as dict

rfpipe.source.simulate_segment(st, loc=0.0, scale=1.0)[source]

Simulates visibilities for a segment.

rfpipe.source.slidedev[source]

Given a (len x 2) array, calculate the deviation from the median per pol. Calculates median over a window, win.
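A sliding-median deviation can be sketched for a single polarization as below; applying it to each of the two columns gives the per-pol result. The centered window and the way it is truncated at the edges are assumptions of this sketch, and the helper name is hypothetical.

```python
from statistics import median


def slide_dev(values, win):
    """Deviation of each sample from the median of a window centered on it."""
    half = win // 2
    devs = []
    for i, v in enumerate(values):
        # The window is truncated at the array edges (an assumption).
        window = values[max(0, i - half): i + half + 1]
        devs.append(v - median(window))
    return devs


devs = slide_dev([1, 1, 10, 1, 1], win=3)
```

An isolated spike stands out because the median of its window is set by its neighbors rather than by the spike itself.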

Search API

rfpipe.search.dedisperse(data, delay, parallel=False)[source]

Shift data in time (axis=0) by channel-dependent value given in delay. Returns new array with time length shortened by max delay in integrations. Wraps _dedisperse to add logging. Can set mode to “single” or “multi” to use different functions.
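The shift-and-trim behavior can be illustrated on a plain (time, channel) list of lists, as in the sketch below. Real data has more axes (baseline, polarization), and this pure-Python version is only meant to show the indexing, not how rfpipe implements it.

```python
def dedisperse(data, delay):
    """Shift each channel earlier by delay[ch] integrations.

    data is indexed as data[t][ch]; the result is shorter by max(delay).
    """
    nint = len(data) - max(delay)
    nchan = len(delay)
    return [[data[t + delay[ch]][ch] for ch in range(nchan)]
            for t in range(nint)]


# A made-up pulse that arrives at t=2 in channel 0 and t=4 in channel 1.
data = [[0, 0] for _ in range(6)]
data[2][0] = 1
data[4][1] = 1
out = dedisperse(data, delay=[0, 2])
```

After the shift, both channels peak in the same output integration, which is what makes the pulse detectable once the channels are summed.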

rfpipe.search.dedisperse_image_cuda(st, segment, data, devicenum=None)[source]

Run dedispersion, resample for all dm and dt. Grid and image on GPU. rfgpu is built from separate repo. Uses state to define integrations to image based on segment, dm, and dt. devicenum can force the gpu to use, but can be inferred via distributed.

rfpipe.search.dedisperse_image_fftw(st, segment, data, wisdom=None, integrations=None)[source]

Fuse the dedisperse, resample, search, and threshold functions.

rfpipe.search.dedisperseresample(data, delay, dt, parallel=False)[source]

Dedisperse and resample in single function. parallel controls use of multicore versions of algorithms.

rfpipe.search.grid_image(data, uvw, npixx, npixy, uvres, fftmode, nthread, wisdom=None, integrations=None)[source]

Grid and image data. Optionally image only the integrations given in the list integrations. fftmode can be fftw or cuda. nthread is the number of threads to use.

rfpipe.search.grid_visibilities(data, uvw, npixx, npixy, uvres, parallel=False)[source]

Grid visibilities into rounded uv coordinates
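Gridding onto rounded uv coordinates amounts to dividing each (u, v) by the cell size, rounding to the nearest integer, and accumulating the visibility into that cell. The sketch below shows this for a single integration and channel. Wrapping negative indices to the far side of the grid and dropping points that fall outside it are assumptions of this example.

```python
def grid_visibilities(vis, u, v, npixx, npixy, uvres):
    """Accumulate complex visibilities onto a (npixx, npixy) grid."""
    grid = [[0j] * npixy for _ in range(npixx)]
    for val, uu, vv in zip(vis, u, v):
        iu = int(round(uu / uvres))
        iv = int(round(vv / uvres))
        # Skip points beyond the grid; wrap negative cells FFT-style.
        if abs(iu) < npixx // 2 and abs(iv) < npixy // 2:
            grid[iu % npixx][iv % npixy] += val
    return grid


vis = [1 + 0j, 2 + 0j, 1j]          # made-up visibilities
u = [120.0, 80.0, -95.0]            # in wavelengths (made up)
v = [0.0, 0.0, 10.0]
grid = grid_visibilities(vis, u, v, npixx=8, npixy=8, uvres=100.0)
```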

rfpipe.search.image_arm()[source]

Takes visibilities and images arms of VLA

rfpipe.search.image_cuda()[source]

Run grid and image with rfgpu. TODO: update to use rfgpu.

rfpipe.search.image_fftw(grids, nthread=1, wisdom=None)[source]

Plan pyfftw ifft2 and run it on uv grids (time, npixx, npixy). Returns time images.

Bundles prep and search functions to improve performance in distributed.

rfpipe.search.resample(data, dt, parallel=False)[source]

Resample (integrate) by factor dt and return new data structure. Wraps _resample to add logging. Can set mode to “single” or “multi” to use different functions.
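Resampling by an integer factor can be sketched on a (time, channel) list of lists as below. Averaging (rather than summing) each group of dt integrations and discarding any leftover integrations at the end are assumptions of this example.

```python
def resample(data, dt):
    """Average consecutive groups of dt integrations; data[t][ch] in, same out."""
    nout = len(data) // dt          # leftover integrations are dropped
    nchan = len(data[0])
    return [[sum(data[i * dt + r][ch] for r in range(dt)) / dt
             for ch in range(nchan)]
            for i in range(nout)]


out = resample([[1], [3], [5], [7], [9]], dt=2)
```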

rfpipe.search.search_thresh_fftw(st, segment, data, dmind, dtind, integrations=None, beamnum=0, wisdom=None)[source]

Take dedispersed, resampled data, image with fftw and threshold. Returns list of CandData objects that define candidates with candloc, image, and phased visibility data. Integrations can define subset of all available in data to search. Default will take integrations not searched in neighboring segments.

Only supports threshold > image max (no min). dmind, dtind, and beamnum are assumed to represent the current state of the data.

rfpipe.search.set_wisdom(npixx, npixy)[source]

Run single 2d ifft like image to prep fftw wisdom in worker cache

Candidates API

class rfpipe.candidates.CandCollection(array=array([], dtype=float64), prefs=None, metadata=None)[source]

Wrap candidate array with metadata and prefs to be attached and pickled.

canddm

Candidate DM in pc/cm3

canddt

Candidate dt in seconds

candl

Return l1 for candidate (offset from phase center in RA direction)

candm

Return m1 for candidate (offset from phase center in Dec direction)

candmjd

Candidate MJD at top of band

scan
segment
state

Sets state by regenerating from the metadata and prefs.

class rfpipe.candidates.CandData(state, loc, image, data)[source]

Object that bundles data from search stage to candidate visualization. Provides some properties for the state of the phased data and candidate.

peak_lm
peak_xy

Peak pixel in image. Only supports positive peaks for now.

time_top

Time in mjd where burst is at top of band