API¶

State API¶

class rfpipe.state.State(config=None, sdmfile=None, sdmscan=None, bdfdir=None, inprefs=None, inmeta=None, preffile=None, name=None, showsummary=True, lock=None, validate=True)[source]¶

Defines initial search from preferences and methods. State properties are used to calculate quantities for the search.

Approach: 1) initial preferences can overload state properties 2) read metadata from observation for given scan 3) set state (and a few cached values) 4) state can optionally include attributes for later convenience 5) run search on a segment.

ants¶

beamsize_deg¶: Takes gridding spec to estimate beam size in degrees ** TODO: check for accuracy

blarr¶

blarr_names¶

static calclm(npixx, npixy, uvres, peakx, peaky)[source]¶

candsfile¶: File name to write candidates into

chans¶: List of channel indices to use. Drawn from preferences, with backup to take all defined in metadata.

chunksize¶

clearcache()[source]¶

datashape¶

datashape_orig¶

datasize¶

datasize_orig¶

defined¶

dmarr¶

dmshifts¶: Calculate max DM delay in units of integrations for each dm trial. Gets cached.

dtarr¶

features¶: Given searchtype, return features to be extracted in initial analysis

fftmode¶: Should the FFT be done with fftw or cuda? Could overload here based on local configuration

fieldsize_deg¶: Takes gridding spec to estimate field of view in degrees

fileroot¶

freq¶: Frequencies for each channel in increasing order. Metadata may be out of order, but state/data reading order is sorted.

fringetime¶: Same as fringetime_orig, but rounded to integer multiple of integration time.

fringetime_orig¶: Estimate largest time span of a “segment”. A segment is the maximal time span that can be have a single bg fringe subtracted and uv grid definition. Max fringe window estimated for 5% amp loss at first null averaged over all baselines. Assumes dec=+90, which is conservative. Also can be optimized for memory/compute limits. Returns time in seconds that defines good window.

gainfile¶: Calibration file (telcal) from preferences or found from “.GN” suffix

get_search_ints(segment, dmind, dtind)[source]¶: Helper function to get list of integrations to be searched after correcting for DM and resampling.

get_segmenttime_string(segment)[source]¶

immem¶: Memory required to create all images in a chunk of read integrations

immem_limit¶: Memory required to create all images in a chunk of read integrations Limit defined for all threads.

inttime¶: Integration time

memory_total¶: Total memory (in GB) required to read and process. Depends on data source and search algorithm.

memory_total_limit¶: Minimum memory (in GB) required to read and process. Depends on data source and search algorithm.

mockfile¶: File name to write mocks into

nants¶

nbl¶

nchan¶

nfalse(sigma)[source]¶: Number of thermal-noise false positives per scan at given sigma

nints¶

noisefile¶: File name to write noises into

npixx¶: Number of x pixels in uv/image grid. First defined by input preference set with default to npixx_full

npixx_full¶: Finds optimal uv/image pixel extent in powers of 2 and 3

npixy¶: Number of y pixels in uv/image grid. First defined by input preference set with default to npixy_full

npixy_full¶: Finds optimal uv/image pixel extent in powers of 2 and 3

npol¶: Number of polarization products selected. Cached.

nsegment¶

nspw¶

ntrials¶: Number of search trials per scan

pixtolm(pix)[source]¶: Helper function to calculate (l,m) coords of given pixel. Example: st.pixtolm(np.where(im == im.max()))

pols¶: Polarizations to use based on preference in prefs.selectpol

readints¶: Number of integrations read per segment. Defines shape of numpy array for visibilities.

search_dimensions¶: Define dimensions searched for a given piece of data. Actual algorithm defined in pipeline iteration.

searchints¶: Number of integrations searched per scan

segmenttimes¶: Array of float pairs containing MJD times defining segment start and stop. Calculated from prefs.nsegment first. Alternately, best times found based on fringe time and memory limit

sigma_image1¶: Use either sigma_image1 or nfalse

spw¶: List of spectral windows used. ** TODO: update for proper naming “basband”+”swindex”

summarize()[source]¶: Print summary of pipeline state

t_overlap¶: Max DM delay in seconds that is fixed to int mult of integration time. Gets cached.

t_segment¶: Time read per segment in seconds Assumes first segment is same size as all others.

thresholdlevel(nfalse)[source]¶: Sigma threshold for a given number of false positives per scan

uvres¶

uvres_full¶

validate()[source]¶: Test validity of state (metadata + preferences) with a few assertions

version¶

vismem¶: Memory required to store read data (in GB)

vismem_limit¶: Memory required to store read data (in GB) Limit defined for time range equal to the overlap time between segments.

Preferences API¶

class rfpipe.preferences.Preferences(rfpipe_version='0.9.6', chans=None, spw=None, excludeants=(), selectpol=u'auto', fileroot=None, read_tdownsample=1, read_fdownsample=1, l0=0.0, m0=0.0, timesub=None, flaglist=[(u'badchtslide', 4.0, 10), (u'blstd', 3.0, 0.05)], flagantsol=True, badspwpol=2.0, applyonlineflags=True, gainfile=None, simulated_transient=None, nthread=1, segmenttimes=None, memory_limit=16, maximmem=16, fftmode=u'fftw', dmarr=None, dtarr=None, dm_maxloss=0.05, mindm=0, maxdm=0, dm_pulsewidth=3000, searchtype=u'image1', sigma_image1=7, sigma_image2=None, nfalse=None, uvres=0, npixx=0, npixy=0, npix_max=0, uvoversample=1.0, savenoise=False, savecands=False, candsfile=None, workdir='/Users/caseyjlaw/code/rfpipe', timewindow=30, loglevel=u'INFO')[source]¶

Preferences express half of info needed to define state. Using preferences with metadata produces a unique state and pipeline outcome.

json¶: json string that can be loaded into elasticsearch or hashed.

name¶: Unique name for an instance of preferences. To be used to look up preference set for a given candidate or data set.

ordered¶: Get OrderedDict of preferences sorted by key

Metadata API¶

class rfpipe.metadata.Metadata(datasource=None, datasetId=None, filename=None, scan=None, subscan=None, bdfdir=None, bdfstr=None, source=None, radec=None, inttime=None, nints_=None, telescope=None, starttime_mjd=None, endtime_mjd_=None, dishdiameter=None, intent=None, antids=None, xyz=None, spworder=None, spw_orig=None, spw_nchan=None, spw_reffreq=None, spw_chansize=None, pols_orig=None)[source]¶

Metadata we need to translate parameters into a pipeline state. Called from a function defined for a given source (e.g., an sdm file). Built from nominally immutable attributes and properties. To modify metadata, use attr.assoc(inst, key=newval)

antpos¶

atdefaults()[source]¶: Is metadata still set at default values?

endtime_mjd¶: If nints_ is defined (e.g., for SDM data), then endtime_mjd is calculated. Otherwise (e.g., for scan_config/vys data), it looks for endtime_mjd_ attribute

freq_orig¶: Spacing of channel centers in GHz. Out of order metadata order is sorted in state/data reading

nants_orig¶

nbl_orig¶

nchan_orig¶

nints¶: If endtime_mjd_ is defined (e.g., for scan_config/vys data), then endtime_mjd is calculated. Otherwise (e.g., for SDM data), it looks for nints_ attribute

npol_orig¶

nspw_orig¶

scanId¶

starttime_string¶

uvrange_orig¶

Source API¶

rfpipe.source.data_prep(st, segment, data, flagversion=u'latest')[source]¶: Applies calibration, flags, and subtracts time mean for data. flagversion can be “latest” or “rtpipe”. Optionally prepares data with antenna flags, fixing out of order data, calibration, downsampling, etc..

rfpipe.source.estimate_noiseperbl(data)[source]¶: Takes large data array and sigma clips it to find noise per bl for input to detect_bispectra. Takes mean across pols and channels for now, as in detect_bispectra.

rfpipe.source.flag_badchtslide(data, sigma, win)[source]¶: Use data (4d) to calculate (int, chan, pol) to be flagged

rfpipe.source.flag_blstd(data, sigma, convergence)[source]¶: Use data (4d) to calculate (int, chan, pol) to be flagged.

rfpipe.source.flag_data(st, data)[source]¶: Identifies bad data and flags it to 0.

rfpipe.source.flag_data_rtpipe(st, data)[source]¶: Flagging data in single process Deprecated.

rfpipe.source.generate_transient(st, amp, i0, dm, dt)[source]¶: Create a dynamic spectrum for given parameters amp is in system units (post calibration) i0 is a float for integration relative to start of segment. dm/dt are in units of pc/cm3 and seconds, respectively

rfpipe.source.getsdm(*args, **kwargs)[source]¶: Wrap sdmpy.SDM to get around schema change error

rfpipe.source.prep_standard(st, segment, data)[source]¶: Common first data prep stages, incl online flags, resampling, and mock transients.

rfpipe.source.read_bdf(st, nskip=0)[source]¶: Uses sdmpy to read a given range of integrations from sdm of given scan. readints=0 will read all of bdf (skipping nskip). Returns data in increasing frequency order.

rfpipe.source.read_bdf_segment(st, segment)[source]¶: Uses sdmpy to reads bdf (sdm) format data into numpy array in given segment. Each segment has st.readints integrations.

rfpipe.source.read_segment(st, segment, cfile=None, timeout=10)[source]¶: Read a segment of data. cfile and timeout are specific to vys data. Returns data as defined in metadata (no downselection yet)

rfpipe.source.read_vys_segment(st, seg, cfile=None, timeout=10, offset=4)[source]¶: Read segment seg defined by state st from vys stream. Uses vysmaw application timefilter to receive multicast messages and pull spectra on the CBE. timeout is a multiple of read time in seconds to wait. offset is extra time in seconds to keep vys reader open.

rfpipe.source.save_noise(st, segment, data, chunk=200)[source]¶: Calculates noise properties and save values to pickle. chunk defines window for measurement. at least one measurement always made.

rfpipe.source.sdm_sources(sdmname)[source]¶: Use sdmpy to get all sources and ra,dec per scan as dict

rfpipe.source.simulate_segment(st, loc=0.0, scale=1.0)[source]¶: Simulates visibilities for a segment.

rfpipe.source.slidedev[source]¶: Given a (len x 2) array, calculate the deviation from the median per pol. Calculates median over a window, win.

Search API¶

rfpipe.search.dedisperse(data, delay, parallel=False)[source]¶: Shift data in time (axis=0) by channel-dependent value given in delay. Returns new array with time length shortened by max delay in integrations. wraps _dedisperse to add logging. Can set mode to “single” or “multi” to use different functions.

rfpipe.search.dedisperse_image_cuda(st, segment, data, devicenum=None)[source]¶: Run dedispersion, resample for all dm and dt. Grid and image on GPU. rfgpu is built from separate repo. Uses state to define integrations to image based on segment, dm, and dt. devicenum can force the gpu to use, but can be inferred via distributed.

rfpipe.search.dedisperse_image_fftw(st, segment, data, wisdom=None, integrations=None)[source]¶: Fuse the dediserpse, resample, search, threshold functions.

rfpipe.search.dedisperseresample(data, delay, dt, parallel=False)[source]¶: Dedisperse and resample in single function. parallel controls use of multicore versions of algorithms.

rfpipe.search.grid_image(data, uvw, npixx, npixy, uvres, fftmode, nthread, wisdom=None, integrations=None)[source]¶: Grid and image data. Optionally image integrations in list i. fftmode can be fftw or cuda. nthread is number of threads to use

rfpipe.search.grid_visibilities(data, uvw, npixx, npixy, uvres, parallel=False)[source]¶: Grid visibilities into rounded uv coordinates

rfpipe.search.image_arm()[source]¶: Takes visibilities and images arms of VLA

rfpipe.search.image_cuda()[source]¶: Run grid and image with rfgpu TODO: update to use rfgpu

rfpipe.search.image_fftw(grids, nthread=1, wisdom=None)[source]¶: Plan pyfftw ifft2 and run it on uv grids (time, npixx, npixy) Returns time images.

rfpipe.search.prep_and_search(st, segment, data)[source]¶: Bundles prep and search functions to improve performance in distributed.

rfpipe.search.resample(data, dt, parallel=False)[source]¶: Resample (integrate) by factor dt and return new data structure wraps _resample to add logging. Can set mode to “single” or “multi” to use different functions.

rfpipe.search.search_thresh_fftw(st, segment, data, dmind, dtind, integrations=None, beamnum=0, wisdom=None)[source]¶

Take dedispersed, resampled data, image with fftw and threshold. Returns list of CandData objects that define candidates with candloc, image, and phased visibility data. Integrations can define subset of all available in data to search. Default will take integrations not searched in neighboring segments.

** only supports threshold > image max (no min) ** dmind, dtind, beamnum assumed to represent current state of data

rfpipe.search.set_wisdom(npixx, npixy)[source]¶: Run single 2d ifft like image to prep fftw wisdom in worker cache

Candidates API¶

class rfpipe.candidates.CandCollection(array=array([], dtype=float64), prefs=None, metadata=None)[source]¶

Wrap candidate array with metadata and prefs to be attached and pickled.

canddm¶: Candidate DM in pc/cm3

canddt¶: Candidate dt in seconds

candl¶: Return l1 for candidate (offset from phase center in RA direction)

candm¶: Return m1 for candidate (offset from phase center in Dec direction)

candmjd¶: Candidate MJD at top of band

scan¶

segment¶

state¶: Sets state by regenerating from the metadata and prefs.

class rfpipe.candidates.CandData(state, loc, image, data)[source]¶

Object that bundles data from search stage to candidate visualization. Provides some properties for the state of the phased data and candidate.

peak_lm¶

peak_xy¶: Peak pixel in image Only supports positive peaks for now.

time_top¶: Time in mjd where burst is at top of band