Input/Output Module

python binding for in/output of particle positions (part of CatAna)

The catana.io module contains functions and classes to read data from files and containers, filters that can be applied to objects (points) and facilites to write those points to files and containers. The implementation follows a “data-stream” concept, where the data flows from a source through one or multiple filters to a so-called sink.

The available sources, sinks and filters are listed below. The data-stream pipeline is implemented in the FilterStream class. If only reading from a file to a PointContainer or PixelizedPointContainer is required you can directly use the Source.get_point_container and Source.get_pixelized_point_container functions.

Sources

Source

class catana.io.Source

(Abstract) parent class for Source implementations.

CartesianTextSource

class catana.io.CartesianTextSource(self: catana.io.CartesianTextSource, filename: str, shift: float=0.0) → None

Read positions from Text File (cartesian coordinates)

Only the first 3 columns are read, remaining columns are ignored.

Note

To convert box coordinates to spherical coordinates, the origin is placed at [0, 0, 0]. If the coordinates are not centered at 0 they can be moved by using the shift parameter (e.g. if the coordinates are within [0, L], use shift=-L/2)

Constructor from a textfile filename.

Parameters:
  • filename (str) – path to file containing the positions
  • shift (float) – a coordinate shift that is applied to each axis
get_pixelized_point_container(self: catana.io.Source, nside: int) → catana.basictypes.PixelizedPointContainer

Load all data into a PixelizedPointContainer with given NSide

Parameters:nside (int) – HEALPix pixelization resolution, must be a power of 2
Returns:PixelizedPointContainer – A pixelized point container with all the data
get_point_container(self: catana.io.Source) → catana.basictypes.PointContainer

Load all data into an PointContainer.

Returns:PointContainer – A PointContainer with all data
reset(self: catana.io.Source) → None

Reset the source

This reverts the Source to the initial state and reading from it again should produce the same results

SphericalTextSource

class catana.io.SphericalTextSource(self: catana.io.SphericalTextSource, filename: str) → None

Read from Text File (spherical coordinates [r, theta, phi])

Only the first 3 columns are read, remaining columns are ignored.

Constructor from a textfile filename.

Parameters:filename (str) – path to file containing the positions

Note

For 3dex compatible text files with columns [theta, phi, r], use the SphericalTextSource_3dex class with the same functionality and syntax.

get_pixelized_point_container(self: catana.io.Source, nside: int) → catana.basictypes.PixelizedPointContainer

Load all data into a PixelizedPointContainer with given NSide

Parameters:nside (int) – HEALPix pixelization resolution, must be a power of 2
Returns:PixelizedPointContainer – A pixelized point container with all the data
get_point_container(self: catana.io.Source) → catana.basictypes.PointContainer

Load all data into an PointContainer.

Returns:PointContainer – A PointContainer with all data
reset(self: catana.io.Source) → None

Reset the source

This reverts the Source to the initial state and reading from it again should produce the same results

GadgetSource

class catana.io.GadgetSource(self: catana.io.GadgetSource, filename: str, verbose: bool=True) → None

Read positions from binary Gadget file

This reader will only read particles of type 1 from version 2 binary Gadget files.

Constructor from a Gadget filename.

Parameters:
  • filename (str) – path to the Gadget file
  • verbose (bool) – if True, print logging information to stdout (might be in terminal for jupyter notebooks!)
get_pixelized_point_container(self: catana.io.Source, nside: int) → catana.basictypes.PixelizedPointContainer

Load all data into a PixelizedPointContainer with given NSide

Parameters:nside (int) – HEALPix pixelization resolution, must be a power of 2
Returns:PixelizedPointContainer – A pixelized point container with all the data
get_point_container(self: catana.io.Source) → catana.basictypes.PointContainer

Load all data into an PointContainer.

Returns:PointContainer – A PointContainer with all data
reset(self: catana.io.Source) → None

Reset the source

This reverts the Source to the initial state and reading from it again should produce the same results

PointContainerSource

class catana.io.PointContainerSource(self: catana.io.PointContainerSource, point_container: catana.basictypes.PointContainer) → None

Read positions from a PointContainer

Mainly used to add a PointContainer to a FilterStream.

Constructor from an existing PointContainer.

Parameters:point_container (PointContainer) – the PointContainer from which to read
get_pixelized_point_container(self: catana.io.Source, nside: int) → catana.basictypes.PixelizedPointContainer

Load all data into a PixelizedPointContainer with given NSide

Parameters:nside (int) – HEALPix pixelization resolution, must be a power of 2
Returns:PixelizedPointContainer – A pixelized point container with all the data
get_point_container(self: catana.io.Source) → catana.basictypes.PointContainer

Load all data into an PointContainer.

Returns:PointContainer – A PointContainer with all data
reset(self: catana.io.Source) → None

Reset the source

This reverts the Source to the initial state and reading from it again should produce the same results

RawBinarySource

class catana.io.RawBinarySource(self: catana.io.RawBinarySource, filename: str, verbose: bool=True) → None

Read from CatAna binary file (float precision, spherical coordinates)

Constructor from a binary file filename.

Parameters:
  • filename (str) – path to the binary file
  • verbose (bool) – if True, print logging information to stdout (might be in terminal for jupyter notebooks!)
get_pixelized_point_container(self: catana.io.Source, nside: int) → catana.basictypes.PixelizedPointContainer

Load all data into a PixelizedPointContainer with given NSide

Parameters:nside (int) – HEALPix pixelization resolution, must be a power of 2
Returns:PixelizedPointContainer – A pixelized point container with all the data
get_point_container(self: catana.io.Source) → catana.basictypes.PointContainer

Load all data into an PointContainer.

Returns:PointContainer – A PointContainer with all data
reset(self: catana.io.Source) → None

Reset the source

This reverts the Source to the initial state and reading from it again should produce the same results


Sinks

Sink

class catana.io.Sink

(Abstract) parent class for Sink implementations

These are used to write data of type Point to various targets.

CartesianTextSink

class catana.io.CartesianTextSink(self: catana.io.CartesianTextSink, filename: str, verbose: bool=True) → None

A Sink to write Point data into a text-file using cartesian coordinates

Warning

The file will be truncated if it already exists

Create Sink text-file

Parameters:
  • filename (str) – path to file. Will be overwritten if it already exists.
  • verbose (bool) – output additional logging information if True.
write(self: catana.io.Sink, point_container: catana.basictypes.PointContainer) → int

write entire point_container directly to sink

Parameters:point_container (PointContainer) – source container to write
Returns:int – number of points written. -1 if it failed

SphericalTextSink

class catana.io.SphericalTextSink(self: catana.io.SphericalTextSink, filename: str, verbose: bool=True) → None

A Sink to write Point data into a text-file using spherical coordinates [r, theta, phi]

Warning

The file will be truncated if it already exists

Create Sink text-file

Parameters:
  • filename (str) – path to file. Will be overwritten if it already exists.
  • verbose (bool) – output additional logging information if True.

Note

For 3dex compatible text files with columns [theta, phi, r], use the SphericalTextSink_3dex class with the same functionality and syntax.

write(self: catana.io.Sink, point_container: catana.basictypes.PointContainer) → int

write entire point_container directly to sink

Parameters:point_container (PointContainer) – source container to write
Returns:int – number of points written. -1 if it failed

PointContainerSink

class catana.io.PointContainerSink(self: catana.io.PointContainerSink, point_container: catana.basictypes.PointContainer) → None

A Sink to write Point data into a PointContainer

Constructor from an existing PointContainer

Parameters:point_container (PointContainer) – container to write the points to
write(self: catana.io.Sink, point_container: catana.basictypes.PointContainer) → int

write entire point_container directly to sink

Parameters:point_container (PointContainer) – source container to write
Returns:int – number of points written. -1 if it failed

RawBinarySink

class catana.io.RawBinarySink(self: catana.io.RawBinarySink, filename: str, verbose: bool=True, append: bool=False) → None

A Sink to write Point data into a CatAna binary file (float precision, spherical coordinates)

Create Sink binary-file

Parameters:
  • filename (str) – path to file. Will be overwritten if it already exists.
  • verbose (bool) – output additional logging information if True.
  • append (bool) – if True, Points will be appended to the file. If False, file will be truncated.
write(self: catana.io.Sink, point_container: catana.basictypes.PointContainer) → int

write entire point_container directly to sink

Parameters:point_container (PointContainer) – source container to write
Returns:int – number of points written. -1 if it failed

Filters

Filter (abstract class)

class catana.io.Filter

(Abstract) parent class for point-filtering implementations.

TophatRadialWindowFunctionFilter

class catana.io.TophatRadialWindowFunctionFilter(self: catana.io.TophatRadialWindowFunctionFilter, R0: float) → None

TopHat Radial WindowFunction Filter

Constructs a tophat filter with radius R0

Parameters:R0 (float) – radius of tophat
__call__(self: catana.io.Filter, point_container: catana.basictypes.PointContainer) → int

Apply filter to a PointContainer

Runs the filter on all elements in the point_container, removing filtered elements and resizing the container.

Parameters:point_container (PointContainer) – the container on which the filter is applied.

GaussianRadialWindowFunctionFilter

class catana.io.GaussianRadialWindowFunctionFilter(self: catana.io.GaussianRadialWindowFunctionFilter, R0: float) → None

Gaussian Radial WindowFunction Filter

The acceptance probability is p = exp(-(r/R0)^2)

Constructs a gaussian radial filter with scale factor R0

Parameters:R0 (float) – scale factor
__call__(self: catana.io.Filter, point_container: catana.basictypes.PointContainer) → int

Apply filter to a PointContainer

Runs the filter on all elements in the point_container, removing filtered elements and resizing the container.

Parameters:point_container (PointContainer) – the container on which the filter is applied.

GenericRadialWindowFunctionFilter

class catana.io.GenericRadialWindowFunctionFilter(*args, **kwargs)

Generic Radial WindowFunction Filter

This filter accepts a generic function f: [0, rmax] -> [0, 1] where the function value is the probability that the point with the given radius is accepted.

Note

Using this functionality in Python causes many callback function which are slow. To speed up filtering with a generic radial probability function, use the constructor with interpolation_points in the interval [min, max].

Overloaded function.

  1. __init__(self: catana.io.GenericRadialWindowFunctionFilter, function: std::function<double (double)>) -> None

    Constructor from a python callable

    This will cause callbacks which can be very slow for a large Source!

    Parameters:

    function (callable): a function returning the acceptance probability for a given radius

  2. __init__(self: catana.io.GenericRadialWindowFunctionFilter, function: std::function<double (double)>, interpolation_points: int, min: float, max: float) -> None

    Constructor from a python callable, with interpolation

    Parameters:
    function (callable): a function returning the acceptance probability for a given radius. should be

    sufficiently smooth!

    interpolation_points (int): number of sampling points which will be equidistantly distributed between

    [min, max]

    min (float): lower-bound for interpolation range, should be smaller or equal to the shortest radius to

    which the filter is applied

    max (float): upper-bound for interpolation range, should be larger or equal to the longest radius to

    which the filter is applied

__call__(self: catana.io.Filter, point_container: catana.basictypes.PointContainer) → int

Apply filter to a PointContainer

Runs the filter on all elements in the point_container, removing filtered elements and resizing the container.

Parameters:point_container (PointContainer) – the container on which the filter is applied.

AngularMaskFilter

class catana.io.AngularMaskFilter(*args, **kwargs)

HEALPix Angular Mask Filter

Uses a HEALPix map, where each pixel should be either 1 or 0, where 1 corresponds to keep the point if its angular coordinates are within the pixel and 0 will remove it.

Overloaded function.

  1. __init__(self: catana.io.AngularMaskFilter, mask_file: str) -> None

Constructor from a HEALPix FITS file

Parameters:mask_file (str) – path to the HEALPix mask map.
  1. __init__(self: catana.io.AngularMaskFilter, map: numpy.ndarray[float32[m, 1]]) -> None

Constructor from a HEALPix map array

Parameters:map (numpy.ndarray[float32]) – valid HEALPix map, in RING format, containing only 1 and 0 pixel values
__call__(self: catana.io.Filter, point_container: catana.basictypes.PointContainer) → int

Apply filter to a PointContainer

Runs the filter on all elements in the point_container, removing filtered elements and resizing the container.

Parameters:point_container (PointContainer) – the container on which the filter is applied.

FilterStream

To combine a Source -> Filter -> Sink together, we can use a FilterStream

class catana.io.FilterStream(self: catana.io.FilterStream, source: catana.io.Source, sink: catana.io.Sink, buffer_size: int=1000000, verbose: bool=True) → None

A Stream which connects a Source to a Sink and applies filters (Filter).

The FilterStream class is used to move points (Point) from the source to the sink while applying (multiple) filters and buffering the stream to speed up the transfer. Sources, Sinks and Filters can be any objects derrived from the Source , Sink and Filter parent classes.

Constructor

Parameters:
  • source (Source) – from where the points will be read
  • sink (Sink) – to where the filtered points are stored
  • buffer_size (int) – size of the buffer (in number of points). Faster if larger, but consumes more memroy.
  • verbose (bool) – if True, print additional logging information to stdout
add_filter(self: catana.io.FilterStream, filter: catana.io.Filter) → None

Add a Filter to the pipeline.

Filters will be applied sequentially in the order that they were added to the FilterStream.

Parameters:filter (Filter) – A radial or angular Filter instance
run(self: catana.io.FilterStream, subsample_size: int=0, temp_filename: str='tmp.bin') → int

Run the pipeline

If the number of points should be reduced, set subsample_size to the desired value. If subsample_size>0 the process will turn into a 2-step process, where in the first step all the filters will be applied and the filtered points will be stored in a temporary file. In the second step the remaining points in the temporary file will be randomly subsampled and stored to the sink.

The 2 steps can also be executed manually, using the run_totemp and run_fromtemp functions.

Parameters:
  • subsample_size (int) – number of points to randomly subsample. Set subsample_size=0 to turn off subsampling.
  • temp_filename (str) – path to the temporary file (used only for subsampling)
Returns:

int – number of points written to the sink

run_fromtemp(self: catana.io.FilterStream, temp_filename: str='tmp.bin', subsample_size: int=0, remove_temp: bool=True) → int

Run second intermediate step: write temporary file to sink (with subsampling)

Parameters:
  • temp_filename (str) – path to the temporary file (from previous step)
  • subsample_size (int) – number of points to randomly subsample. Set to 0 to turn off subsampling
  • remove_temp (bool) – If True the temporary file will be deleted once it is read.
Returns:

int – number of points written to the sink

run_totemp(self: catana.io.FilterStream, temp_filename: str='tmp.bin', append: bool=True) → int

Run first intermediate step: write source to temporary file, apply filters

Parameters:
  • temp_filename (str) – path to the temporary file
  • append (bool) – if True, points will be appended to the end of the file. Else the temporary file will be overwritten.
Returns:

int – number of points written to the temporary file

set_source(self: catana.io.FilterStream, arg0: catana.io.Source) → None

Reset the source to a new source

Can be used to read from multiple sources to the same sink. To do that, use the run_totemp function with append=True instead of the run method (for all sources). Then call the run_fromtemp function.

Parameters:source (Source) – a source instance