Input/Output Module¶
python binding for in/output of particle positions (part of CatAna)
The catana.io
module contains functions and classes to read data from files and containers, filters that can be
applied to objects (points) and facilites to write those points to files and containers. The implementation follows
a “data-stream” concept, where the data flows from a source through one or multiple filters to a so-called sink.
The available sources, sinks and filters are listed below. The data-stream pipeline is implemented in the FilterStream
class. If only reading from a file to a PointContainer
or PixelizedPointContainer
is required you can directly use
the Source.get_point_container
and Source.get_pixelized_point_container
functions.
Sources¶
CartesianTextSource¶
-
class
catana.io.
CartesianTextSource
(self: catana.io.CartesianTextSource, filename: str, shift: float=0.0) → None¶ Read positions from Text File (cartesian coordinates)
Only the first 3 columns are read, remaining columns are ignored.
Note
To convert box coordinates to spherical coordinates, the origin is placed at
[0, 0, 0]
. If the coordinates are not centered at 0 they can be moved by using the shift parameter (e.g. if the coordinates are within [0, L], useshift=-L/2
)Constructor from a textfile filename.
Parameters: - filename (str) – path to file containing the positions
- shift (float) – a coordinate shift that is applied to each axis
-
get_pixelized_point_container
(self: catana.io.Source, nside: int) → catana.basictypes.PixelizedPointContainer¶ Load all data into a PixelizedPointContainer with given NSide
Parameters: nside (int) – HEALPix pixelization resolution, must be a power of 2 Returns: PixelizedPointContainer
– A pixelized point container with all the data
-
get_point_container
(self: catana.io.Source) → catana.basictypes.PointContainer¶ Load all data into an PointContainer.
Returns: PointContainer
– A PointContainer with all data
-
reset
(self: catana.io.Source) → None¶ Reset the source
This reverts the Source to the initial state and reading from it again should produce the same results
SphericalTextSource¶
-
class
catana.io.
SphericalTextSource
(self: catana.io.SphericalTextSource, filename: str) → None¶ Read from Text File (spherical coordinates [r, theta, phi])
Only the first 3 columns are read, remaining columns are ignored.
Constructor from a textfile filename.
Parameters: filename (str) – path to file containing the positions Note
For 3dex compatible text files with columns [theta, phi, r], use the
SphericalTextSource_3dex
class with the same functionality and syntax.-
get_pixelized_point_container
(self: catana.io.Source, nside: int) → catana.basictypes.PixelizedPointContainer¶ Load all data into a PixelizedPointContainer with given NSide
Parameters: nside (int) – HEALPix pixelization resolution, must be a power of 2 Returns: PixelizedPointContainer
– A pixelized point container with all the data
-
get_point_container
(self: catana.io.Source) → catana.basictypes.PointContainer¶ Load all data into an PointContainer.
Returns: PointContainer
– A PointContainer with all data
-
reset
(self: catana.io.Source) → None¶ Reset the source
This reverts the Source to the initial state and reading from it again should produce the same results
-
GadgetSource¶
-
class
catana.io.
GadgetSource
(self: catana.io.GadgetSource, filename: str, verbose: bool=True) → None¶ Read positions from binary Gadget file
This reader will only read particles of type 1 from version 2 binary Gadget files.
Constructor from a Gadget filename.
Parameters: - filename (str) – path to the Gadget file
- verbose (bool) – if
True
, print logging information to stdout (might be in terminal for jupyter notebooks!)
-
get_pixelized_point_container
(self: catana.io.Source, nside: int) → catana.basictypes.PixelizedPointContainer¶ Load all data into a PixelizedPointContainer with given NSide
Parameters: nside (int) – HEALPix pixelization resolution, must be a power of 2 Returns: PixelizedPointContainer
– A pixelized point container with all the data
-
get_point_container
(self: catana.io.Source) → catana.basictypes.PointContainer¶ Load all data into an PointContainer.
Returns: PointContainer
– A PointContainer with all data
-
reset
(self: catana.io.Source) → None¶ Reset the source
This reverts the Source to the initial state and reading from it again should produce the same results
PointContainerSource¶
-
class
catana.io.
PointContainerSource
(self: catana.io.PointContainerSource, point_container: catana.basictypes.PointContainer) → None¶ Read positions from a PointContainer
Mainly used to add a
PointContainer
to aFilterStream
.Constructor from an existing
PointContainer
.Parameters: point_container (PointContainer) – the PointContainer from which to read -
get_pixelized_point_container
(self: catana.io.Source, nside: int) → catana.basictypes.PixelizedPointContainer¶ Load all data into a PixelizedPointContainer with given NSide
Parameters: nside (int) – HEALPix pixelization resolution, must be a power of 2 Returns: PixelizedPointContainer
– A pixelized point container with all the data
-
get_point_container
(self: catana.io.Source) → catana.basictypes.PointContainer¶ Load all data into an PointContainer.
Returns: PointContainer
– A PointContainer with all data
-
reset
(self: catana.io.Source) → None¶ Reset the source
This reverts the Source to the initial state and reading from it again should produce the same results
-
RawBinarySource¶
-
class
catana.io.
RawBinarySource
(self: catana.io.RawBinarySource, filename: str, verbose: bool=True) → None¶ Read from CatAna binary file (float precision, spherical coordinates)
Constructor from a binary file filename.
Parameters: - filename (str) – path to the binary file
- verbose (bool) – if
True
, print logging information to stdout (might be in terminal for jupyter notebooks!)
-
get_pixelized_point_container
(self: catana.io.Source, nside: int) → catana.basictypes.PixelizedPointContainer¶ Load all data into a PixelizedPointContainer with given NSide
Parameters: nside (int) – HEALPix pixelization resolution, must be a power of 2 Returns: PixelizedPointContainer
– A pixelized point container with all the data
-
get_point_container
(self: catana.io.Source) → catana.basictypes.PointContainer¶ Load all data into an PointContainer.
Returns: PointContainer
– A PointContainer with all data
-
reset
(self: catana.io.Source) → None¶ Reset the source
This reverts the Source to the initial state and reading from it again should produce the same results
Sinks¶
Sink¶
-
class
catana.io.
Sink
¶ (Abstract) parent class for Sink implementations
These are used to write data of type Point to various targets.
CartesianTextSink¶
-
class
catana.io.
CartesianTextSink
(self: catana.io.CartesianTextSink, filename: str, verbose: bool=True) → None¶ A Sink to write
Point
data into a text-file using cartesian coordinatesWarning
The file will be truncated if it already exists
Create Sink text-file
Parameters: - filename (str) – path to file. Will be overwritten if it already exists.
- verbose (bool) – output additional logging information if
True
.
-
write
(self: catana.io.Sink, point_container: catana.basictypes.PointContainer) → int¶ write entire point_container directly to sink
Parameters: point_container (PointContainer) – source container to write Returns: int – number of points written. -1 if it failed
SphericalTextSink¶
-
class
catana.io.
SphericalTextSink
(self: catana.io.SphericalTextSink, filename: str, verbose: bool=True) → None¶ A Sink to write
Point
data into a text-file using spherical coordinates [r, theta, phi]Warning
The file will be truncated if it already exists
Create Sink text-file
Parameters: - filename (str) – path to file. Will be overwritten if it already exists.
- verbose (bool) – output additional logging information if
True
.
Note
For 3dex compatible text files with columns [theta, phi, r], use the
SphericalTextSink_3dex
class with the same functionality and syntax.-
write
(self: catana.io.Sink, point_container: catana.basictypes.PointContainer) → int¶ write entire point_container directly to sink
Parameters: point_container (PointContainer) – source container to write Returns: int – number of points written. -1 if it failed
PointContainerSink¶
-
class
catana.io.
PointContainerSink
(self: catana.io.PointContainerSink, point_container: catana.basictypes.PointContainer) → None¶ A Sink to write
Point
data into aPointContainer
Constructor from an existing
PointContainer
Parameters: point_container (PointContainer) – container to write the points to -
write
(self: catana.io.Sink, point_container: catana.basictypes.PointContainer) → int¶ write entire point_container directly to sink
Parameters: point_container (PointContainer) – source container to write Returns: int – number of points written. -1 if it failed
-
RawBinarySink¶
-
class
catana.io.
RawBinarySink
(self: catana.io.RawBinarySink, filename: str, verbose: bool=True, append: bool=False) → None¶ A Sink to write
Point
data into a CatAna binary file (float precision, spherical coordinates)Create Sink binary-file
Parameters: - filename (str) – path to file. Will be overwritten if it already exists.
- verbose (bool) – output additional logging information if
True
. - append (bool) – if
True
, Points will be appended to the file. IfFalse
, file will be truncated.
-
write
(self: catana.io.Sink, point_container: catana.basictypes.PointContainer) → int¶ write entire point_container directly to sink
Parameters: point_container (PointContainer) – source container to write Returns: int – number of points written. -1 if it failed
Filters¶
Filter (abstract class)¶
-
class
catana.io.
Filter
¶ (Abstract) parent class for point-filtering implementations.
TophatRadialWindowFunctionFilter¶
-
class
catana.io.
TophatRadialWindowFunctionFilter
(self: catana.io.TophatRadialWindowFunctionFilter, R0: float) → None¶ TopHat Radial WindowFunction Filter
Constructs a tophat filter with radius
R0
Parameters: R0 (float) – radius of tophat -
__call__
(self: catana.io.Filter, point_container: catana.basictypes.PointContainer) → int¶ Apply filter to a PointContainer
Runs the filter on all elements in the point_container, removing filtered elements and resizing the container.
Parameters: point_container (PointContainer) – the container on which the filter is applied.
-
GaussianRadialWindowFunctionFilter¶
-
class
catana.io.
GaussianRadialWindowFunctionFilter
(self: catana.io.GaussianRadialWindowFunctionFilter, R0: float) → None¶ Gaussian Radial WindowFunction Filter
The acceptance probability is p = exp(-(r/R0)^2)
Constructs a gaussian radial filter with scale factor R0
Parameters: R0 (float) – scale factor -
__call__
(self: catana.io.Filter, point_container: catana.basictypes.PointContainer) → int¶ Apply filter to a PointContainer
Runs the filter on all elements in the point_container, removing filtered elements and resizing the container.
Parameters: point_container (PointContainer) – the container on which the filter is applied.
-
GenericRadialWindowFunctionFilter¶
-
class
catana.io.
GenericRadialWindowFunctionFilter
(*args, **kwargs)¶ Generic Radial WindowFunction Filter
This filter accepts a generic function f: [0, rmax] -> [0, 1] where the function value is the probability that the point with the given radius is accepted.
Note
Using this functionality in Python causes many callback function which are slow. To speed up filtering with a generic radial probability function, use the constructor with
interpolation_points
in the interval [min, max].Overloaded function.
__init__(self: catana.io.GenericRadialWindowFunctionFilter, function: std::function<double (double)>) -> None
Constructor from a python callable
This will cause callbacks which can be very slow for a large
Source
!- Parameters:
function (callable): a function returning the acceptance probability for a given radius
__init__(self: catana.io.GenericRadialWindowFunctionFilter, function: std::function<double (double)>, interpolation_points: int, min: float, max: float) -> None
Constructor from a python callable, with interpolation
- Parameters:
- function (callable): a function returning the acceptance probability for a given radius. should be
sufficiently smooth!
- interpolation_points (int): number of sampling points which will be equidistantly distributed between
[min, max]
- min (float): lower-bound for interpolation range, should be smaller or equal to the shortest radius to
which the filter is applied
- max (float): upper-bound for interpolation range, should be larger or equal to the longest radius to
which the filter is applied
-
__call__
(self: catana.io.Filter, point_container: catana.basictypes.PointContainer) → int¶ Apply filter to a PointContainer
Runs the filter on all elements in the point_container, removing filtered elements and resizing the container.
Parameters: point_container (PointContainer) – the container on which the filter is applied.
AngularMaskFilter¶
-
class
catana.io.
AngularMaskFilter
(*args, **kwargs)¶ HEALPix Angular Mask Filter
Uses a HEALPix map, where each pixel should be either 1 or 0, where 1 corresponds to keep the point if its angular coordinates are within the pixel and 0 will remove it.
Overloaded function.
- __init__(self: catana.io.AngularMaskFilter, mask_file: str) -> None
Constructor from a HEALPix FITS file
Parameters: mask_file (str) – path to the HEALPix mask map. - __init__(self: catana.io.AngularMaskFilter, map: numpy.ndarray[float32[m, 1]]) -> None
Constructor from a HEALPix map array
Parameters: map (numpy.ndarray[float32]) – valid HEALPix map, in RING format, containing only 1 and 0 pixel values -
__call__
(self: catana.io.Filter, point_container: catana.basictypes.PointContainer) → int¶ Apply filter to a PointContainer
Runs the filter on all elements in the point_container, removing filtered elements and resizing the container.
Parameters: point_container (PointContainer) – the container on which the filter is applied.
FilterStream¶
To combine a Source -> Filter -> Sink together, we can use a FilterStream
-
class
catana.io.
FilterStream
(self: catana.io.FilterStream, source: catana.io.Source, sink: catana.io.Sink, buffer_size: int=1000000, verbose: bool=True) → None¶ A Stream which connects a
Source
to aSink
and applies filters (Filter
).The FilterStream class is used to move points (
Point
) from the source to the sink while applying (multiple) filters and buffering the stream to speed up the transfer. Sources, Sinks and Filters can be any objects derrived from theSource
,Sink
andFilter
parent classes.Constructor
Parameters: -
add_filter
(self: catana.io.FilterStream, filter: catana.io.Filter) → None¶ Add a Filter to the pipeline.
Filters will be applied sequentially in the order that they were added to the FilterStream.
Parameters: filter (Filter) – A radial or angular Filter instance
-
run
(self: catana.io.FilterStream, subsample_size: int=0, temp_filename: str='tmp.bin') → int¶ Run the pipeline
If the number of points should be reduced, set
subsample_size
to the desired value. Ifsubsample_size>0
the process will turn into a 2-step process, where in the first step all the filters will be applied and the filtered points will be stored in a temporary file. In the second step the remaining points in the temporary file will be randomly subsampled and stored to the sink.The 2 steps can also be executed manually, using the
run_totemp
andrun_fromtemp
functions.Parameters: - subsample_size (int) – number of points to randomly subsample. Set
subsample_size=0
to turn off subsampling. - temp_filename (str) – path to the temporary file (used only for subsampling)
Returns: int – number of points written to the sink
- subsample_size (int) – number of points to randomly subsample. Set
-
run_fromtemp
(self: catana.io.FilterStream, temp_filename: str='tmp.bin', subsample_size: int=0, remove_temp: bool=True) → int¶ Run second intermediate step: write temporary file to sink (with subsampling)
Parameters: - temp_filename (str) – path to the temporary file (from previous step)
- subsample_size (int) – number of points to randomly subsample. Set to 0 to turn off subsampling
- remove_temp (bool) – If
True
the temporary file will be deleted once it is read.
Returns: int – number of points written to the sink
-
run_totemp
(self: catana.io.FilterStream, temp_filename: str='tmp.bin', append: bool=True) → int¶ Run first intermediate step: write source to temporary file, apply filters
Parameters: - temp_filename (str) – path to the temporary file
- append (bool) – if
True
, points will be appended to the end of the file. Else the temporary file will be overwritten.
Returns: int – number of points written to the temporary file
-
set_source
(self: catana.io.FilterStream, arg0: catana.io.Source) → None¶ Reset the source to a new source
Can be used to read from multiple sources to the same sink. To do that, use the
run_totemp
function withappend=True
instead of therun
method (for all sources). Then call therun_fromtemp
function.Parameters: source (Source) – a source instance
-