Particle Catalogue#

The core input data for clustering statistics are particle catalogues, which are implemented in Triumvirate as the ParticleCatalogue class.

from triumvirate.catalogue import ParticleCatalogue

Initialisation#

A ParticleCatalogue object can be initialised either from existing data arrays or from an external file.

From data arrays#

Let’s first create some mock data arrays of particle Cartesian coordinates and weights for the catalogue object:

# DEMO: create mock data arrays from a fixed random seed.

import numpy as np
rng = np.random.default_rng(seed=42)

nparticles = 100  # number of particles
boxsize = 100.    # boxsize (in Mpc/h)

x, y, z = rng.uniform(0., boxsize, size=(3, nparticles))  # Cartesian coords.
nz = nparticles / boxsize**3                              # background density
ws = rng.uniform(0., 2., size=nparticles)                 # sample weight
wc = np.ones(nparticles)                                  # clustering weight

Now pass these mock data arrays to initialise a particle catalogue with Cartesian coordinates x, y and z, sample weight ws (a combination of weights correcting for e.g. imaging systematics, fibre collisions and completeness), and clustering weight wc (e.g. Feldman–Kaiser–Peacock-type weights) (see also the note here). The redshift-dependent background density is set to the constant mean value nz.

catalogue = ParticleCatalogue(x, y, z, nz=nz, ws=ws, wc=wc)

If ws or wc is not provided, it is set to unity by default. If nz is not provided, a warning message is displayed.

# DEMO: verify a warning message is shown and the default weights
# are unity.

import warnings

with warnings.catch_warnings(record=True) as unset_warnings:
    warnings.filterwarnings('default', category=UserWarning)
    catalogue_ = ParticleCatalogue(x, y, z)
    print("Caught warning:", unset_warnings[-1].message)

print("'ws' weights set to default value 1:", np.allclose(catalogue_['ws'], 1.))
print("'wc' weights set to default value 1:", np.allclose(catalogue_['wc'], 1.))
Caught warning: Catalogue 'nz' field is None and thus set to zero, which may raise errors in some computations.
'ws' weights set to default value 1: True
'wc' weights set to default value 1: True

From an external file#

For the purpose of demonstration, we shall reuse the mock data arrays above by first saving them to a plain-text file "mock_catalogue.dat" in the current working directory.

# DEMO: create a mock catalogue.

mock_catalogue_filepath = "mock_catalogue.dat"
mock_catalogue_fields = ['x', 'y', 'z', 'nz', 'ws', 'wc']
mock_catalogue_pdata = np.column_stack([
    catalogue[name] for name in mock_catalogue_fields
])

np.savetxt(mock_catalogue_filepath, mock_catalogue_pdata)

One point of interest above is the implicit use of the __getitem__() method, which returns a particle data column given the column name. (If the backend data container is an astropy.table.Table, multiple data columns are returned for a sequence of column names.)
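
As a quick illustration of this column access (a demo reusing the catalogue created above):

# DEMO: access a single particle data column by name via __getitem__().
x_column = catalogue['x']
print("First two 'x' values:", x_column[:2])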

Now we can read the saved mock catalogue file using two different ‘readers’: one based on astropy and another based on nbodykit (if installed, e.g. as an optional extra). These readers may require different values of the format and names arguments to be specified (see the note here).

# Use the 'astropy' reader.
catalogue_astropy = ParticleCatalogue.read_from_file(
    mock_catalogue_filepath, reader='astropy',
    names=mock_catalogue_fields
)

# Use the 'nbodykit' reader (if available).
catalogue_nbodykit = ParticleCatalogue.read_from_file(
    mock_catalogue_filepath, reader='nbodykit', format='text',
    names=mock_catalogue_fields
)
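
As a quick consistency check (a demo assuming both readers are available), the columns loaded by the two readers are expected to agree:

# DEMO: compare the particle coordinates loaded by the two readers.
for axis in ['x', 'y', 'z']:
    print(
        f"'{axis}' columns agree between readers:",
        np.allclose(catalogue_astropy[axis], catalogue_nbodykit[axis])
    )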

Attention

The mock catalogues created here are for syntax illustration only and do not have realistic clustering.

Catalogue attributes#

ParticleCatalogue has both explicit internal attributes and derived (external) attributes.

Internal properties#

One could access the particle coordinate extents, the number of particles and the sum of sample weights easily:

print("Particle extents:", catalogue.bounds)
print("Particle number:", catalogue.ntotal)
print("Particle sample weight total:", catalogue.wtotal)
Particle extents: {'x': (0.7362269751005512, 97.5622351636756), 'y': (2.1612079880330426, 99.2375564055837), 'z': (1.3936287708201545, 97.95706805865927)}
Particle number: 100
Particle sample weight total: 105.24499954312556

Derived quantities#

Mean number density—One could compute the mean number density assuming a cuboid box, whose volume is provided either directly or via the box size(s); the result is then used as the redshift-dependent background number density for various calculations.

# Use the total cubic volume.
catalogue.compute_mean_density(volume=boxsize**3)

# Use the box size(s).
catalogue.compute_mean_density(boxsize=boxsize)       # equivalent
catalogue.compute_mean_density(boxsize=[boxsize,]*3)  # equivalent

# DEMO: check 'nz' values.
print("'nz' column set correctly:", np.allclose(catalogue['nz'], nz))
'nz' column set correctly: True

Warning

The invocation of this method resets the particle data column 'nz'. This method has no return value.
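
As a quick check of this behaviour (a demo sketch reusing the call above):

# DEMO: the method returns None and (re)sets the 'nz' column in place.
ret = catalogue.compute_mean_density(boxsize=boxsize)
print("Return value is None:", ret is None)
print("'nz' column reset to the mean density:", np.allclose(catalogue['nz'], nz))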

Line of sight—A crucial quantity in clustering statistics is the (unit-normalised) line-of-sight vector, which can be calculated for each particle in the catalogue as follows:

los = catalogue.compute_los()

print("Line of sight for the first two particles:\n", los[:2])  # DEMO
Line of sight for the first two particles:
 [[0.54335988 0.63787381 0.54578112]
 [0.34410981 0.54861682 0.76197639]]
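
Since the line-of-sight vectors are unit-normalised, their norms can be quickly verified (a demo using NumPy only):

# DEMO: check the line-of-sight vectors are unit-normalised.
print(
    "Line-of-sight vectors have unit norm:",
    np.allclose(np.linalg.norm(los, axis=-1), 1.)
)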

Attribute header#

The various catalogue attributes can be written out as a header text stream for I/O purposes:

header = catalogue.write_attrs_as_header()

print(header)  # DEMO
Catalogue source: extdata:140361956766336
Catalogue size: ntotal = 100, wtotal = 105.245, wstotal = 105.245
Catalogue particle extents: ([0.736, 97.562], [2.161, 99.238], [1.394, 97.957])

Since many clustering algorithms require both a data catalogue and a random one, the latter can be passed to the catalogue_ref argument to be included in the header. Let’s reuse the catalogue_ (created above) as catalogue_ref.

header = catalogue.write_attrs_as_header(catalogue_ref=catalogue_)

print(header)  # DEMO
Data catalogue source: extdata:140361956766336
Data catalogue size: ntotal = 100, wtotal = 105.245, wstotal = 105.245
Data-source particle extents: ([0.736, 97.562], [2.161, 99.238], [1.394, 97.957])
Random catalogue source: extdata:140362245915840
Random catalogue size: ntotal = 100, wtotal = 100.000, wstotal = 100.000
Random-source particle extents: ([0.736, 97.562], [2.161, 99.238], [1.394, 97.957])

Box alignment#

For the fast Fourier transforms used to compute clustering statistics, particles in a catalogue are placed inside a cuboid box. Depending on the algorithm, there are multiple options for aligning the box with the catalogue particles.

Attention

Algorithmic functions in Triumvirate for computing clustering statistics automatically perform box alignment according to the alignment parameter in the ParameterSet object passed to them (if None, the default value is used).
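
For orientation, here is a minimal sketch of how this option might be set in a parameter dictionary; the fetch_paramset_template helper and the 'centre'/'pad' values shown are assumptions to be checked against the parameter-set documentation:

# Sketch (assumed API): fetch a parameter template and set the box alignment.
from triumvirate.parameters import fetch_paramset_template

param_dict = fetch_paramset_template('dict')
param_dict['alignment'] = 'centre'  # assumed options: 'centre' or 'pad'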

Centring#

The most common alignment choice is to put the mid-point of the particle coordinate extents at the centre of the box. To do so, the box size(s) must be specified.

print(
    "Pre-centring mid-point:",
    {
        ax: round(np.mean(ax_bounds), 6)
        for ax, ax_bounds in catalogue.bounds.items()
    }
)  # DEMO

catalogue.centre(boxsize=boxsize, catalogue_ref=None)

print(
    "Post-centring mid-point:",
    {
        ax: round(np.mean(ax_bounds), 6)
        for ax, ax_bounds in catalogue.bounds.items()
    }
)  # DEMO
Pre-centring mid-point: {'x': 49.149231, 'y': 50.699382, 'z': 49.675348}
Post-centring mid-point: {'x': 50.0, 'y': 50.0, 'z': 50.0}

As mentioned above, if there is a second catalogue (e.g. a random catalogue in conjunction with a data one), it can be used as a reference catalogue: all particles in both catalogues are then shifted by the same displacement vector so that the mid-point of the particle coordinate extents in the reference catalogue lies at the box centre. To do so, replace catalogue_ref=None with e.g. catalogue_ref=catalogue_ (created above), as sketched below.
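
For instance, the following sketch (not executed here, so the outputs below are unaffected) centres both catalogues by reference to catalogue_:

# Sketch: centre both catalogues by reference to 'catalogue_'.
catalogue.centre(boxsize=boxsize, catalogue_ref=catalogue_)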

Padding#

Another alignment choice is to place padding between the particles and the origin corner of the box. The amount of padding can be specified as a (sequence of) factor(s)/multiple(s) of the box size(s) or the grid cell size(s) of a mesh grid.

from pprint import pformat  # DEMO: formatted printing of dictionaries

# DEMO
print("Pre-padding particle extents:\n", pformat(catalogue.bounds))

# Use padding as a factor of the box size(s).
catalogue.pad(
    boxsize,            # or: [boxsize,]*3
    boxsize_pad=0.02,   # or: [0.01, 0.02, 0.03]
    catalogue_ref=None  # or: catalogue_
)

# DEMO
print("Post-padding particle extents:\n", pformat(catalogue.bounds))

# DEMO
print("Pre-padding particle extents:\n", pformat(catalogue.bounds))

# Use padding as a factor of the grid cell size(s).
catalogue.pad(
    boxsize,            # or: [boxsize,]*3
    ngrid=64,           # or: [64, 64, 32]
    ngrid_pad=1.,       # or: [0.8, 1., 1.2]
    catalogue_ref=None  # or: catalogue_
)

# DEMO
print("Post-padding particle extents:\n", pformat(catalogue.bounds))
Pre-padding particle extents:
 {'x': (1.5869959057124787, 98.41300409428752),
 'y': (1.4618257912246704, 98.53817420877533),
 'z': (1.7182803560804385, 98.28171964391956)}
Post-padding particle extents:
 {'x': (2.0, 98.82600818857503),
 'y': (2.0, 99.07634841755066),
 'z': (2.0, 98.56343928783912)}
Pre-padding particle extents:
 {'x': (2.0, 98.82600818857503),
 'y': (2.0, 99.07634841755066),
 'z': (2.0, 98.56343928783912)}
Post-padding particle extents:
 {'x': (1.5625, 98.38850818857503),
 'y': (1.5625, 98.63884841755066),
 'z': (1.5625, 98.12593928783912)}

Again, if catalogue_ref is specified, the reference catalogue is padded inside the box, and the same displacement vector is applied to the original catalogue itself.
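
For example (again not executed here), padding by reference to catalogue_ would read:

# Sketch: pad both catalogues by reference to 'catalogue_'.
catalogue.pad(boxsize, boxsize_pad=0.02, catalogue_ref=catalogue_)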

Periodic boundary conditions#

For simulation-like catalogues, periodic boundary conditions can be enforced as follows:

# DEMO
print("Pre-periodisation particle extents:\n", pformat(catalogue.bounds))

# DEMO: a warning message will be emitted below as the period is smaller
# than the particle extents.
boxsize_period = 50.
catalogue.periodise(boxsize_period)

# DEMO
print("Post-periodisation particle extents:\n", pformat(catalogue.bounds))
Pre-periodisation particle extents:
 {'x': (1.5625, 98.38850818857503),
 'y': (1.5625, 98.63884841755066),
 'z': (1.5625, 98.12593928783912)}
Post-periodisation particle extents:
 {'x': (0.32698452139365486, 49.24471751714601),
 'y': (0.15257078671264424, 49.897126859951904),
 'z': (0.10860788039610014, 49.926079609604926)}
/Users/mikesw/Documents/Projects/Triumvirate/src/triumvirate/catalogue.py:634: UserWarning: Box size is smaller than particle coordinate extents along axis: ['x', 'y', 'z'].
  warnings.warn(

Constant offset#

One could also choose to apply a displacement vector to all particles in the catalogue. The supplied position vector is used to define the new origin.

# DEMO
print("Pre-offset particle extents:\n", pformat(catalogue.bounds))

offset_position = [-20., -25., -30.]
catalogue.offset_coords(offset_position)

# DEMO
print("Post-offset particle extents:\n", pformat(catalogue.bounds))
Pre-offset particle extents:
 {'x': (0.32698452139365486, 49.24471751714601),
 'y': (0.15257078671264424, 49.897126859951904),
 'z': (0.10860788039610014, 49.926079609604926)}
Post-offset particle extents:
 {'x': (20.326984521393655, 69.244717517146),
 'y': (25.152570786712644, 74.8971268599519),
 'z': (30.1086078803961, 79.92607960960493)}