Particle Catalogue#
The core input data for clustering statistics are particle catalogues
which are implemented in Triumvirate as the
ParticleCatalogue
class.
from triumvirate.catalogue import ParticleCatalogue
Initialisation#
A ParticleCatalogue
object can be
initialised either from existing data arrays or from an external file.
From data arrays#
Let’s first create some mock data arrays as the particle Cartesian coordinates and weights in the catalogue object:
# DEMO: create mock data arrays from a fixed random seed.
import numpy as np
rng = np.random.default_rng(seed=42)
nparticles = 100 # number of particles
boxsize = 100. # boxsize (in Mpc/h)
x, y, z = rng.uniform(0., boxsize, size=(3, nparticles)) # Cartesian coords.
nz = nparticles / boxsize**3 # background density
ws = rng.uniform(0., 2., size=nparticles) # sample weight
wc = np.ones(nparticles) # clustering weight
Now pass these mock data arrays to initialise a particle catalogue with
Cartesian coordinates x
, y
and z
, sample weight ws
(a combination
of weights to correct for e.g. imaging systematics, fibre collision and
completeness), and clustering weight wc
(e.g. Feldman–Kaiser–Peacock-type
weights) (see also the note here).
The redshift-dependent background density is set to a constant
mean value nz
.
catalogue = ParticleCatalogue(x, y, z, nz=nz, ws=ws, wc=wc)
By default, if ws
or wc
is not provided, they are set to unity. If
nz
is not provided, a warning message is displayed.
# DEMO: verify a warning message is shown and the default weights
# are unity.
import warnings
with warnings.catch_warnings(record=True) as unset_warnings:
warnings.filterwarnings('default', category=UserWarning)
catalogue_ = ParticleCatalogue(x, y, z)
print("Caught warning:", unset_warnings[-1].message)
print("'ws' weights set to default value 1:", np.allclose(catalogue_['ws'], 1.))
print("'wc' weights set to default value 1:", np.allclose(catalogue_['wc'], 1.))
Caught warning: Catalogue 'nz' field is None and thus set to zero, which may raise errors in some computations.
'ws' weights set to default value 1: True
'wc' weights set to default value 1: True
From an external file#
For the purpose of demonstration, we shall reuse the mock data arrays above by
first saving it to a plain-text file "mock_catalogue.dat"
in the current
working directory.
# DEMO: create a mock catalogue.
mock_catalogue_filepath = "mock_catalogue.dat"
mock_catalogue_fields = ['x', 'y', 'z', 'nz', 'ws', 'wc']
mock_catalogue_pdata = np.column_stack([
catalogue[name] for name in mock_catalogue_fields
])
np.savetxt(mock_catalogue_filepath, mock_catalogue_pdata)
One point of interest above is the implicit use of the
__getitem__()
method,
which returns a particle data column given the column name. (If the
backend data type is implemented using astropy.table.Table
,
then multiple data columns are returned for a sequence of column names.)
Now we can read the saved mock catalogue file using two different ‘readers’:
one based on astropy
and another based on nbodykit
(if installed, e.g.
as an optional extra). These readers may require different values of the
format
amd names
argument to be specified (see this
note here).
# Use the 'astropy' reader.
catalogue_astropy = ParticleCatalogue.read_from_file(
mock_catalogue_filepath, reader='astropy',
names=mock_catalogue_fields
)
# Use the 'nbodykit' reader (if available).
catalogue_nbodykit = ParticleCatalogue.read_from_file(
mock_catalogue_filepath, reader='nbodykit', format='text',
names=mock_catalogue_fields
)
Attention
The mock catalogues created here are for syntax illustration only and do not have realistic clustering.
Catalogue attributes#
ParticleCatalogue
has both explicit internal
attributes and derived (external) attributes.
Internal properties#
One could access the particle coordinate extents, the number of particles and the sum of sample weights easily:
print("Particle extents:", catalogue.bounds)
print("Particle number:", catalogue.ntotal)
print("Particle sample weight total:", catalogue.wtotal)
Particle extents: {'x': (0.7362269751005512, 97.5622351636756), 'y': (2.1612079880330426, 99.2375564055837), 'z': (1.3936287708201545, 97.95706805865927)}
Particle number: 100
Particle sample weight total: 105.24499954312556
Derived quantities#
Mean number density—One could compute the mean number density assuming a cubic box volume provided either directly or as box sizes, which is then used as the redshift-dependent background number density for various calculations.
# Use the total cubic volume.
catalogue.compute_mean_density(volume=boxsize**3)
# Use the box size(s).
catalogue.compute_mean_density(boxsize=boxsize) # equivalent
catalogue.compute_mean_density(boxsize=[boxsize,]*3) # equivalent
# DEMO: check 'nz' values.
print("'nz' column set correctly:", np.allclose(catalogue['nz'], nz))
'nz' column set correctly: True
Warning
The invocation of this method resets the particle data column 'nz'
.
This method has no return value.
Line of sight—A crucial quantity in clustering statistics is the (unit-normalised) line-of-sight vector, which can be calculated for each particle in the catalogue as follows:
los = catalogue.compute_los()
print("Line of sight for the first two particles:\n", los[:2]) # DEMO
Line of sight for the first two particles:
[[0.54335988 0.63787381 0.54578112]
[0.34410981 0.54861682 0.76197639]]
Attribute header#
The various catalogue attributes can be written out as a header text stream for I/O purposes:
header = catalogue.write_attrs_as_header()
print(header) # DEMO
Catalogue source: extdata:140361956766336
Catalogue size: ntotal = 100, wtotal = 105.245, wstotal = 105.245
Catalogue particle extents: ([0.736, 97.562], [2.161, 99.238], [1.394, 97.957])
Since many clustering algorithms require both a data catalogue
and a random one, the latter can be passed to the catalogue_ref
argument
to be included in the header. Let’s reuse the catalogue_
(created above) as catalogue_ref
.
header = catalogue.write_attrs_as_header(catalogue_ref=catalogue_)
print(header) # DEMO
Data catalogue source: extdata:140361956766336
Data catalogue size: ntotal = 100, wtotal = 105.245, wstotal = 105.245
Data-source particle extents: ([0.736, 97.562], [2.161, 99.238], [1.394, 97.957])
Random catalogue source: extdata:140362245915840
Random catalogue size: ntotal = 100, wtotal = 100.000, wstotal = 100.000
Random-source particle extents: ([0.736, 97.562], [2.161, 99.238], [1.394, 97.957])
Box alignment#
For fast-Fourier transforms used to compute clustering statistics, particles in a catalogue are placed inside a cuboid box. Depending on the algorithm, there are multiple options for aligning the box with the catalogue particles.
Attention
Algorithmic functions in Triumvirate for computing clustering statistics
automatically perform box alignment depending on the alignment
parameter
in the ParameterSet
object passed to it (if
None
, the default value is used).
Centring#
The most common alignment choice is to put the mid-point of the particle coordinate extents at the centre of the box. To do so, the box size(s) must be specified.
print(
"Pre-centring mid-point:",
{
ax: round(np.mean(ax_bounds), 6)
for ax, ax_bounds in catalogue.bounds.items()
}
) # DEMO
catalogue.centre(boxsize=boxsize, catalogue_ref=None)
print(
"Post-centring mid-point:",
{
ax: round(np.mean(ax_bounds), 6)
for ax, ax_bounds in catalogue.bounds.items()
}
) # DEMO
Pre-centring mid-point: {'x': 49.149231, 'y': 50.699382, 'z': 49.675348}
Post-centring mid-point: {'x': 50.0, 'y': 50.0, 'z': 50.0}
As mentioned above, if there is a second catalogue (e.g. a random catalogue
in conjunction with a data one), it can be used as a reference catalogue such
that all particles in both catalogues are shifted by the same displacement
vector so that the mid-point of the particle coordinate extents in the
reference catalogue is at the box centre. To do so, replace
catalogue_ref=None
with e.g. catalogue_ref=catalogue_
(created above).
Padding#
Another alignment choice is to place padding between particles and the origin corner of the box. The amount of padding can be specified as a (sequence of) factors/multiples of the box size(s) or the grid cell size(s) of a mesh grid.
from pprint import pformat # DEMO: formatted printing of dictionaries
# DEMO
print("Pre-padding particle extents:\n", pformat(catalogue.bounds))
# Use padding as a factor of the box size(s).
catalogue.pad(
boxsize, # or: [boxsize,]*3
boxsize_pad=0.02, # or: [0.01, 0.02, 0.03]
catalogue_ref=None # or: catalogue_
)
# DEMO
print("Post-padding particle extents:\n", pformat(catalogue.bounds))
# DEMO
print("Pre-padding particle extents:\n", pformat(catalogue.bounds))
# Use padding as a factor of the grid cell size(s).
catalogue.pad(
boxsize, # or: [boxsize,]*3
ngrid=64, # or: [64, 64, 32]
ngrid_pad=1., # or: [0.8, 1., 1.2]
catalogue_ref=None # or: catalogue_
)
# DEMO
print("Post-padding particle extents:\n", pformat(catalogue.bounds))
Pre-padding particle extents:
{'x': (1.5869959057124787, 98.41300409428752),
'y': (1.4618257912246704, 98.53817420877533),
'z': (1.7182803560804385, 98.28171964391956)}
Post-padding particle extents:
{'x': (2.0, 98.82600818857503),
'y': (2.0, 99.07634841755066),
'z': (2.0, 98.56343928783912)}
Pre-padding particle extents:
{'x': (2.0, 98.82600818857503),
'y': (2.0, 99.07634841755066),
'z': (2.0, 98.56343928783912)}
Post-padding particle extents:
{'x': (1.5625, 98.38850818857503),
'y': (1.5625, 98.63884841755066),
'z': (1.5625, 98.12593928783912)}
Again, if catalogue_ref
is specified, the reference catalogue is padded
inside the box, and the same displacement vector is applied to the
original catalogue itself.
Periodic boundary conditions#
For simulation-like catalogues, periodic boundary conditions can be enforced as follows:
# DEMO
print("Pre-periodisation particle extents:\n", pformat(catalogue.bounds))
# DEMO: a warning message will be emitted below as the period is smaller
# than the particle extents.
boxsize_period = 50.
catalogue.periodise(boxsize_period)
# DEMO
print("Post-periodisation particle extents:\n", pformat(catalogue.bounds))
Pre-periodisation particle extents:
{'x': (1.5625, 98.38850818857503),
'y': (1.5625, 98.63884841755066),
'z': (1.5625, 98.12593928783912)}
Post-periodisation particle extents:
{'x': (0.32698452139365486, 49.24471751714601),
'y': (0.15257078671264424, 49.897126859951904),
'z': (0.10860788039610014, 49.926079609604926)}
/Users/mikesw/Documents/Projects/Triumvirate/src/triumvirate/catalogue.py:634: UserWarning: Box size is smaller than particle coordinate extents along axis: ['x', 'y', 'z'].
warnings.warn(
Constant offset#
One could also choose to apply a displacement vector to all particles in the catalogue. The supplied position vector is used to define the new origin.
# DEMO
print("Pre-offset particle extents:\n", pformat(catalogue.bounds))
offset_position = [-20., -25., -30.]
catalogue.offset_coords(offset_position)
# DEMO
print("Post-offset particle extents:\n", pformat(catalogue.bounds))
Pre-offset particle extents:
{'x': (0.32698452139365486, 49.24471751714601),
'y': (0.15257078671264424, 49.897126859951904),
'z': (0.10860788039610014, 49.926079609604926)}
Post-offset particle extents:
{'x': (20.326984521393655, 69.244717517146),
'y': (25.152570786712644, 74.8971268599519),
'z': (30.1086078803961, 79.92607960960493)}