Clustering Measurements#
Triumvirate provides the algorithms for computing clustering statistics in both Fourier and configuration space and in both local and global plane-parallel approximations (see Background for details).
The usage of these measurement algorithms is all very similar, so as an example we will mainly consider the bispectrum measurement below, and briefly mention the differences for other measurements.
from triumvirate.threept import compute_bispec
Ingredients#
There are a number of inputs for measurements:
catalogue objects (as
ParticleCatalogue
; see also Particle Catalogue);measurement parameters (as
ParameterSet
or passed/overriden by keyword arguments; see also Parameter Set);optional logger (as
logging.Logger
; see also Customised Logger).
We will reuse trv_logger
, parameter_set
and binning
created in the
Customised Logger, Parameter Set
and Binnig Scheme tutorials as inputs.
In addition, we have used nbodykit
to produce three types of
mock catalogues:
The first is a simulation-like log-normal catalogue
catalogue_sim
in a cubic box of size \(L = 1000\,h^{-1}\,\mathrm{Mpc}\) with number density \(\bar{n} = 5 \times 10^{-4} \,h^3\,\mathrm{Mpc}^{-3}\). The input cosmological parameters are \(h = 0.6736, \Omega_{\mathrm{CDM},0} = 0.2645, \Omega_{\mathrm{b},0} = 0.04930, A_s = 2.083 \times 10^{-9}\) and \(n_s = 0.9649\), and the linear power spectrum at redshift \(z = 1\) with linear tracer bias \(b_1 = 2\) is used.The second is a survey-like catalogue
catalogue_survey
based on the simulation-like one, with the catalogue cut to the inscribing sphere of radius \(L/2\) inside the cubic box.The third is a uniform random catalogue
catalogue_rand
with number density \(5 \bar{n}\) in the same spherical volume as the survey-like one.
Following the Particle Catalogue tutorial, these
catalogues are instantiated as
ParticleCatalogue
.
Measurements#
Having specified all the inputs, measurements can be made by simply passing them as arguments to the relevant function:
results = compute_bispec(
catalogue_survey, catalogue_rand,
paramset=parameter_set,
logger=trv_logger
)
[2023-10-04 13:37:15 (+00:00:01) INFO] Parameter set have been initialised.
[2023-10-04 13:37:15 (+00:00:01) STAT C++] Parameters validated.
[2023-10-04 13:37:15 (+00:00:01) INFO] Binning has been initialised.
[2023-10-04 13:37:15 (+00:00:01) INFO] Lines of sight have been initialised.
[2023-10-04 13:37:15 (+00:00:01) INFO] Catalogues have been aligned.
[2023-10-04 13:37:15 (+00:00:01) INFO] Preparing catalogue for clustering algorithm... (entering C++)
[2023-10-04 13:37:15 (+00:00:01) INFO C++] Catalogue loaded: ntotal = 259444, wtotal = 259444.000, wstotal = 259444.000 (source=extdata).
[2023-10-04 13:37:16 (+00:00:02) INFO] ... prepared catalogue for clustering algorithm. (exited C++)
[2023-10-04 13:37:15 (+00:00:01) INFO C++] Extents of particle coordinates: {'x': (2.438, 998.884 | 996.446), 'y': (0.879, 998.351 | 997.472), 'z': (0.364, 998.843 | 998.478)} (source=extdata).
[2023-10-04 13:37:16 (+00:00:02) INFO C++] Catalogue loaded: ntotal = 1308287, wtotal = 1308287.000, wstotal = 1308287.000 (source=extdata).
[2023-10-04 13:37:16 (+00:00:02) INFO C++] Extents of particle coordinates: {'x': (0.604, 999.396 | 998.792), 'y': (0.630, 999.370 | 998.740), 'z': (0.426, 999.574 | 999.147)} (source=extdata).
[2023-10-04 13:37:16 (+00:00:02) INFO] Alpha contrast: 1.983082e-01.
[2023-10-04 13:37:16 (+00:00:02) INFO] Normalisation factors: 1.541759e+01 (particle; used), 1.550956e+01 (mesh), 0.000000e+00 (mesh-mixed; n/a).
[2023-10-04 13:37:16 (+00:00:02) INFO] Measuring bispectrum from paired survey-type catalogues in the local plane-parallel approximation... (entering C++)
[2023-10-04 13:37:17 (+00:00:03) INFO] ... measured bispectrum from paired survey-type catalogues in the local plane-parallel approximation. (exited C++)
[2023-10-04 13:37:16 (+00:00:02) STAT C++] Computing bispectrum from paired survey-type catalogues...
[2023-10-04 13:37:17 (+00:00:03) STAT C++] Bispectrum term at orders (m1, m2, M) = (0, 0, 0) computed.
[2023-10-04 13:37:17 (+00:00:03) STAT C++] ... computed bispectrum from paired survey-type catalogues.
Specifying lines of sight#
In the case above, the lines of sight are computed automatically, but one could supply external data arrays as replacements:
# import numpy as np
results = compute_bispec(
catalogue_survey, catalogue_rand,
los_data=np.ones((len(catalogue_survey), 3)),
los_rand=np.ones((len(catalogue_rand), 3)),
paramset=parameter_set,
logger=trv_logger
)
[2023-10-04 13:37:17 (+00:00:03) INFO] Parameter set have been initialised.
[2023-10-04 13:37:17 (+00:00:03) STAT C++] Parameters validated.
[2023-10-04 13:37:17 (+00:00:03) INFO] Binning has been initialised.
[2023-10-04 13:37:17 (+00:00:03) INFO] Lines of sight have been initialised.
[2023-10-04 13:37:17 (+00:00:03) INFO] Catalogues have been aligned.
[2023-10-04 13:37:17 (+00:00:03) INFO] Preparing catalogue for clustering algorithm... (entering C++)
[2023-10-04 13:37:17 (+00:00:03) INFO C++] Catalogue loaded: ntotal = 259444, wtotal = 259444.000, wstotal = 259444.000 (source=extdata).
[2023-10-04 13:37:18 (+00:00:04) INFO] ... prepared catalogue for clustering algorithm. (exited C++)
[2023-10-04 13:37:17 (+00:00:03) INFO C++] Extents of particle coordinates: {'x': (2.438, 998.884 | 996.446), 'y': (0.879, 998.351 | 997.472), 'z': (0.364, 998.843 | 998.478)} (source=extdata).
[2023-10-04 13:37:18 (+00:00:04) INFO C++] Catalogue loaded: ntotal = 1308287, wtotal = 1308287.000, wstotal = 1308287.000 (source=extdata).
[2023-10-04 13:37:18 (+00:00:04) INFO C++] Extents of particle coordinates: {'x': (0.604, 999.396 | 998.792), 'y': (0.630, 999.370 | 998.740), 'z': (0.426, 999.574 | 999.147)} (source=extdata).
[2023-10-04 13:37:18 (+00:00:04) INFO] Alpha contrast: 1.983082e-01.
[2023-10-04 13:37:18 (+00:00:04) INFO] Normalisation factors: 1.541759e+01 (particle; used), 1.550956e+01 (mesh), 0.000000e+00 (mesh-mixed; n/a).
[2023-10-04 13:37:18 (+00:00:04) INFO] Measuring bispectrum from paired survey-type catalogues in the local plane-parallel approximation... (entering C++)
[2023-10-04 13:37:19 (+00:00:05) INFO] ... measured bispectrum from paired survey-type catalogues in the local plane-parallel approximation. (exited C++)
[2023-10-04 13:37:19 (+00:00:05) STAT C++] Computing bispectrum from paired survey-type catalogues...
[2023-10-04 13:37:19 (+00:00:05) STAT C++] Bispectrum term at orders (m1, m2, M) = (0, 0, 0) computed.
[2023-10-04 13:37:19 (+00:00:05) STAT C++] ... computed bispectrum from paired survey-type catalogues.
Substituting for parameter set#
One could also override/bypass paramset
by passing the relevant/required
keyword arguments. In the example below, we directly set the bispectrum
multipole degrees and form, the binning and the mesh assignment parameters
without a paramset
argument; if the paramset
argument was set, its entries
would be overriden by these keyword arguments.
# DEMO
# import warnings
warnings.filterwarnings('ignore', message=".*default values are unchanged.*")
results = compute_bispec(
catalogue_survey, catalogue_rand,
degrees=(1, 1, 0),
binning=binning,
form='row',
idx_bin=5,
sampling_params={
'assignment': 'cic',
'boxsize': [1000.,]*3,
'ngrid': [64,]*3
},
logger=trv_logger
)
[2023-10-04 13:37:19 (+00:00:05) INFO] Validating parameters... (entering C++)
[2023-10-04 13:37:19 (+00:00:05) INFO] ... validated parameters. (exited C++)
[2023-10-04 13:37:19 (+00:00:05) STAT C++] Parameters validated.
[2023-10-04 13:37:19 (+00:00:05) INFO] Parameter set have been initialised.
[2023-10-04 13:37:19 (+00:00:05) INFO] Binning has been initialised.
[2023-10-04 13:37:19 (+00:00:05) INFO] Lines of sight have been initialised.
[2023-10-04 13:37:19 (+00:00:05) INFO] Catalogues have been aligned.
[2023-10-04 13:37:19 (+00:00:05) INFO] Preparing catalogue for clustering algorithm... (entering C++)
[2023-10-04 13:37:20 (+00:00:06) INFO C++] Catalogue loaded: ntotal = 259444, wtotal = 259444.000, wstotal = 259444.000 (source=extdata).
[2023-10-04 13:37:20 (+00:00:06) INFO C++] Extents of particle coordinates: {'x': (2.438, 998.884 | 996.446), 'y': (0.879, 998.351 | 997.472), 'z': (0.364, 998.843 | 998.478)} (source=extdata).
[2023-10-04 13:37:20 (+00:00:06) INFO C++] Catalogue loaded: ntotal = 1308287, wtotal = 1308287.000, wstotal = 1308287.000 (source=extdata).
[2023-10-04 13:37:20 (+00:00:06) INFO C++] Extents of particle coordinates: {'x': (0.604, 999.396 | 998.792), 'y': (0.630, 999.370 | 998.740), 'z': (0.426, 999.574 | 999.147)} (source=extdata).
[2023-10-04 13:37:20 (+00:00:06) INFO] ... prepared catalogue for clustering algorithm. (exited C++)
[2023-10-04 13:37:20 (+00:00:06) INFO] Alpha contrast: 1.983082e-01.
[2023-10-04 13:37:20 (+00:00:07) INFO] Normalisation factors: 1.541759e+01 (particle; used), 1.480691e+01 (mesh), 0.000000e+00 (mesh-mixed; n/a).
[2023-10-04 13:37:20 (+00:00:07) INFO] Measuring bispectrum from paired survey-type catalogues in the local plane-parallel approximation... (entering C++)
[2023-10-04 13:37:21 (+00:00:07) STAT C++] Computing bispectrum from paired survey-type catalogues...
[2023-10-04 13:37:22 (+00:00:08) INFO] ... measured bispectrum from paired survey-type catalogues in the local plane-parallel approximation. (exited C++)
[2023-10-04 13:37:21 (+00:00:07) STAT C++] Bispectrum term at orders (m1, m2, M) = (-1, 1, 0) computed.
[2023-10-04 13:37:22 (+00:00:08) STAT C++] Bispectrum term at orders (m1, m2, M) = (0, 0, 0) computed.
[2023-10-04 13:37:22 (+00:00:08) STAT C++] Bispectrum term at orders (m1, m2, M) = (1, -1, 0) computed.
[2023-10-04 13:37:22 (+00:00:08) STAT C++] ... computed bispectrum from paired survey-type catalogues.
Minor differences#
For other measurement algorithms, the syntax is very similar except for a few minor differences:
For two-point statistics, the argument corresponding to
degrees
above isdegree
as there is only a single multipole degree. The argumentsform
andidx_bin
do not apply.For global plane-parallel measurements, no random catalogue is required.
For window function measurements, only the random catalogue is required.
For full details, please consult the API reference
(twopt
and threept
modules).
Results#
The returned measurement results are dictionaries containing the
raw statistic (key with suffix _raw
) without shot noise subtraction,
the shot noise (key with suffix _shot
), the bin centres for each
coordinate dimension (keys with suffix _bin
), the average/effectuve bin
coordinates (keys with suffix _eff
), and the number of contributing modes
(or analogously pairs) in each bin (key 'nmodes'
/'npairs'
).
# DEMO
from pprint import pprint
pprint(results)
{'bk_raw': array([-6.15401315e+08+3.73327376e-07j, 2.40744325e+08-1.72304943e-07j,
-2.84005114e+08-2.68329052e-07j, -1.06997676e+08+3.94865494e-08j,
1.94789754e+07+2.46790934e-08j, 8.29519632e+07+2.33248562e-08j,
-4.53112276e+06-1.97432747e-08j, -1.06546658e+08+3.58968631e-09j,
-6.84846274e+07-1.79484315e-08j, -1.14891079e+08-5.38452946e-09j]),
'bk_shot': array([ -5212454.29561975+6.86692862e-10j,
-8944908.64663568+1.20018016e-09j,
-15387868.79029875+1.95566080e-09j,
-17045970.26450608+2.15266702e-09j,
-21370062.42415106+2.44739594e-09j,
-13373351.34139536+8.49465037e-10j,
-19646302.14048065+2.28874625e-09j,
-16373586.94542564+2.02459884e-09j,
-14739949.18356972+1.79683991e-09j,
-12282473.11528281+1.49875865e-09j]),
'k1_bin': array([0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06]),
'k1_eff': array([0.06040971, 0.06040971, 0.06040971, 0.06040971, 0.06040971,
0.06040971, 0.06040971, 0.06040971, 0.06040971, 0.06040971]),
'k2_bin': array([0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1 ]),
'k2_eff': array([0.01149964, 0.02047114, 0.03052422, 0.04062536, 0.05033379,
0.06040971, 0.07034279, 0.08025683, 0.09014322, 0.10008191]),
'nmodes_1': array([1896, 1896, 1896, 1896, 1896, 1896, 1896, 1896, 1896, 1896]),
'nmodes_2': array([ 56, 194, 488, 812, 1250, 1896, 2426, 3272, 4016, 5138])}
Saving to files#
In the algorithmic function for each type of measurement, if one sets
save='.txt'
or save='.npz'
, the results as a dictionary will be
automatically saved to a file in either .txt
or .npz
format.
If the paramset
argument is set to a
ParameterSet
object, the output directory
will be paramset['directories']['measurements']
(an empty output directory
path points to the current working directory), and the string
paramset['tags']['output']
will be appended to the file name before
the extension suffix.
This is demonstrated below for a global plane-parallel power spectrum measurement:
from triumvirate.twopt import compute_powspec_in_gpp_box
# DEMO
parameter_set.update(tags={'output': '_demo'})
results = compute_powspec_in_gpp_box(
catalogue_sim,
degree=0, paramset=parameter_set,
save='.txt', logger=trv_logger
)
[2023-10-04 13:37:22 (+00:00:08) INFO] Parameter set have been initialised.
[2023-10-04 13:37:22 (+00:00:08) STAT C++] Parameters validated.
[2023-10-04 13:37:22 (+00:00:08) STAT C++] Parameters validated.
[2023-10-04 13:37:22 (+00:00:08) INFO] Binning has been initialised.
[2023-10-04 13:37:22 (+00:00:08) INFO] Catalogue box has been periodised.
[2023-10-04 13:37:22 (+00:00:08) INFO] Inserted missing 'nz' field based on particle count and box size.
[2023-10-04 13:37:22 (+00:00:08) INFO] Preparing catalogue for clustering algorithm... (entering C++)
[2023-10-04 13:37:22 (+00:00:08) INFO C++] Catalogue loaded: ntotal = 499214, wtotal = 499214.000, wstotal = 499214.000 (source=extdata).
[2023-10-04 13:37:22 (+00:00:08) INFO] ... prepared catalogue for clustering algorithm. (exited C++)
[2023-10-04 13:37:22 (+00:00:08) INFO C++] Extents of particle coordinates: {'x': (0.002, 999.998 | 999.996), 'y': (0.001, 999.999 | 999.998), 'z': (0.000, 1000.000 | 999.999)} (source=extdata).
[2023-10-04 13:37:22 (+00:00:08) INFO] Normalisation factors: 4.012606e-03 (particle; used), 2.859493e-03 (mesh), 0.000000e+00 (mesh-mixed; n/a).
[2023-10-04 13:37:22 (+00:00:08) INFO] Measuring power spectrum from a simulation-box catalogue in the global plane-parallel approximation... (entering C++)
[2023-10-04 13:37:22 (+00:00:00) STAT C++] Computing power spectrum from a periodic-box simulation-type catalogue in the global plane-parallel approximation.
[2023-10-04 13:37:22 (+00:00:08) INFO] ... measured power spectrum from a simulation-box catalogue in the global plane-parallel approximation. (exited C++)
[2023-10-04 13:37:22 (+00:00:00) STAT C++] ... computed power spectrum from a periodic-box simulation-type catalogue in the global plane-parallel approximation.
[2023-10-04 13:37:22 (+00:00:08) INFO] Measurements saved to pk0_demo.txt.
Let’s have a look at the output measurement file:
# DEMO
with open("pk0_demo.txt", 'r') as results_file:
print(results_file.read())
# Catalogue source: extdata:4670170976
# Catalogue size: ntotal = 499214, wtotal = 499214.000, wstotal = 499214.000
# Catalogue particle extents: ([0.002, 999.998], [0.001, 999.999], [0.000, 1000.000])
# Box size: [1000.000, 1000.000, 1000.000]
# Box alignment: centre
# Mesh number: [64, 64, 64]
# Mesh assignment and interlacing: tsc, False
# Normalisation factor: 4.012605716e-03 (particle)
# Normalisation factor alternatives: 4.012605716e-03 (particle), 2.859492535e-03 (mesh), 0.000000000e+00 (mesh-mixed)
# [0] k_cen, [1] k_eff, [2] nmodes, [3] Re{pk0_raw}, [4] Im{pk0_raw}, [5] Re{pk0_shot}, [6] Im{pk0_shot}
1.000000000e-02 1.149964290e-02 56 3.160464462e+04 0.000000000e+00 2.003148950e+03 0.000000000e+00
2.000000000e-02 2.047114034e-02 194 3.881291755e+04 0.000000000e+00 2.003148950e+03 0.000000000e+00
3.000000000e-02 3.052421724e-02 488 2.746437066e+04 0.000000000e+00 2.003148950e+03 0.000000000e+00
4.000000000e-02 4.062536317e-02 812 2.298652981e+04 0.000000000e+00 2.003148950e+03 0.000000000e+00
5.000000000e-02 5.033378732e-02 1250 2.007483268e+04 0.000000000e+00 2.003148950e+03 0.000000000e+00
6.000000000e-02 6.040971138e-02 1896 1.820294440e+04 0.000000000e+00 2.003148950e+03 0.000000000e+00
7.000000000e-02 7.034278912e-02 2426 1.563805032e+04 0.000000000e+00 2.003148950e+03 0.000000000e+00
8.000000000e-02 8.025683499e-02 3272 1.336728653e+04 0.000000000e+00 2.003148950e+03 0.000000000e+00
9.000000000e-02 9.014322302e-02 4016 1.174269671e+04 0.000000000e+00 2.003148950e+03 0.000000000e+00
1.000000000e-01 1.000819060e-01 5138 1.017119060e+04 0.000000000e+00 2.003148950e+03 0.000000000e+00
We see that a header with summary information about the input parameters and data as well as some intermediary results has also been included in the saved file.
Analogously, with the save='.npz'
output format, we would have
# DEMO
with np.load("pk0_demo.npz", allow_pickle=True) as results_file:
print(results_file['header'])
Catalogue source: extdata:4670170976
Catalogue size: ntotal = 499214, wtotal = 499214.000, wstotal = 499214.000
Catalogue particle extents: ([0.002, 999.998], [0.001, 999.999], [0.000, 1000.000])
Box size: [1000.000, 1000.000, 1000.000]
Box alignment: centre
Mesh number: [64, 64, 64]
Mesh assignment and interlacing: tsc, False
Normalisation factor: 4.012605716e-03 (particle)
Normalisation factor alternatives: 4.012605716e-03 (particle), 2.859492535e-03 (mesh), 0.000000000e+00 (mesh-mixed)
[0] k_cen, [1] k_eff, [2] nmodes, [3] Re{pk0_raw}, [4] Im{pk0_raw}, [5] Re{pk0_shot}, [6] Im{pk0_shot}