Clustering Measurements#

Triumvirate provides the algorithms for computing clustering statistics in both Fourier and configuration space and in both local and global plane-parallel approximations (see Background for details).

The usage of these measurement algorithms is all very similar, so as an example we will mainly consider the bispectrum measurement below, and briefly mention the differences for other measurements.

from triumvirate.threept import compute_bispec

Ingredients#

There are a number of inputs for measurements:

We will reuse trv_logger, parameter_set and binning created in the Customised Logger, Parameter Set and Binnig Scheme tutorials as inputs.

In addition, we have used nbodykit to produce three types of mock catalogues:

  • The first is a simulation-like log-normal catalogue catalogue_sim in a cubic box of size \(L = 1000\,h^{-1}\,\mathrm{Mpc}\) with number density \(\bar{n} = 5 \times 10^{-4} \,h^3\,\mathrm{Mpc}^{-3}\). The input cosmological parameters are \(h = 0.6736, \Omega_{\mathrm{CDM},0} = 0.2645, \Omega_{\mathrm{b},0} = 0.04930, A_s = 2.083 \times 10^{-9}\) and \(n_s = 0.9649\), and the linear power spectrum at redshift \(z = 1\) with linear tracer bias \(b_1 = 2\) is used.

  • The second is a survey-like catalogue catalogue_survey based on the simulation-like one, with the catalogue cut to the inscribing sphere of radius \(L/2\) inside the cubic box.

  • The third is a uniform random catalogue catalogue_rand with number density \(5 \bar{n}\) in the same spherical volume as the survey-like one.

Following the Particle Catalogue tutorial, these catalogues are instantiated as ParticleCatalogue.

Measurements#

Having specified all the inputs, measurements can be made by simply passing them as arguments to the relevant function:

results = compute_bispec(
    catalogue_survey, catalogue_rand,
    paramset=parameter_set,
    logger=trv_logger
)
[2023-10-04 13:37:15 (+00:00:01) INFO] Parameter set have been initialised.
[2023-10-04 13:37:15 (+00:00:01) STAT C++] Parameters validated.
[2023-10-04 13:37:15 (+00:00:01) INFO] Binning has been initialised.
[2023-10-04 13:37:15 (+00:00:01) INFO] Lines of sight have been initialised.
[2023-10-04 13:37:15 (+00:00:01) INFO] Catalogues have been aligned.
[2023-10-04 13:37:15 (+00:00:01) INFO] Preparing catalogue for clustering algorithm... (entering C++)
[2023-10-04 13:37:15 (+00:00:01) INFO C++] Catalogue loaded: ntotal = 259444, wtotal = 259444.000, wstotal = 259444.000 (source=extdata).
[2023-10-04 13:37:16 (+00:00:02) INFO] ... prepared catalogue for clustering algorithm. (exited C++)
[2023-10-04 13:37:15 (+00:00:01) INFO C++] Extents of particle coordinates: {'x': (2.438, 998.884 | 996.446), 'y': (0.879, 998.351 | 997.472), 'z': (0.364, 998.843 | 998.478)} (source=extdata).
[2023-10-04 13:37:16 (+00:00:02) INFO C++] Catalogue loaded: ntotal = 1308287, wtotal = 1308287.000, wstotal = 1308287.000 (source=extdata).
[2023-10-04 13:37:16 (+00:00:02) INFO C++] Extents of particle coordinates: {'x': (0.604, 999.396 | 998.792), 'y': (0.630, 999.370 | 998.740), 'z': (0.426, 999.574 | 999.147)} (source=extdata).
[2023-10-04 13:37:16 (+00:00:02) INFO] Alpha contrast: 1.983082e-01.
[2023-10-04 13:37:16 (+00:00:02) INFO] Normalisation factors: 1.541759e+01 (particle; used), 1.550956e+01 (mesh), 0.000000e+00 (mesh-mixed; n/a).
[2023-10-04 13:37:16 (+00:00:02) INFO] Measuring bispectrum from paired survey-type catalogues in the local plane-parallel approximation... (entering C++)
[2023-10-04 13:37:17 (+00:00:03) INFO] ... measured bispectrum from paired survey-type catalogues in the local plane-parallel approximation. (exited C++)
[2023-10-04 13:37:16 (+00:00:02) STAT C++] Computing bispectrum from paired survey-type catalogues...
[2023-10-04 13:37:17 (+00:00:03) STAT C++] Bispectrum term at orders (m1, m2, M) = (0, 0, 0) computed.
[2023-10-04 13:37:17 (+00:00:03) STAT C++] ... computed bispectrum from paired survey-type catalogues.

Specifying lines of sight#

In the case above, the lines of sight are computed automatically, but one could supply external data arrays as replacements:

# import numpy as np
results = compute_bispec(
    catalogue_survey, catalogue_rand,
    los_data=np.ones((len(catalogue_survey), 3)),
    los_rand=np.ones((len(catalogue_rand), 3)),
    paramset=parameter_set,
    logger=trv_logger
)
[2023-10-04 13:37:17 (+00:00:03) INFO] Parameter set have been initialised.
[2023-10-04 13:37:17 (+00:00:03) STAT C++] Parameters validated.
[2023-10-04 13:37:17 (+00:00:03) INFO] Binning has been initialised.
[2023-10-04 13:37:17 (+00:00:03) INFO] Lines of sight have been initialised.
[2023-10-04 13:37:17 (+00:00:03) INFO] Catalogues have been aligned.
[2023-10-04 13:37:17 (+00:00:03) INFO] Preparing catalogue for clustering algorithm... (entering C++)
[2023-10-04 13:37:17 (+00:00:03) INFO C++] Catalogue loaded: ntotal = 259444, wtotal = 259444.000, wstotal = 259444.000 (source=extdata).
[2023-10-04 13:37:18 (+00:00:04) INFO] ... prepared catalogue for clustering algorithm. (exited C++)
[2023-10-04 13:37:17 (+00:00:03) INFO C++] Extents of particle coordinates: {'x': (2.438, 998.884 | 996.446), 'y': (0.879, 998.351 | 997.472), 'z': (0.364, 998.843 | 998.478)} (source=extdata).
[2023-10-04 13:37:18 (+00:00:04) INFO C++] Catalogue loaded: ntotal = 1308287, wtotal = 1308287.000, wstotal = 1308287.000 (source=extdata).
[2023-10-04 13:37:18 (+00:00:04) INFO C++] Extents of particle coordinates: {'x': (0.604, 999.396 | 998.792), 'y': (0.630, 999.370 | 998.740), 'z': (0.426, 999.574 | 999.147)} (source=extdata).
[2023-10-04 13:37:18 (+00:00:04) INFO] Alpha contrast: 1.983082e-01.
[2023-10-04 13:37:18 (+00:00:04) INFO] Normalisation factors: 1.541759e+01 (particle; used), 1.550956e+01 (mesh), 0.000000e+00 (mesh-mixed; n/a).
[2023-10-04 13:37:18 (+00:00:04) INFO] Measuring bispectrum from paired survey-type catalogues in the local plane-parallel approximation... (entering C++)
[2023-10-04 13:37:19 (+00:00:05) INFO] ... measured bispectrum from paired survey-type catalogues in the local plane-parallel approximation. (exited C++)
[2023-10-04 13:37:19 (+00:00:05) STAT C++] Computing bispectrum from paired survey-type catalogues...
[2023-10-04 13:37:19 (+00:00:05) STAT C++] Bispectrum term at orders (m1, m2, M) = (0, 0, 0) computed.
[2023-10-04 13:37:19 (+00:00:05) STAT C++] ... computed bispectrum from paired survey-type catalogues.

Substituting for parameter set#

One could also override/bypass paramset by passing the relevant/required keyword arguments. In the example below, we directly set the bispectrum multipole degrees and form, the binning and the mesh assignment parameters without a paramset argument; if the paramset argument was set, its entries would be overriden by these keyword arguments.

# DEMO
# import warnings
warnings.filterwarnings('ignore', message=".*default values are unchanged.*")

results = compute_bispec(
    catalogue_survey, catalogue_rand,
    degrees=(1, 1, 0),
    binning=binning,
    form='row',
    idx_bin=5,
    sampling_params={
        'assignment': 'cic',
        'boxsize': [1000.,]*3,
        'ngrid': [64,]*3
    },
    logger=trv_logger
)
[2023-10-04 13:37:19 (+00:00:05) INFO] Validating parameters... (entering C++)
[2023-10-04 13:37:19 (+00:00:05) INFO] ... validated parameters. (exited C++)
[2023-10-04 13:37:19 (+00:00:05) STAT C++] Parameters validated.
[2023-10-04 13:37:19 (+00:00:05) INFO] Parameter set have been initialised.
[2023-10-04 13:37:19 (+00:00:05) INFO] Binning has been initialised.
[2023-10-04 13:37:19 (+00:00:05) INFO] Lines of sight have been initialised.
[2023-10-04 13:37:19 (+00:00:05) INFO] Catalogues have been aligned.
[2023-10-04 13:37:19 (+00:00:05) INFO] Preparing catalogue for clustering algorithm... (entering C++)
[2023-10-04 13:37:20 (+00:00:06) INFO C++] Catalogue loaded: ntotal = 259444, wtotal = 259444.000, wstotal = 259444.000 (source=extdata).
[2023-10-04 13:37:20 (+00:00:06) INFO C++] Extents of particle coordinates: {'x': (2.438, 998.884 | 996.446), 'y': (0.879, 998.351 | 997.472), 'z': (0.364, 998.843 | 998.478)} (source=extdata).
[2023-10-04 13:37:20 (+00:00:06) INFO C++] Catalogue loaded: ntotal = 1308287, wtotal = 1308287.000, wstotal = 1308287.000 (source=extdata).
[2023-10-04 13:37:20 (+00:00:06) INFO C++] Extents of particle coordinates: {'x': (0.604, 999.396 | 998.792), 'y': (0.630, 999.370 | 998.740), 'z': (0.426, 999.574 | 999.147)} (source=extdata).
[2023-10-04 13:37:20 (+00:00:06) INFO] ... prepared catalogue for clustering algorithm. (exited C++)
[2023-10-04 13:37:20 (+00:00:06) INFO] Alpha contrast: 1.983082e-01.
[2023-10-04 13:37:20 (+00:00:07) INFO] Normalisation factors: 1.541759e+01 (particle; used), 1.480691e+01 (mesh), 0.000000e+00 (mesh-mixed; n/a).
[2023-10-04 13:37:20 (+00:00:07) INFO] Measuring bispectrum from paired survey-type catalogues in the local plane-parallel approximation... (entering C++)
[2023-10-04 13:37:21 (+00:00:07) STAT C++] Computing bispectrum from paired survey-type catalogues...
[2023-10-04 13:37:22 (+00:00:08) INFO] ... measured bispectrum from paired survey-type catalogues in the local plane-parallel approximation. (exited C++)
[2023-10-04 13:37:21 (+00:00:07) STAT C++] Bispectrum term at orders (m1, m2, M) = (-1, 1, 0) computed.
[2023-10-04 13:37:22 (+00:00:08) STAT C++] Bispectrum term at orders (m1, m2, M) = (0, 0, 0) computed.
[2023-10-04 13:37:22 (+00:00:08) STAT C++] Bispectrum term at orders (m1, m2, M) = (1, -1, 0) computed.
[2023-10-04 13:37:22 (+00:00:08) STAT C++] ... computed bispectrum from paired survey-type catalogues.

Minor differences#

For other measurement algorithms, the syntax is very similar except for a few minor differences:

  • For two-point statistics, the argument corresponding to degrees above is degree as there is only a single multipole degree. The arguments form and idx_bin do not apply.

  • For global plane-parallel measurements, no random catalogue is required.

  • For window function measurements, only the random catalogue is required.

For full details, please consult the API reference (twopt and threept modules).

Results#

The returned measurement results are dictionaries containing the raw statistic (key with suffix _raw) without shot noise subtraction, the shot noise (key with suffix _shot), the bin centres for each coordinate dimension (keys with suffix _bin), the average/effectuve bin coordinates (keys with suffix _eff), and the number of contributing modes (or analogously pairs) in each bin (key 'nmodes'/'npairs').

# DEMO
from pprint import pprint
pprint(results)
{'bk_raw': array([-6.15401315e+08+3.73327376e-07j,  2.40744325e+08-1.72304943e-07j,
       -2.84005114e+08-2.68329052e-07j, -1.06997676e+08+3.94865494e-08j,
        1.94789754e+07+2.46790934e-08j,  8.29519632e+07+2.33248562e-08j,
       -4.53112276e+06-1.97432747e-08j, -1.06546658e+08+3.58968631e-09j,
       -6.84846274e+07-1.79484315e-08j, -1.14891079e+08-5.38452946e-09j]),
 'bk_shot': array([ -5212454.29561975+6.86692862e-10j,
        -8944908.64663568+1.20018016e-09j,
       -15387868.79029875+1.95566080e-09j,
       -17045970.26450608+2.15266702e-09j,
       -21370062.42415106+2.44739594e-09j,
       -13373351.34139536+8.49465037e-10j,
       -19646302.14048065+2.28874625e-09j,
       -16373586.94542564+2.02459884e-09j,
       -14739949.18356972+1.79683991e-09j,
       -12282473.11528281+1.49875865e-09j]),
 'k1_bin': array([0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06]),
 'k1_eff': array([0.06040971, 0.06040971, 0.06040971, 0.06040971, 0.06040971,
       0.06040971, 0.06040971, 0.06040971, 0.06040971, 0.06040971]),
 'k2_bin': array([0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1 ]),
 'k2_eff': array([0.01149964, 0.02047114, 0.03052422, 0.04062536, 0.05033379,
       0.06040971, 0.07034279, 0.08025683, 0.09014322, 0.10008191]),
 'nmodes_1': array([1896, 1896, 1896, 1896, 1896, 1896, 1896, 1896, 1896, 1896]),
 'nmodes_2': array([  56,  194,  488,  812, 1250, 1896, 2426, 3272, 4016, 5138])}

Saving to files#

In the algorithmic function for each type of measurement, if one sets save='.txt' or save='.npz', the results as a dictionary will be automatically saved to a file in either .txt or .npz format.

If the paramset argument is set to a ParameterSet object, the output directory will be paramset['directories']['measurements'] (an empty output directory path points to the current working directory), and the string paramset['tags']['output'] will be appended to the file name before the extension suffix.

This is demonstrated below for a global plane-parallel power spectrum measurement:

from triumvirate.twopt import compute_powspec_in_gpp_box

# DEMO
parameter_set.update(tags={'output': '_demo'})

results = compute_powspec_in_gpp_box(
    catalogue_sim,
    degree=0, paramset=parameter_set,
    save='.txt', logger=trv_logger
)
[2023-10-04 13:37:22 (+00:00:08) INFO] Parameter set have been initialised.
[2023-10-04 13:37:22 (+00:00:08) STAT C++] Parameters validated.
[2023-10-04 13:37:22 (+00:00:08) STAT C++] Parameters validated.
[2023-10-04 13:37:22 (+00:00:08) INFO] Binning has been initialised.
[2023-10-04 13:37:22 (+00:00:08) INFO] Catalogue box has been periodised.
[2023-10-04 13:37:22 (+00:00:08) INFO] Inserted missing 'nz' field based on particle count and box size.
[2023-10-04 13:37:22 (+00:00:08) INFO] Preparing catalogue for clustering algorithm... (entering C++)
[2023-10-04 13:37:22 (+00:00:08) INFO C++] Catalogue loaded: ntotal = 499214, wtotal = 499214.000, wstotal = 499214.000 (source=extdata).
[2023-10-04 13:37:22 (+00:00:08) INFO] ... prepared catalogue for clustering algorithm. (exited C++)
[2023-10-04 13:37:22 (+00:00:08) INFO C++] Extents of particle coordinates: {'x': (0.002, 999.998 | 999.996), 'y': (0.001, 999.999 | 999.998), 'z': (0.000, 1000.000 | 999.999)} (source=extdata).
[2023-10-04 13:37:22 (+00:00:08) INFO] Normalisation factors: 4.012606e-03 (particle; used), 2.859493e-03 (mesh), 0.000000e+00 (mesh-mixed; n/a).
[2023-10-04 13:37:22 (+00:00:08) INFO] Measuring power spectrum from a simulation-box catalogue in the global plane-parallel approximation... (entering C++)
[2023-10-04 13:37:22 (+00:00:00) STAT C++] Computing power spectrum from a periodic-box simulation-type catalogue in the global plane-parallel approximation.
[2023-10-04 13:37:22 (+00:00:08) INFO] ... measured power spectrum from a simulation-box catalogue in the global plane-parallel approximation. (exited C++)
[2023-10-04 13:37:22 (+00:00:00) STAT C++] ... computed power spectrum from a periodic-box simulation-type catalogue in the global plane-parallel approximation.
[2023-10-04 13:37:22 (+00:00:08) INFO] Measurements saved to pk0_demo.txt.

Let’s have a look at the output measurement file:

# DEMO
with open("pk0_demo.txt", 'r') as results_file:
    print(results_file.read())
# Catalogue source: extdata:4670170976
# Catalogue size: ntotal = 499214, wtotal = 499214.000, wstotal = 499214.000
# Catalogue particle extents: ([0.002, 999.998], [0.001, 999.999], [0.000, 1000.000])
# Box size: [1000.000, 1000.000, 1000.000]
# Box alignment: centre
# Mesh number: [64, 64, 64]
# Mesh assignment and interlacing: tsc, False
# Normalisation factor: 4.012605716e-03 (particle)
# Normalisation factor alternatives: 4.012605716e-03 (particle), 2.859492535e-03 (mesh), 0.000000000e+00 (mesh-mixed)
# [0] k_cen, [1] k_eff, [2] nmodes, [3] Re{pk0_raw}, [4] Im{pk0_raw}, [5] Re{pk0_shot}, [6] Im{pk0_shot}
1.000000000e-02	1.149964290e-02	        56	 3.160464462e+04	 0.000000000e+00	 2.003148950e+03	 0.000000000e+00
2.000000000e-02	2.047114034e-02	       194	 3.881291755e+04	 0.000000000e+00	 2.003148950e+03	 0.000000000e+00
3.000000000e-02	3.052421724e-02	       488	 2.746437066e+04	 0.000000000e+00	 2.003148950e+03	 0.000000000e+00
4.000000000e-02	4.062536317e-02	       812	 2.298652981e+04	 0.000000000e+00	 2.003148950e+03	 0.000000000e+00
5.000000000e-02	5.033378732e-02	      1250	 2.007483268e+04	 0.000000000e+00	 2.003148950e+03	 0.000000000e+00
6.000000000e-02	6.040971138e-02	      1896	 1.820294440e+04	 0.000000000e+00	 2.003148950e+03	 0.000000000e+00
7.000000000e-02	7.034278912e-02	      2426	 1.563805032e+04	 0.000000000e+00	 2.003148950e+03	 0.000000000e+00
8.000000000e-02	8.025683499e-02	      3272	 1.336728653e+04	 0.000000000e+00	 2.003148950e+03	 0.000000000e+00
9.000000000e-02	9.014322302e-02	      4016	 1.174269671e+04	 0.000000000e+00	 2.003148950e+03	 0.000000000e+00
1.000000000e-01	1.000819060e-01	      5138	 1.017119060e+04	 0.000000000e+00	 2.003148950e+03	 0.000000000e+00

We see that a header with summary information about the input parameters and data as well as some intermediary results has also been included in the saved file.

Analogously, with the save='.npz' output format, we would have

# DEMO
with np.load("pk0_demo.npz", allow_pickle=True) as results_file:
    print(results_file['header'])
Catalogue source: extdata:4670170976
Catalogue size: ntotal = 499214, wtotal = 499214.000, wstotal = 499214.000
Catalogue particle extents: ([0.002, 999.998], [0.001, 999.999], [0.000, 1000.000])
Box size: [1000.000, 1000.000, 1000.000]
Box alignment: centre
Mesh number: [64, 64, 64]
Mesh assignment and interlacing: tsc, False
Normalisation factor: 4.012605716e-03 (particle)
Normalisation factor alternatives: 4.012605716e-03 (particle), 2.859492535e-03 (mesh), 0.000000000e+00 (mesh-mixed)
[0] k_cen, [1] k_eff, [2] nmodes, [3] Re{pk0_raw}, [4] Im{pk0_raw}, [5] Re{pk0_shot}, [6] Im{pk0_shot}