charm_glob
Module defining global CHarm variables.
The global variables are declared in charm_glob.h and are initialized to safe values. Should you need to modify them, include charm_glob.h (or charm.h) in your code and assign to the variables (without defining them again) whatever values you prefer. From then on, every CHarm routine that uses the global variables will use your values instead of the defaults.
Warning
This module is for experienced users only. Most users should not interact with it.
Note
This documentation is written for the double precision version of CHarm.
Thresholds used to compare floating point numbers
-
double charm_glob_threshold
Threshold to judge whether two floating point numbers are equal, 100.0 * EPS, where EPS is the machine epsilon of the floating point data type (float, double or __float128). Experienced users may, under very specific circumstances, need to tighten or relax the threshold.
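To illustrate how a threshold of this kind is typically applied, here is a minimal sketch in C. The helper nearly_equal() and its relative-difference formula are assumptions made for illustration; they are not CHarm's internal comparison routine. Only the value 100.0 * EPS, with EPS taken as DBL_EPSILON for double precision, comes from the description above.

```c
#include <float.h>
#include <math.h>

/* Value described above for the double precision build. */
const double default_threshold = 100.0 * DBL_EPSILON;

/* Hypothetical helper, not part of CHarm: returns 1 if a and b differ by
 * no more than `threshold`, measured relative to the larger magnitude.
 * The scale has a floor of 1.0, so values near zero are compared
 * absolutely. */
int nearly_equal(double a, double b, double threshold)
{
    double abs_a = fabs(a);
    double abs_b = fabs(b);
    double scale = (abs_a > abs_b) ? abs_a : abs_b;

    if (scale < 1.0)
        scale = 1.0;

    return fabs(a - b) <= threshold * scale;
}
```

With the default threshold, 1.0 and 1.0 + DBL_EPSILON compare as equal, whereas 1.0 and 1.0 + 1e-10 do not.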
-
double charm_glob_threshold2
A more relaxed threshold to judge whether two floating point numbers are equal, 100000.0 * EPS. It is used to check whether the longitudes of custom user-defined grids have a constant longitudinal step (required by the FFT-based algorithms). The check relies on double differences. For high-resolution grids, the longitudinal step may be very small, so two consecutive longitudes are nearly identical. Since double differences amplify numerical errors, the threshold needs to be sufficiently large in such situations (yet still tight enough to detect a non-constant step). The current value was found to be sufficient for quadrature grids associated with degrees as high as 70,000.
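The check described above can be sketched as follows. This is an illustration only: the function has_constant_step() is hypothetical, and reading "double differences" as second-order differences of consecutive longitudes is an assumption, not a description of CHarm's source code.

```c
#include <stddef.h>
#include <float.h>
#include <math.h>

/* Relaxed threshold described above for the double precision build. */
const double relaxed_threshold = 100000.0 * DBL_EPSILON;

/* Hypothetical check, not CHarm's implementation: returns 1 if the n
 * longitudes in lon[] are equally spaced, judged by the second-order
 * ("double") differences (lon[i+2] - lon[i+1]) - (lon[i+1] - lon[i]). */
int has_constant_step(const double *lon, size_t n, double threshold)
{
    for (size_t i = 0; i + 2 < n; i++)
    {
        double dd = (lon[i + 2] - lon[i + 1]) - (lon[i + 1] - lon[i]);

        if (fabs(dd) > threshold)
            return 0;
    }

    /* Fewer than three longitudes cannot violate the condition. */
    return 1;
}
```

Each second-order difference subtracts two nearly equal steps, which is why rounding errors are amplified and a threshold looser than 100.0 * EPS is needed.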
Parameters defining the polar optimization
CHarm can be told to skip, during spherical harmonic synthesis/analysis, the computation of all spherical harmonics that satisfy an inequality controlled by two tuning parameters, a1 and a2 (see charm_glob_polar_optimization_a1 and charm_glob_polar_optimization_a2).
If the parameters are tuned reasonably, the polar optimization can improve the computation speed without deteriorating the output accuracy. This is because the inequality filters out spherical harmonics whose magnitudes in the polar areas are so small that they do not contribute to the result (within the numerical precision). However, if the parameters are set unwisely, the accuracy can be compromised.
If the second polar optimization parameter charm_glob_polar_optimization_a2 is negative (which is the default), no polar optimization is applied, that is, all spherical harmonics are evaluated, regardless of how small their magnitudes are. If it is non-negative, the polar optimization is applied.
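A minimal sketch of such a skip test is given below. It assumes that the criterion has the form m - max(a1, a2 * nmax) > nmax * sin(theta), where m is the harmonic order, nmax the maximum degree and theta the co-latitude. This form is modelled on the approach of Reinecke and Seljebotn (2013) and is an assumption of this example, not a statement of CHarm's exact inequality. The function skip_order() is hypothetical; only the rule that a negative a2 disables the optimization is taken from the text above.

```c
/* Hypothetical predicate, not CHarm's code. ASSUMED criterion: the
 * harmonic of order m is skipped at a latitude with sine of co-latitude
 * sin_theta if
 *
 *     m - max(a1, a2 * nmax) > nmax * sin_theta.
 *
 * Returns 1 if the harmonic would be skipped and 0 otherwise. */
int skip_order(unsigned long m, unsigned long nmax, double sin_theta,
               unsigned long a1, double a2)
{
    /* Negative a2 (the default): polar optimization is switched off. */
    if (a2 < 0.0)
        return 0;

    double offset = a2 * (double)nmax;

    if (offset < (double)a1)
        offset = (double)a1;

    return ((double)m - offset) > ((double)nmax * sin_theta);
}
```

Under this assumed form, nothing is skipped at the equator (sin_theta = 1), because m never exceeds nmax. Close to the poles, where sin_theta is small, high orders are dropped.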
In double precision, the following values of the tuning parameters improve the computation speed and, in most cases, can safely be used without deteriorating the accuracy (Reinecke and Seljebotn, 2013): charm_glob_polar_optimization_a1 = 100 and charm_glob_polar_optimization_a2 = 0.01.
In single and quadruple precision, different values of the tuning parameters are needed. Currently, we do not provide any recommendations, though.
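The following sketch shows what switching the polar optimization on and off could look like in user code. The two variable definitions at the top are stand-ins that let the example compile without CHarm installed; the helper functions are hypothetical. In a real program, the stand-ins would be removed and charm_glob.h (or charm.h) included instead, because the library already defines the variables and user code only assigns to them.

```c
/* Stand-ins so that this sketch compiles without CHarm. In real code,
 * delete these two definitions and include charm_glob.h (or charm.h):
 * the variables must be assigned to, never defined again. */
unsigned long charm_glob_polar_optimization_a1 = 100;
double charm_glob_polar_optimization_a2 = -1.0;

/* Hypothetical helper: enables the polar optimization with the double
 * precision values quoted above. */
void enable_polar_optimization(void)
{
    charm_glob_polar_optimization_a1 = 100;
    charm_glob_polar_optimization_a2 = 0.01;
}

/* Hypothetical helper: restores the default, that is, no polar
 * optimization (negative a2). */
void disable_polar_optimization(void)
{
    charm_glob_polar_optimization_a2 = -1.0;
}
```

Because the variables are global, the new values apply to every subsequent CHarm call that uses them, so it is worth restoring the defaults once the optimized computation is done.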
References:
Reinecke, M., Seljebotn, D. S. (2013) Libsharp - spherical harmonic transforms revisited. Astronomy and Astrophysics 554, A112, doi: 10.1051/0004-6361/201321494.
-
unsigned long charm_glob_polar_optimization_a1
Polar optimization parameter (default is 100).
-
double charm_glob_polar_optimization_a2
Polar optimization parameter (default is -1.0, that is, no polar optimization).
MPI specific global variables
Note
The variables that follow are available only when CHarm is compiled with MPI support (--enable-mpi, refer to charm_mpi for further details).
-
unsigned long charm_glob_shc_block_nmax_multiplier
Multiplying this variable by the maximum harmonic degree of the spherical harmonic coefficients increased by 1 specifies the maximum number of coefficients that can be exchanged between MPI processes during spherical harmonic synthesis/analysis with a distributed charm_shc.
Assume that charm_shc.distributed = 1. During the spherical harmonic synthesis/analysis, at most (charm_shc.nmax + 1) * charm_glob_shc_block_nmax_multiplier spherical harmonic coefficients \(\bar{C}_{nm}\) and the same number of \(\bar{S}_{nm}\) coefficients will be sent between MPI processes. Too small a value of charm_glob_shc_block_nmax_multiplier causes excessive overhead due to MPI calls. On the other hand, too large a value consumes too much RAM on each MPI process.
The default value is 1000. The value must be larger than zero. If it is zero, CHarm will use the value 1.
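The upper bound described above can be computed with a few lines of C. The function max_coeffs_per_exchange() is a hypothetical helper written for this illustration and is not part of CHarm; it only restates the formula and the zero-to-one fallback.

```c
/* Hypothetical helper, not part of CHarm: upper bound on the number of
 * C_nm coefficients (and, separately, of S_nm coefficients) exchanged
 * between MPI processes, following (nmax + 1) * multiplier. */
unsigned long max_coeffs_per_exchange(unsigned long nmax,
                                      unsigned long multiplier)
{
    /* Documented fallback: a zero multiplier is treated as 1. */
    if (multiplier == 0)
        multiplier = 1;

    return (nmax + 1) * multiplier;
}
```

For example, nmax = 2190 with the default multiplier of 1000 gives a bound of 2,191,000 coefficients of each type, which is roughly 17.5 MB per type if each coefficient is stored as an 8-byte double.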
-
size_t charm_glob_sha_block_lat_multiplier
This variable helps to control the number of for loop iterations over latitudes in charm_sha_point(). It has an enormous impact on the performance of charm_sha_point() when CHarm is compiled with MPI support.
Let x be the number of latitudes over which the for loop in charm_sha_point() runs. Usually, this number is about half the number of latitudes of the quadrature grid, owing to the equatorial symmetry of quadrature grids. Furthermore, let b be the value of charm_glob_sha_block_lat_multiplier. Finally, let s be the size of SIMD vectors (can be determined by charm_misc_buildopt_simd_vector_size()) and o be the number of OpenMP threads (if OpenMP parallelization is disabled, then o = 1). The for loop in charm_sha_point() will run roughly ceil(x / (b * s * o)) times. Thus, by increasing b (and optionally also o), the number of loop runs can be decreased.
The optimal value depends on your hardware (e.g., the number of shared-memory computing nodes, the network connection speed between the nodes, etc.) and also on the number of latitudes. Both too low and too high values may drastically deteriorate the performance.
If the value is too low, the for loop will run many times. This is a serious problem, because with each loop iteration, all spherical harmonic coefficients need to be distributed among all MPI processes within an MPI communicator. So by increasing charm_glob_sha_block_lat_multiplier, you can reduce the number of times the spherical harmonic coefficients are sent between MPI processes, which is generally desirable.
If the value is too high, CPU caching within a single computing node may deteriorate significantly, thereby decreasing the performance.
The optimum value should be determined by trial and error.
The parameter affects charm_sha_point() regardless of whether or not charm_shc and charm_point are distributed. The value must be larger than 0. If it is 0, CHarm will use the value 1.
The default value is 4.
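The estimate ceil(x / (b * s * o)) can be evaluated with integer arithmetic to get a feel for how the multiplier changes the number of loop runs before benchmarking. The function approx_loop_runs() below is a hypothetical helper, not a CHarm routine. It assumes s >= 1 and o >= 1 and gives only the rough count stated above.

```c
#include <stddef.h>

/* Hypothetical helper, not part of CHarm: rough number of runs of the
 * latitude loop, ceil(x / (b * s * o)), where x is the number of
 * latitudes, b the block multiplier, s the SIMD vector size and o the
 * number of OpenMP threads. Assumes s >= 1 and o >= 1. */
size_t approx_loop_runs(size_t x, size_t b, size_t s, size_t o)
{
    /* Documented fallback: a zero multiplier is treated as 1. */
    if (b == 0)
        b = 1;

    size_t block = b * s * o; /* latitudes processed per loop run */

    return (x + block - 1) / block; /* integer form of ceil(x / block) */
}
```

With x = 1000, s = 4 and o = 8, the default b = 4 yields about 8 loop runs, and doubling b to 8 halves that to about 4. Each run saved removes one round of coefficient distribution among the MPI processes.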
-
size_t charm_glob_shs_block_lat_multiplier
The same as charm_glob_sha_block_lat_multiplier but for spherical harmonic synthesis in charm_shs_point().
The default value is 8.