.. _ram-estimate:

The RAM estimator
==================

Large 3D models can require a substantial amount of memory, and finding this
out only after a job has been queued (or has crashed out-of-memory) is costly.
``ram_estimate.py``, located in the repository root next to ``params.dat``,
estimates the peak RAM a 3D model will need **before you run it**, so a job can
be sized correctly for a cluster or workstation ahead of time.

The script reads your ``params.dat`` (coolant list, HEALPix level) and
``src/config.mk`` (``RAYTHEIA`` mode, ``NETWORK``, ``CHEMANALYSIS``, and the
other compile flags that affect the per-cell memory layout), fetches
``NLEV``/``NTEMP`` from each coolant's LAMDA file, and reports a per-component
RAM breakdown for a grid of the resolution you specify. It is a plain Python 3
script with no dependencies beyond the standard library. ``gfortran`` is used
opportunistically, to measure exact Fortran struct sizes; see ``--no-probe``
below.

Usage
-----

.. code-block:: console

   $ python3 ram_estimate.py --res 256 --raytheia 1 --lmax 13

``params.dat`` in the current directory is read by default; pass
``--params /path/to/params.dat`` to point at a different model.

``--res N`` (required) sets the grid resolution (:math:`N^3` cells). The
``RAYTHEIA`` mode and the chemistry network are normally read automatically
from ``src/config.mk``, and the HEALPix level from ``params.dat``, but these can
be overridden on the command line to explore "what if" scenarios. For example,
you can check the RAM at a higher HEALPix level with ``--healpix``, or with
``--lmax`` capping the number of coolant energy levels (see
:ref:`levmax-flag`), before committing to a run.

Useful options:

.. list-table::
   :header-rows: 1
   :widths: 15 40

   * - Option
     - Purpose
   * - ``--res N``
     - Grid resolution, :math:`N^3` cells (required).
   * - ``--params FILE``
     - Path to ``params.dat`` (default: ``params.dat``).
   * - ``--healpix L``
     - HEALPix level override (default: read from ``params.dat``).
   * - ``--lmax N``
     - Cap energy levels per coolant, matching the runtime ``-lmax=N`` flag
       (default: no cap). See :ref:`levmax-flag`.
   * - ``--raytheia {0,1,2}``
     - Ray-tracing mode matching ``config.mk RAYTHEIA=`` (default: read from
       ``config.mk``).
   * - ``--threads T``
     - OpenMP thread count, for the peak temporary (``evalpop``) estimate.
   * - ``--chemanalysis {0,1}``
     - Override the ``CHEMANALYSIS`` flag (default: read from ``config.mk``).
   * - ``--srcdir DIR``
     - Path to ``src/``, used to locate ``config.mk`` and to measure exact
       struct sizes (default: ``/src``).
   * - ``--no-probe``
     - Skip compiling the struct-size probe with ``gfortran``; use a rougher
       fallback instead.

Run ``python3 ram_estimate.py --help`` for the full list of options.

What it reports
----------------

For the requested resolution, the tool prints:

- A per-coolant table of energy levels used (after any ``-lmax`` cap).
- A persistent-RAM breakdown across every component that scales with grid
  size: coolant level-population arrays, ray-tracing path arrays, chemical
  abundances, the Fortran struct and pointer-descriptor overhead (measured
  directly via ``gfortran`` when available), static LAMDA data, and, when
  relevant, the ``CHEMANALYSIS`` ``temp_rate(nreac,pdr_ptot)`` array, which is
  disproportionately large and is explicitly flagged as "not recommended for
  3D models" in ``config.mk`` (see :doc:`makefile`).
- The peak *temporary* RAM used during cooling-function evaluation
  (``evalpop``, scaled by ``--threads``).
- A HEALPix-level comparison table, since ray-tracing memory scales as
  :math:`12 \times 4^L` per cell and is usually the first thing worth
  reconsidering if a model does not fit in the available memory.

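
As a rough illustration of the :math:`12 \times 4^L` scaling, the following
sketch tabulates the rays per cell and a ray-array footprint for a
:math:`256^3` grid. It is not part of ``ram_estimate.py``: the function names
and the ``bytes_per_ray`` value (one 8-byte number per ray) are assumptions
made for illustration, and the real per-ray footprint depends on the
``RAYTHEIA`` mode and on which arrays are allocated.

.. code-block:: python

   def healpix_rays(level: int) -> int:
       """Number of HEALPix rays per cell at a given level: 12 * 4**level."""
       if level < 0:
           raise ValueError("HEALPix level must be non-negative")
       return 12 * 4 ** level


   def ray_memory_gib(res: int, level: int, bytes_per_ray: int = 8) -> float:
       """Rough ray-array memory for a res**3 grid, in GiB.

       bytes_per_ray is an illustrative assumption, not a value taken
       from the code.
       """
       return res ** 3 * healpix_rays(level) * bytes_per_ray / 2 ** 30


   if __name__ == "__main__":
       # Compare the first few HEALPix levels at a fixed 256^3 resolution.
       for level in range(4):
           print(f"L={level}: {healpix_rays(level):4d} rays/cell, "
                 f"{ray_memory_gib(256, level):7.2f} GiB at 256^3")

Each increment of the HEALPix level multiplies this component by four,
whatever per-ray size is assumed.
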
Notes on accuracy
-----------------

The estimate is derived directly from the Fortran source
(``initialization.F90``, ``allocations.F90``, ``modules.F90``, ``3DPDR.F90``),
not from a fixed formula, so it stays correct as the code evolves:

- Per-cell struct overhead (the array descriptors inside the internal
  grid-point and coolant data structures) is measured by compiling a small
  probe against your actual ``src/modules.F90`` with the same preprocessor
  flags as your build, rather than guessed.
- The script inspects ``3DPDR.F90`` to check whether the per-cell
  cooling/heating arrays are allocated efficiently, and adjusts the estimate
  accordingly (overridable with ``--coolheat-fixed``).

.. note::

   Expect the reported total to be within a few percent of observed usage.
   The remainder is typically OS/allocator overhead (heap alignment and
   bookkeeping across the many small per-cell allocations) that isn't
   practical to model exactly, plus MPI buffers, Fortran runtime overhead, and
   CVODE solver workspace; none of these are included in the estimate.
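
Because those components are left out, it can be prudent to pad the reported
total when requesting memory from a scheduler. A minimal sketch, assuming a
hypothetical helper ``padded_request_gib`` and an arbitrary 10% default margin
(neither comes from ``ram_estimate.py``; choose a margin that matches what you
observe on your own system):

.. code-block:: python

   import math


   def padded_request_gib(estimate_gib: float, margin: float = 0.10) -> int:
       """Round an estimate up to whole GiB after adding a safety margin.

       margin is a fraction (0.10 = 10%) and is an assumed value, meant to
       cover overheads the estimate does not include.
       """
       if estimate_gib < 0 or margin < 0:
           raise ValueError("estimate and margin must be non-negative")
       return math.ceil(estimate_gib * (1.0 + margin))


   if __name__ == "__main__":
       # Example: a 64 GiB estimate padded by 25% before submitting the job.
       print(padded_request_gib(64.0, margin=0.25), "GiB")
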