
Deplete: Flux mode (mg-flux); faster collapse via SparseXSTable #3980

Open
yrrepy wants to merge 9 commits into openmc-dev:develop from yrrepy:microxs_mg-flux_faster-collapse_SparseXSTable_v2-Couple

@yrrepy yrrepy commented Jun 24, 2026

Contributor

Description

Currently, when Deplete collapses a multigroup flux in flux mode, from_multigroup_flux reads the CE XS on the HDF5 (ACE) energy grid and averages those points onto the collapse_energies group structure, using a flat-in-bin approximation within each energy group. This yields the σ_g that is then collapsed with the φ_g.
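The flat-in-bin average described above can be sketched in a few lines of NumPy. This is only an illustration, not the code in this PR: it assumes lin-lin interpolation between the tabulated points, and the function name flat_group_xs and the toy data are hypothetical.

```python
import numpy as np

def flat_group_xs(e_grid, xs, group_edges):
    """Flat-weight group average: sigma_g = (1/dE_g) * integral of sigma(E) over group g.

    Assumes lin-lin interpolation between tabulated points (an assumption of
    this sketch) and group edges that lie inside the tabulated energy range.
    """
    e_grid = np.asarray(e_grid, dtype=float)
    xs = np.asarray(xs, dtype=float)
    edges = np.asarray(group_edges, dtype=float)
    sigma_g = np.empty(len(edges) - 1)
    for g in range(len(edges) - 1):
        lo, hi = edges[g], edges[g + 1]
        # Tabulated points strictly inside the group, plus the two group edges
        inside = e_grid[(e_grid > lo) & (e_grid < hi)]
        pts = np.concatenate(([lo], inside, [hi]))
        vals = np.interp(pts, e_grid, xs)
        # The trapezoid rule is exact for piecewise-linear data
        integral = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(pts))
        sigma_g[g] = integral / (hi - lo)
    return sigma_g

# Toy pointwise data (hypothetical values) and a two-group structure
e_grid = np.array([1.0, 2.0, 4.0, 10.0])
xs = np.array([4.0, 2.0, 2.0, 8.0])
edges = np.array([1.0, 3.0, 10.0])

sigma_g = flat_group_xs(e_grid, xs, edges)

# One-group collapse with a normalized group flux
phi = np.array([3.0, 1.0])
one_group = sigma_g @ (phi / phi.sum())
```

Note that sigma_g depends only on the nuclear data and the group structure, not on the flux, which is what makes it reusable across materials.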

It repeats this acquisition from the CE XS for every material, nuclide, and reaction, so the group cross sections are re-derived from the continuous-energy data on every material.
When depleting 10,000+ material domains (as is customary for a large R2S mesh or other voxel-wise activation problem), this becomes quite slow. The refactoring below achieves collapse speedups of 6-60x.

This PR refactors the collapse so that the acquisition from the CE XS is done only once, and the result populates a SparseXSTable. A new group_xs C API exposes the flux-independent group-averaged XS; it shares one kernel with collapse_rate. The SparseXSTable holds every σ_g the problem needs (one row per nuclide and reaction), and this matrix is then collapsed with the φ_g of each material to form the one-group collapsed cross sections.

Pseudo-code, illustrating the structure of the loops:

Existing

for material i in 1..N:                     # correct: flux φ_i differs per domain
    with TemporarySession():                # init OpenMC + reload XS data
        for nuclide in nuclides:          
            for reaction in reactions:     
                # collapse_rate re-derives σ̄_g from CE data here, every time —
                # flux-INDEPENDENT, yet recomputed for every material  ← waste
                xs[nuc, rxn] = collapse_rate(mt, T, energies, flux_i)

This PR

# STEP 1 — flux-independent averaging, done ONCE
table = _build_xs_table_ce(nuclides, reactions, energies, T)
#   for nuclide × reaction:  row = group_xs(mt, T, energies)   # σ̄_g
#   keep only non-zero rows  ->  _SparseXSTable.xs_matrix [n_pairs × G]

# STEP 2 — collapse, still per material-domain
for material i in 1..N:                     # still per-material (φ_i)
    phi      = flux_i / flux_i.sum()
    micros_i = table.collapse(phi)          # xs_matrix @ phi  ≡  Σ_g σ̄_g·φ_g  (one mat-vec)

New implementation is compatible with the Independent and Coupled Operators.
Note that the SparseXSTable is unique to a group-structure + XS temperature.
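To make the build-once, collapse-many structure concrete, here is a small NumPy sketch. The class SparseTable is a hypothetical stand-in: the internals of the PR's _SparseXSTable are not shown here, so the row filtering and the batched product below are assumptions based on the pseudo-code above.

```python
import numpy as np

class SparseTable:
    """Hypothetical stand-in for _SparseXSTable (internals assumed)."""

    def __init__(self, group_xs_by_pair):
        # Keep only (nuclide, reaction) pairs whose group XS row is non-zero
        kept = [(pair, np.asarray(row, dtype=float))
                for pair, row in group_xs_by_pair.items() if np.any(row)]
        self.pairs = [pair for pair, _ in kept]
        n_groups = len(next(iter(group_xs_by_pair.values())))
        self.xs_matrix = (np.vstack([row for _, row in kept])
                          if kept else np.zeros((0, n_groups)))

    def collapse(self, phi):
        # One matrix-vector product: sum over g of sigma_g * phi_g, per pair
        return self.xs_matrix @ phi

# Toy group cross sections (hypothetical values), 3 energy groups
rows = {
    ("U235", "fission"): [10.0, 2.0, 1.0],
    ("U238", "(n,2n)"): [0.0, 0.0, 0.5],   # threshold reaction
    ("H1", "fission"): [0.0, 0.0, 0.0],    # all-zero row, dropped
}
table = SparseTable(rows)                   # built once

# Two material domains, each with its own group flux
fluxes = np.array([[1.0, 1.0, 2.0],
                   [0.0, 3.0, 1.0]])
phi = fluxes / fluxes.sum(axis=1, keepdims=True)

per_material = np.array([table.collapse(p) for p in phi])
batched = phi @ table.xs_matrix.T           # same result, shape (N, n_pairs)
```

The per-material loop and the batched product give the same numbers; the batched form only replaces N mat-vecs with one mat-mat product.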

Benchmarking

I have benchmarked the existing method and this PR on the FNG model.
collapse_speed_fng.py
fomg_16k.npy.tar.gz
fng_neutron.xml, bounding_boxes.json are needed from https://zenodo.org/records/10660030
Any cross-sections.xml and chain.xml should do.

FNG: 7 material types, 54 nuclides, 36 reactions
Using energy structures: UKAEA-1102 and FOMG-16k
Varying mesh size: 10k, 20k, 40k, 60k, 80k.
Full OMP for transport, no MPI. The collapse step does not benefit from any parallelization.
Speed-ups are on the order of 6x for FOMG-16k and 20-60x for UKAEA-1102.
Collapse processing time for 40k+ materials decreases from ~30 minutes to about a minute.

Microxs results here (and elsewhere in tests) are bitwise identical and unchanged by the new collapse method.

Background

This feature comes from my GENDF Isomeric workflow
https://github.com/yrrepy/openmc/tree/iso-gendf-mpi_Sync2b-bis
1284c19
4f4de43

The faster collapse in GENDF came from this SparseXSTable, not from GENDF being inherently faster to work with.

I am trying to split off the most useful features from that workflow, which will hopefully pave the way for its integration.

Future Plans

  1. Currently, Deplete flux mode only does the "flat-in-bin" approximation. A follow-up PR will mirror the weighting options in PREPRO:
    http://redcullen1.net/homepage.new/Papers/PREPRO2023/PREPRO2023.pdf
    a) Input an arbitrary tabulated linearly interpolable weighting spectrum, or,
    b) “flat” or constant weight in each group, or,
    c) 1/E across the entire energy range, or,
    d) Maxwellian to 1/E to Fission to Constant
  2. Next up from the GENDF workflow is a global_microxs_hdf5 that is MPI optimized.
    Each MPI rank reads only its local material slices from the microxs.hdf5 (which differs from the r2s.py microxs.hdf5). This reduces the amount of RAM per MPI rank, enabling more MPI ranks to be run in parallel. (Generally, Deplete is limited more by RAM than by available CPU ranks.)
    https://github.com/yrrepy/openmc/tree/Enhance-Deplete-MPI
    1c0961c
    244b251

  3. GENDF isomeric XS depletion reuses this exact collapse machinery (SparseXSTable)
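For the weighting-spectrum follow-up in item 1, the generalization is σ_g = ∫σ(E)·w(E) dE / ∫w(E) dE over each group, with flat-in-bin as the special case w(E) = 1. The sketch below is a rough illustration under my own assumptions, not the planned implementation: weighted_group_xs, the n_sub parameter, and the log-spaced sub-grid quadrature are all hypothetical choices.

```python
import numpy as np

def weighted_group_xs(e_grid, xs, group_edges, weight, n_sub=2001):
    """Group average with an intra-group weighting spectrum w(E).

    sigma_g = integral(sigma * w) / integral(w) over each group, evaluated
    with a trapezoid rule on a log-spaced sub-grid (a simplification chosen
    for this sketch). `weight` is a callable taking an energy array.
    """
    edges = np.asarray(group_edges, dtype=float)
    out = np.empty(len(edges) - 1)
    for g in range(len(edges) - 1):
        e = np.geomspace(edges[g], edges[g + 1], n_sub)
        w = weight(e)
        s = np.interp(e, e_grid, xs)       # lin-lin interpolation assumed
        de = np.diff(e)
        num = np.sum(0.5 * (s[1:] * w[1:] + s[:-1] * w[:-1]) * de)
        den = np.sum(0.5 * (w[1:] + w[:-1]) * de)
        out[g] = num / den
    return out

# Toy data: sigma(E) = E on [1, 100], two groups
e_grid = np.array([1.0, 100.0])
xs = np.array([1.0, 100.0])
edges = np.array([1.0, 10.0, 100.0])

flat = weighted_group_xs(e_grid, xs, edges, lambda e: np.ones_like(e))
one_over_e = weighted_group_xs(e_grid, xs, edges, lambda e: 1.0 / e)
```

With the 1/E weight the group averages shift toward the low-energy end of each group, as expected. Only the table-building step changes; the collapse against φ_g stays the same matrix-vector product.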

TL;DR

Flux-mode depletion re-derived group cross sections from CE data per material; this builds them once into a SparseXSTable and reuses them. ~6–60× faster collapse, identical results.

Checklist

  • I have performed a self-review of my own code
  • I have run clang-format (version 18) on any C++ source files (if applicable)
  • I have followed the style guidelines for Python source files (if applicable)
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)

yrrepy added 9 commits June 23, 2026 20:49
Add Reaction::group_xs and Nuclide::group_xs, exposed via a new
openmc_nuclide_group_xs C API, to compute the group-averaged cross
section in each energy group -- a flat-in-bin (constant intra-group
flux) average. A group cross section table can then be built once and
reused to collapse the flux across many domains.

Reaction::collapse_rate makes the same flat-in-bin assumption, so the
two share one union-grid panel-walk kernel (for_each_panel, after
NJOY/GROUPR's panel) rather than duplicating the integration loop;
collapse_rate stays bit-identical.
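As an illustration of the shared-kernel idea, here is a Python sketch of a union-grid panel walk that both a group average and a collapsed rate can reuse. The real kernel is in C++ and is not reproduced here: the function bodies, the lin-lin interpolation, and treating the cross section as zero outside the tabulated range are assumptions of this sketch.

```python
import numpy as np

def for_each_panel(e_grid, xs, group_edges):
    """Yield (g, e_lo, e_hi, xs_lo, xs_hi) for every union-grid panel.

    The union grid merges the XS energy points with the group edges, so no
    panel straddles a group boundary.
    """
    e_grid = np.asarray(e_grid, dtype=float)
    xs = np.asarray(xs, dtype=float)
    edges = np.asarray(group_edges, dtype=float)
    union = np.union1d(e_grid, edges)
    union = union[(union >= edges[0]) & (union <= edges[-1])]
    # Assumption: XS is zero outside the tabulated range
    vals = np.interp(union, e_grid, xs, left=0.0, right=0.0)
    g = 0
    for i in range(len(union) - 1):
        while union[i] >= edges[g + 1]:    # advance to the group holding this panel
            g += 1
        yield g, union[i], union[i + 1], vals[i], vals[i + 1]

def group_xs(e_grid, xs, group_edges):
    """Flat-in-bin group average per group."""
    edges = np.asarray(group_edges, dtype=float)
    out = np.zeros(len(edges) - 1)
    for g, e_lo, e_hi, s_lo, s_hi in for_each_panel(e_grid, xs, edges):
        out[g] += 0.5 * (s_lo + s_hi) * (e_hi - e_lo)
    return out / np.diff(edges)

def collapse_rate(e_grid, xs, group_edges, flux):
    """Sum over groups of flux_g times the flat-in-bin group average."""
    edges = np.asarray(group_edges, dtype=float)
    widths = np.diff(edges)
    total = 0.0
    for g, e_lo, e_hi, s_lo, s_hi in for_each_panel(e_grid, xs, edges):
        total += flux[g] * 0.5 * (s_lo + s_hi) * (e_hi - e_lo) / widths[g]
    return total

# Toy data (hypothetical values)
e_grid = [1.0, 2.0, 4.0, 10.0]
xs = [4.0, 2.0, 2.0, 8.0]
edges = [1.0, 3.0, 10.0]
flux = [3.0, 1.0]

gx = group_xs(e_grid, xs, edges)
rate = collapse_rate(e_grid, xs, edges, flux)
```

Because both functions consume the same panels, sum_g(flux_g * group_xs_g) and collapse_rate agree up to floating-point summation order.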
Wrap the openmc_nuclide_group_xs C API in openmc.lib.Nuclide.group_xs,
returning the group-averaged cross section in each energy group as a
NumPy array of length len(energy) - 1.

In get_microxs_and_flux(reaction_rate_mode='flux'), build the
group-averaged cross sections once into a _SparseXSTable via
_build_xs_table_ce and collapse every domain with a single
matrix-vector product (_collapse_fluxes), instead of re-deriving the
group cross sections from continuous-energy data for each domain.
Because the group cross sections are domain-independent -- a
flat-in-bin average of the continuous-energy data -- this is ~N times
faster for N domains and reproduces the previous result to within
floating-point summation order. The collapse loads cross sections from
the same library the model resolved, matching the transport solve.

MicroXS.from_multigroup_flux is now the single-flux case of the same
engine (_flux_collapse_micros); it additionally validates that the
flux is finite and non-negative, raising ValueError. The 'direct' mode
and reaction_rate_opts paths are untouched.

Only Independent Operator microxs, for now.

Test the new continuous-energy group cross section path:

- group_xs reproduces collapse_rate: sum_g(flux_g * group_xs_g) matches
  collapse_rate over several independent fluxes, with a scale guard that
  the values are per-group averages (barns), not the raw integral.
- _build_xs_table_ce + _collapse_fluxes (and the delegated
  from_multigroup_flux) reproduce the old per-domain collapse_rate
  result, including a threshold reaction and a zero-flux domain.
- _collapse_fluxes rejects non-finite and negative flux and returns a
  zero MicroXS for a zero-sum flux.

Overload from_multigroup_flux to accept a 2-D batch (or list of 1-D
fluxes) and build the group XS table once for all of them, returning a
list of MicroXS. Add a keyword-only cross_sections argument and resolve
the chain lazily. Delete _flux_collapse_micros and rewire
get_microxs_and_flux to call from_multigroup_flux directly.

Group materials sharing an (energy structure, temperature) and collapse
each group's fluxes in a single batched from_multigroup_flux call, so
the cross section table is built once per distinct group instead of once
per material. Output order is preserved.
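The grouping step can be sketched as follows. This is a hypothetical illustration rather than the code in this commit: collapse_grouped and the build_table callback are made-up names, and build_table stands in for whatever builds the group XS table for one (energy structure, temperature) key.

```python
from collections import defaultdict
import numpy as np

def collapse_grouped(materials, build_table):
    """Collapse every material, building one table per distinct key.

    materials: list of (energy_edges, temperature, flux) tuples.
    build_table(edges, temperature): hypothetical callback returning an
    (n_pairs, G) array of group cross sections.
    Returns one collapsed vector per material, in input order.
    """
    # Bucket material indices by (energy structure, temperature)
    groups = defaultdict(list)
    for idx, (edges, temp, _) in enumerate(materials):
        groups[(tuple(edges), temp)].append(idx)

    results = [None] * len(materials)
    for (edges, temp), indices in groups.items():
        table = build_table(np.array(edges), temp)    # built once per key
        fluxes = np.array([materials[i][2] for i in indices], dtype=float)
        sums = fluxes.sum(axis=1, keepdims=True)
        # Normalize; a zero-sum flux stays all zeros instead of dividing by zero
        phi = np.divide(fluxes, sums, out=np.zeros_like(fluxes), where=sums > 0)
        micros = phi @ table.T                        # one batched collapse
        for i, row in zip(indices, micros):
            results[i] = row                          # restore input order
    return results

# Toy usage: count how often the table is built
calls = []

def build_table(edges, temp):
    calls.append(temp)
    return np.array([[temp, 2.0 * temp]])             # 1 pair, 2 groups

materials = [
    ([0.0, 1.0, 2.0], 300.0, [1.0, 1.0]),
    ([0.0, 1.0, 2.0], 600.0, [1.0, 3.0]),
    ([0.0, 1.0, 2.0], 300.0, [0.0, 0.0]),             # zero-flux domain
]
out = collapse_grouped(materials, build_table)
```

Three materials with two distinct temperatures trigger only two table builds, and the results come back in the original material order.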
Docstrings trimmed to the package's sibling density and mechanical
compactions (tuple init, conditional empty-matrix, dict-membership
cache check, single batched collapse assignment). No functional
change; the redundant collapse() length guard is dropped in favor of
numpy's own shape error.

Collapse each material's raw flux against a group cross section table
built once per distinct temperature and reused across materials and
timesteps, replacing the per-(nuclide, reaction) collapse_rate calls.
The nuclides setter clears the cache when the nuclide set changes; the
direct-tally override is unchanged.

Optional because the cross-timestep reuse assumes the tallied nuclide
set is stable. Today _get_reaction_nuclides returns a fixed set, but it
documents an intent to filter on nonzero density; with that, the set
would grow step to step and each change drops the cache back to a
once-per-step rebuild.
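The caching behaviour described in this commit can be pictured with a small sketch. TableCache and build_table are hypothetical names used only for illustration; the actual operator code is not reproduced here.

```python
class TableCache:
    """Hypothetical per-temperature cache of group XS tables."""

    def __init__(self, nuclides, build_table):
        self._nuclides = tuple(nuclides)
        self._build_table = build_table
        self._tables = {}

    @property
    def nuclides(self):
        return self._nuclides

    @nuclides.setter
    def nuclides(self, value):
        value = tuple(value)
        if value != self._nuclides:
            # Table rows are tied to the nuclide list, so cached tables are stale
            self._tables.clear()
        self._nuclides = value

    def table(self, temperature):
        # Dict-membership check: build only on the first request per temperature
        if temperature not in self._tables:
            self._tables[temperature] = self._build_table(self._nuclides, temperature)
        return self._tables[temperature]

# Toy usage: record every rebuild
builds = []

def build_table(nuclides, temperature):
    builds.append((nuclides, temperature))
    return {"nuclides": nuclides, "T": temperature}

cache = TableCache(["U235", "U238"], build_table)
cache.table(294.0)
cache.table(294.0)                            # reused across timesteps
cache.table(600.0)                            # second temperature, second build
cache.nuclides = ["U235", "U238"]             # unchanged set keeps the cache
cache.table(294.0)
cache.nuclides = ["U235", "U238", "Pu239"]    # changed set clears the cache
cache.table(294.0)                            # rebuilt
```

If the tallied nuclide set changed every step, every step would pay one rebuild, which is the once-per-step fallback mentioned above.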
Detect single vs batch from the first flux's ndim and pass the batch
through, instead of copying the whole (N, G) array just to read ndim.
@yrrepy yrrepy marked this pull request as ready for review June 24, 2026 06:13
@yrrepy yrrepy requested a review from paulromano as a code owner June 24, 2026 06:13

@paulromano paulromano left a comment


Thanks a lot for this PR @yrrepy and look forward to the next ones too! I'm on leave through the end of this week so I won't have time to look at this in full until next week.
