Deplete: Flux mode (mg-flux); faster collapse via SparseXSTable #3980
Open
yrrepy wants to merge 9 commits into
Add Reaction::group_xs and Nuclide::group_xs, exposed via a new openmc_nuclide_group_xs C API, to compute the group-averaged cross section in each energy group -- a flat-in-bin (constant intra-group flux) average. A group cross section table can then be built once and reused to collapse the flux across many domains. Reaction::collapse_rate makes the same flat-in-bin assumption, so the two share one union-grid panel-walk kernel (for_each_panel, after NJOY/GROUPR's panel) rather than duplicating the integration loop; collapse_rate stays bit-identical.
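A minimal NumPy sketch of a flat-in-bin group average, assuming linear interpolation between tabulated points and group edges that lie inside the tabulated energy range. `flat_group_xs` is a hypothetical name for illustration, not the PR's `Reaction::group_xs`; the union-grid walk only mirrors the idea of the shared panel kernel.

```python
import numpy as np

def flat_group_xs(energy, xs, group_edges):
    """Group-averaged xs assuming a constant flux inside each group (illustrative)."""
    energy = np.asarray(energy, dtype=float)
    xs = np.asarray(xs, dtype=float)
    edges = np.asarray(group_edges, dtype=float)
    # Union grid: tabulated points strictly inside the group range, plus the group edges.
    inside = energy[(energy > edges[0]) & (energy < edges[-1])]
    union = np.union1d(inside, edges)
    sigma = np.interp(union, energy, xs)
    # Each "panel" is one interval of the union grid; integrate it with the trapezoid rule.
    panel_integral = 0.5 * (sigma[1:] + sigma[:-1]) * np.diff(union)
    midpoints = 0.5 * (union[1:] + union[:-1])
    group = np.searchsorted(edges, midpoints, side="right") - 1
    integral = np.bincount(group, weights=panel_integral, minlength=edges.size - 1)
    return integral / np.diff(edges)  # average, not raw integral

energy = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
xs = energy.copy()                      # sigma(E) = E, for an easy hand check
edges = np.array([0.0, 1.5, 4.0])
sigma_g = flat_group_xs(energy, xs, edges)
flux_g = np.array([2.0, 1.0])           # group-integrated flux
rate = flux_g @ sigma_g                 # sum_g(flux_g * sigma_g)
```

With sigma(E) = E each group average is simply the group midpoint, which makes the result easy to verify by hand.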
Wrap the openmc_nuclide_group_xs C API in openmc.lib.Nuclide.group_xs, returning the group-averaged cross section in each energy group as a NumPy array of length len(energy) - 1.
In get_microxs_and_flux(reaction_rate_mode='flux'), build the group-averaged cross sections once into a _SparseXSTable via _build_xs_table_ce and collapse every domain with a single matrix-vector product (_collapse_fluxes), instead of re-deriving the group cross sections from continuous-energy data for each domain. Because the group cross sections are domain-independent -- a flat-in-bin average of the continuous-energy data -- this is ~N times faster for N domains and reproduces the previous result to within floating-point summation order. The collapse loads cross sections from the same library the model resolved, matching the transport solve. MicroXS.from_multigroup_flux is now the single-flux case of the same engine (_flux_collapse_micros); it additionally validates that the flux is finite and non-negative, raising ValueError. The 'direct' mode and reaction_rate_opts paths are untouched. Only Independent Operator microxs, for now.
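The validation and zero-flux behavior described above could look roughly like the following. This is a sketch under stated assumptions: a dense NumPy array stands in for `_SparseXSTable`, `collapse_fluxes` is a hypothetical name, and the one-group value is assumed to be the flux-weighted average sum_g(φ_g σ_g) / sum_g(φ_g).

```python
import numpy as np

def collapse_fluxes(xs_table, fluxes):
    """Collapse N fluxes against one (R, G) group xs table; returns an (N, R) array."""
    xs_table = np.asarray(xs_table, dtype=float)             # R rows = (nuclide, reaction) pairs
    fluxes = np.atleast_2d(np.asarray(fluxes, dtype=float))  # (N, G)
    if not np.all(np.isfinite(fluxes)) or np.any(fluxes < 0.0):
        raise ValueError("flux must be finite and non-negative")
    total = fluxes.sum(axis=1, keepdims=True)                # (N, 1)
    rates = fluxes @ xs_table.T                              # (N, R), one product for all domains
    # A zero-sum flux yields zeros instead of dividing by zero.
    return np.divide(rates, total, out=np.zeros_like(rates), where=total > 0.0)

xs_table = np.array([[1.0, 3.0],
                     [2.0, 2.0]])
fluxes = np.array([[1.0, 1.0],
                   [0.0, 0.0],
                   [3.0, 1.0]])
one_group = collapse_fluxes(xs_table, fluxes)
```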
Test the new continuous-energy group cross section path:
- group_xs reproduces collapse_rate: sum_g(flux_g * group_xs_g) matches collapse_rate over several independent fluxes, with a scale guard that the values are per-group averages (barns), not the raw integral.
- _build_xs_table_ce + _collapse_fluxes (and the delegated from_multigroup_flux) reproduce the old per-domain collapse_rate result, including a threshold reaction and a zero-flux domain.
- _collapse_fluxes rejects non-finite and negative flux and returns a zero MicroXS for a zero-sum flux.
Overload from_multigroup_flux to accept a 2-D batch (or list of 1-D fluxes) and build the group XS table once for all of them, returning a list of MicroXS. Add a keyword-only cross_sections argument and resolve the chain lazily. Delete _flux_collapse_micros and rewire get_microxs_and_flux to call from_multigroup_flux directly.
Group materials sharing an (energy structure, temperature) and collapse each group's fluxes in a single batched from_multigroup_flux call, so the cross section table is built once per distinct group instead of once per material. Output order is preserved.
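One way to picture the grouping step is the sketch below. It assumes each material carries a hashable key such as (energy structure, temperature), and that `collapse_batch` is a caller-supplied stand-in for a batched `from_multigroup_flux` call; both names are hypothetical.

```python
from collections import defaultdict
import numpy as np

def collapse_grouped(fluxes, keys, collapse_batch):
    """Run one batched collapse per distinct key, returning results in input order."""
    groups = defaultdict(list)
    for idx, key in enumerate(keys):
        groups[key].append(idx)
    results = [None] * len(fluxes)
    for key, idxs in groups.items():
        batch = np.stack([fluxes[i] for i in idxs])   # (n_in_group, G)
        # Scatter each group's results back to the original positions.
        for i, res in zip(idxs, collapse_batch(key, batch)):
            results[i] = res
    return results
```

Because results are scattered back by index, the caller sees the same ordering as the input materials regardless of how they were grouped.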
Trim docstrings to match the density of sibling docstrings in the package, and apply mechanical compactions (tuple init, conditional empty-matrix, dict-membership cache check, single batched collapse assignment). No functional change; the redundant collapse() length guard is dropped in favor of numpy's own shape error.
Collapse each material's raw flux against a group cross section table built once per distinct temperature and reused across materials and timesteps, replacing the per-(nuclide, reaction) collapse_rate calls. The nuclides setter clears the cache when the nuclide set changes; the direct-tally override is unchanged. Optional because the cross-timestep reuse assumes the tallied nuclide set is stable. Today _get_reaction_nuclides returns a fixed set, but it documents an intent to filter on nonzero density; with that, the set would grow step to step and each change drops the cache back to a once-per-step rebuild.
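The caching behavior can be sketched as follows. `TableCache` and the `build` callable are hypothetical names used only to illustrate a per-temperature cache that is invalidated when the nuclide set changes; they are not the PR's classes.

```python
class TableCache:
    """Per-temperature table cache, cleared when the nuclide set changes (illustrative)."""

    def __init__(self, nuclides, build):
        self._nuclides = tuple(nuclides)
        self._build = build          # expensive: build(nuclides, temperature) -> table
        self._tables = {}

    @property
    def nuclides(self):
        return self._nuclides

    @nuclides.setter
    def nuclides(self, value):
        value = tuple(value)
        # Only a change in the set of nuclides invalidates the cached tables.
        if set(value) != set(self._nuclides):
            self._tables.clear()
        self._nuclides = value

    def table(self, temperature):
        if temperature not in self._tables:
            self._tables[temperature] = self._build(self._nuclides, temperature)
        return self._tables[temperature]
```

If the nuclide set grew at every timestep, the setter would clear the cache each time and the table would be rebuilt once per step, which is the fallback the commit message describes.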
Detect single vs batch from the first flux's ndim and pass the batch through, instead of copying the whole (N, G) array just to read ndim.
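A small sketch of the detection idea, assuming the input is either one 1-D flux, a 2-D array, or a list of 1-D fluxes. `is_batch` is a hypothetical helper name.

```python
import numpy as np

def is_batch(fluxes):
    """True for a 2-D array or a list of 1-D fluxes, False for a single 1-D flux."""
    # Only the first element is inspected, so a large (N, G) input is never copied.
    return np.ndim(fluxes[0]) == 1
```

For a single 1-D flux the first element is a scalar (ndim 0); for a batch it is a 1-D flux (ndim 1).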
paulromano
requested changes
Jun 25, 2026
paulromano
left a comment
Contributor
Thanks a lot for this PR @yrrepy and look forward to the next ones too! I'm on leave through the end of this week so I won't have time to look at this in full until next week.
Description
Currently, when Deplete collapses a multigroup flux in flux mode, it goes through from_multigroup_flux.
from_multigroup_flux reads the CE XS on the HDF5 (ACE) energy grid and averages the XS points onto the collapse_energies group structure, using a flat-in-bin approximation within each energy group. This gives the σ_g that are then collapsed with the φ_g.
This acquisition from the CE XS is repeated per material, per nuclide, per reaction, so the group cross sections are re-derived from the continuous-energy data for every nuclide and reaction, on every material.
When depleting 10,000+ material domains (as is customary for a large R2S mesh or other voxel-wise activation problem), this can become quite slow. The refactoring described below achieves collapse speedups of 6-60x.
This PR refactors the collapse so that the acquisition from the CE XS is done only once, and the result populates a SparseXSTable. A new group_xs C API exposes the flux-independent group-averaged XS; it shares one kernel with collapse_rate. The SparseXSTable holds all the σ_g needed for a problem (one σ_g vector per nuclide, per reaction), and this matrix is then collapsed with the φ_g of each material to form the one-group collapsed cross sections.
Pseudo-code, illustrating the structure of the loops:
Existing
This PR
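The two loop structures, existing and this PR, can be contrasted with the runnable sketch below. It is illustrative only: `derive_group_xs` is a hypothetical stand-in for the expensive CE-to-group averaging step (it returns deterministic dummy values), and a dense array stands in for the SparseXSTable.

```python
import numpy as np

calls = {"n": 0}  # counts how often the expensive step runs

def derive_group_xs(nuclide, reaction, n_groups):
    """Hypothetical stand-in for deriving sigma_g from CE data."""
    calls["n"] += 1
    seed = sum(map(ord, nuclide + reaction))
    return np.random.default_rng(seed).random(n_groups)

def collapse_existing(fluxes, nuclides, reactions):
    # Existing: sigma_g is re-derived per material, per nuclide, per reaction.
    out = np.zeros((len(fluxes), len(nuclides), len(reactions)))
    for m, phi in enumerate(fluxes):
        for i, nuc in enumerate(nuclides):
            for j, rx in enumerate(reactions):
                sigma_g = derive_group_xs(nuc, rx, phi.size)
                out[m, i, j] = phi @ sigma_g / phi.sum()
    return out

def collapse_with_table(fluxes, nuclides, reactions):
    # This PR: derive sigma_g once per (nuclide, reaction), then one matrix product.
    n_groups = fluxes.shape[1]
    table = np.array([derive_group_xs(nuc, rx, n_groups)
                      for nuc in nuclides for rx in reactions])
    out = fluxes @ table.T / fluxes.sum(axis=1, keepdims=True)
    return out.reshape(len(fluxes), len(nuclides), len(reactions))

fluxes = np.random.default_rng(0).random((50, 8)) + 0.1   # 50 materials, 8 groups
nuclides, reactions = ["H1", "Fe56", "Ni58"], ["(n,gamma)", "(n,p)"]

old = collapse_existing(fluxes, nuclides, reactions)
calls_existing = calls["n"]
calls["n"] = 0
new = collapse_with_table(fluxes, nuclides, reactions)
calls_table = calls["n"]
```

In this toy setup the expensive step runs once per (material, nuclide, reaction) in the first version and once per (nuclide, reaction) in the second, while the collapsed values agree to floating-point tolerance.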
The new implementation is compatible with the Independent and Coupled Operators.
Note that a SparseXSTable is specific to one combination of group structure and XS temperature.
Benchmarking
I have benchmarked the existing method and this PR on the FNG model.
collapse_speed_fng.py
fomg_16k.npy.tar.gz
fng_neutron.xml, bounding_boxes.json are needed from https://zenodo.org/records/10660030
Any cross-sections.xml and chain.xml should do.
FNG: 7 material (types), 54 nuclides, 36 reactions

Using energy structures: UKAEA-1102 and FOMG-16k
Varying mesh size: 10k, 20k, 40k, 60k, 80k.
Full OMP for transport, no MPI. The collapse does not benefit from any parallelization.
Speed-ups are on the order of 6x for FOMG-16k and 20-60x for UKAEA-1102.
Collapse processing time for 40k+ materials decreases from ~30 minutes to about a minute.
Microxs results here (and elsewhere in tests) are bitwise identical and unchanged by the new collapse method.
Background
This feature comes from my GENDF Isomeric workflow
https://github.com/yrrepy/openmc/tree/iso-gendf-mpi_Sync2b-bis
1284c19
4f4de43
The faster collapse in GENDF came from this SparseXSTable, not from GENDF being inherently faster to work with.
I am trying to split off the most useful features from that workflow, which will hopefully pave the way to its integration.
Future Plans
http://redcullen1.net/homepage.new/Papers/PREPRO2023/PREPRO2023.pdf
Next up from the GENDF workflow is a global_microxs_hdf5 that is MPI-optimized.
Each MPI rank reads only its local material slices from the microxs.hdf5 (which differs from the r2s.py microxs.hdf5). This reduces the RAM needed per MPI rank, so more MPI ranks can run in parallel. (Deplete is generally limited more by RAM than by available CPU ranks.)
https://github.com/yrrepy/openmc/tree/Enhance-Deplete-MPI
1c0961c
244b251
GENDF isomeric XS depletion reuses this exact collapse machinery (SparseXSTable).
TL;DR
Flux-mode depletion re-derived group cross sections from CE data per material; this builds them once into a SparseXSTable and reuses them. ~6–60× faster collapse, identical results.
Checklist