refactor: consolidate grid, stretch, and body force params into derived types #1432

Open — sbryngelson wants to merge 41 commits into master from refactor/derived-types

Conversation


sbryngelson (Member) commented May 12, 2026

Summary

Consolidates several families of flat scalar parameters and module-level arrays into Fortran derived types, reducing global variable count and making structure explicit. Implements items 1, 4, 5, 6, and 7 from issue #1427, plus item 2 (CBC directional triplets, MUSCL bounds) and item 3 (L/R Riemann state triplets).

Grid coordinate arrays → type(grid_axis) (item 1)

  • x_cc, x_cb, x_cb_s (and y/z equivalents) → x%cc, x%cb, x%cb_s
  • dx, dy, dz min spacing → x%min_spacing, y%min_spacing, z%min_spacing
  • Analytical IC codegen updated to emit x%cc(i) instead of x_cc(i)
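
A minimal sketch of the axis container these renames imply (component names are taken from the list above and the commit notes below; a later commit in this PR turns the components into pointers for GPU attachment):

    ! sketch only; lives in a module, with wp the working-precision kind
    type :: grid_axis
        real(wp), allocatable :: cc(:)      ! cell-center coordinates (was x_cc)
        real(wp), allocatable :: cb(:)      ! cell-boundary coordinates (was x_cb)
        real(wp), allocatable :: cb_s(:)    ! stretched cell boundaries (was x_cb_s)
        real(wp), allocatable :: spacing(:) ! per-cell widths (was the dx array)
        real(wp) :: min_spacing             ! minimum cell width (was the dx scalar)
    end type grid_axis

    type(grid_axis) :: x, y, z              ! so x_cc(i) becomes x%cc(i)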

Boundary condition types → bc_dir_t + bc_side_t (item 4)

  • Introduced bc_dir_t to hold BC type (beg/end integers) separately from payload
  • Renamed bc_x%beg/end → bc%x%beg/end (new compound bc_xyz_info struct)
  • Introduced bc_side_t with beg_side/end_side sub-structs, replacing 14 flat _in/_out/vb*/ve* fields
  • Updated ~100 example/benchmark case.py files and golden test files
  • Fixed remove_higher_dimensional_keys to handle bc%y%beg-style keys for lower-dimensional cases
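
A hedged sketch of the type/payload split (bc_side_t's beg_side/end_side payload fields are omitted; beg/end here are the integer BC type codes):

    type :: bc_dir_t
        integer :: beg, end            ! BC type codes on the begin/end faces
    end type bc_dir_t

    type :: bc_xyz_info
        type(bc_dir_t) :: x, y, z      ! accessed as bc%x%beg, bc%z%end, ...
    end type bc_xyz_info

    type(bc_xyz_info) :: bc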

Body force parameters → type(body_force_t) (item 5)

  • Replaced three separate bf_x, bf_y, bf_z variables with single bf struct
  • bf_x%k, bf_x%w, bf_x%p, bf_x%g, bf_x%enabled (and y/z) → bf%x%k, bf%x%w, etc.
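
A sketch of the resulting nesting (field names from the bullet above; the comment glosses are illustrative):

    type :: body_force_axis
        logical  :: enabled
        real(wp) :: k, w, p, g         ! per-axis body-force parameters
    end type body_force_axis

    type :: body_force_t
        type(body_force_axis) :: x, y, z
    end type body_force_t

    type(body_force_t) :: bf            ! bf_x%k becomes bf%x%k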

IB dynamics → type(ib_dynamics_t) (item 6)

  • force_x/y/z, torque_x/y/z, vel_x/y/z, omega_x/y/z, angle_x/y/z → force%x/y/z, etc.

Grid stretching → type(bounds_info) (item 7)

  • x_a/x_b, y_a/y_b, z_a/z_b → x_stretch%beg/end, y_stretch%beg/end, z_stretch%beg/end
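
Illustratively, with bounds_info being the existing begin/end pair type:

    type(bounds_info) :: x_stretch, y_stretch, z_stretch
    ! x_a -> x_stretch%beg, x_b -> x_stretch%end (and likewise for y/z)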

CBC module directional arrays → private derived types (item 2)

  • Defined three module-private types in m_cbc.fpp:
    • cbc_rs_dir_t — 4D (:,:,:,:) triplet for reshaped primitive/flux arrays
    • cbc_fd_dir_t — 2D (:,:) triplet for finite-difference coefficients
    • cbc_pi_dir_t — 3D (:,:,:) triplet for polynomial interpolation coefficients
  • Replaced 21 flat module-level allocatable arrays with 7 struct instances:
    • q_prim_rs[x/y/z]_vf → q_prim_rs%[x/y/z]
    • F_rs[x/y/z]_vf, F_src_rs[x/y/z]_vf → F_rs%[x/y/z], F_src_rs%[x/y/z]
    • flux_rs[x/y/z]_vf_l, flux_src_rs[x/y/z]_vf_l → flux_rs%[x/y/z], flux_src_rs%[x/y/z]
    • fd_coef_[x/y/z] → fd_coef%[x/y/z]
    • pi_coef_[x/y/z] → pi_coef%[x/y/z]
  • The CCE _l suffix naming workaround is naturally resolved by the struct naming
  • GPU_DECLARE uses %component member syntax, consistent with m_global_parameters.fpp
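
A sketch of the three triplet types and the seven instances (allocation and GPU directives omitted):

    type :: cbc_rs_dir_t   ! reshaped primitive/flux arrays
        real(wp), allocatable :: x(:, :, :, :), y(:, :, :, :), z(:, :, :, :)
    end type cbc_rs_dir_t

    type :: cbc_fd_dir_t   ! finite-difference coefficients
        real(wp), allocatable :: x(:, :), y(:, :), z(:, :)
    end type cbc_fd_dir_t

    type :: cbc_pi_dir_t   ! polynomial interpolation coefficients
        real(wp), allocatable :: x(:, :, :), y(:, :, :), z(:, :, :)
    end type cbc_pi_dir_t

    type(cbc_rs_dir_t) :: q_prim_rs, F_rs, F_src_rs, flux_rs, flux_src_rs
    type(cbc_fd_dir_t) :: fd_coef
    type(cbc_pi_dir_t) :: pi_coef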

MUSCL bounds triplet → type(muscl_bounds_t) (item 2)

  • is1_muscl, is2_muscl, is3_muscl → is_muscl%x, is_muscl%y, is_muscl%z
  • s_muscl dummy argument list reduced from 3 int_bounds_info args to 3 renamed scalars assigned into one struct; GPU_UPDATE collapsed from 3 variables to 1
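
Sketch of the bounds triplet and the single collapsed device update:

    type :: muscl_bounds_t
        type(int_bounds_info) :: x, y, z   ! was is1_muscl/is2_muscl/is3_muscl
    end type muscl_bounds_t

    type(muscl_bounds_t) :: is_muscl
    $:GPU_UPDATE(device='[is_muscl]')      ! one update instead of three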

L/R Riemann state triplets → type(riemann_states_arr) (item 3)

  • Left/right primitive state arrays per direction consolidated into derived type instances in m_riemann_solvers.fpp
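
A sketch of the L/R pair types involved (the fixed-size array variants riemann_states_arr2/arr6/arr7 follow the same pattern; see the commit notes further down for the full rename list):

    type :: riemann_states
        real(wp) :: L, R
    end type riemann_states

    type :: riemann_states_vec3
        real(wp) :: L(3), R(3)
    end type riemann_states_vec3

    type(riemann_states)      :: rho, pres, E, H   ! rho_L/rho_R -> rho%L/rho%R
    type(riemann_states_vec3) :: vel               ! vel_L(i) -> vel%L(i)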

Other

  • Pins and applies ffmt v0.4.0 formatting

Refactoring this enables

The derived type consolidations here are load-bearing groundwork for several follow-on simplifications:

Directional sweep loops. With x/y/z triplets in structs, code that currently has three nearly-identical Fypp-generated blocks (one per direction) can be replaced by a single loop over [x, y, z] components. The MUSCL limiter and CBC sweep logic are the most immediate targets — each currently copies the same logic three times via #:for XYZ in [...].
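
As a hedged illustration (s_sweep_one_direction is a hypothetical helper, not code in this PR), the triplicated body shrinks to one routine selected per component:

    ! before: #:for XYZ in ['x','y','z'] stamped out three copies of the body
    call s_sweep_one_direction(q_prim_rs%x, fd_coef%x)
    call s_sweep_one_direction(q_prim_rs%y, fd_coef%y)
    call s_sweep_one_direction(q_prim_rs%z, fd_coef%z)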

Slimmer subroutine signatures. Every consolidated triplet removes two arguments from subroutines that previously took separate _x, _y, _z parameters. The remaining dqL/R_prim_dx/dy/dz_n triplets in m_rhs.fpp and m_riemann_solvers.fpp are the next candidate — consolidating them collapses a 3-argument pattern repeated across 7 subroutines into 1.

Simpler GPU data management. Struct-level GPU_DECLARE and GPU_UPDATE calls cover all components in one directive, eliminating the class of bug where one direction is updated but another is not. This is already visible in the MUSCL change: GPU_UPDATE(device='[is_muscl]') replaces three separate updates.

Easier extensibility. Adding a new per-direction field (e.g., a new flux component or a new BC attribute) now touches one struct definition instead of 6–10 separate declarations, allocation sites, and call sites. The bc_side_t struct is the clearest example — adding a per-face field now requires one line.

Foundation for the Riemann solver refactor (#1426). The riemann_states consolidation (item 3) is the prerequisite named in that issue. The struct layout is now in place for the interface overhaul described there.


Test plan

  • ./mfc.sh precheck passes (all 6 checks)
  • All three targets compile (pre_process, simulation, post_process)
  • GPU OpenACC build compiles (nvfortran)
  • All 14 CBC boundary condition tests pass (GPU build)
  • MUSCL limiter tests pass (1D and 2D, all limiters, int_comp) (GPU build)
  • Body force tests pass
  • Grid stretching tests pass
  • remove_higher_dimensional_keys fix verified
  • CI green

Closes part of #1427.
Closes #1441.
Partially addresses #1440 (CBC and MUSCL directional sweep deduplication).



codecov Bot commented May 12, 2026

Codecov Report

❌ Patch coverage is 58.07823% with 493 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.07%. Comparing base (3538585) to head (b1ed109).

Files with missing lines                   Patch %   Lines
src/simulation/m_cbc.fpp                   50.00%    55 Missing and 15 partials ⚠️
src/common/m_boundary_common.fpp           55.92%    22 Missing and 45 partials ⚠️
src/simulation/m_bubbles_EL.fpp            30.61%    33 Missing and 1 partial ⚠️
src/post_process/m_data_output.fpp         41.81%    32 Missing ⚠️
src/pre_process/m_icpp_patches.fpp         31.91%    31 Missing and 1 partial ⚠️
src/pre_process/m_start_up.fpp             40.00%    19 Missing and 5 partials ⚠️
src/common/m_mpi_common.fpp                30.30%    15 Missing and 8 partials ⚠️
src/simulation/m_data_output.fpp           45.94%    19 Missing and 1 partial ⚠️
src/common/m_chemistry.fpp                 65.38%    18 Missing ⚠️
src/simulation/m_bubbles_EL_kernels.fpp    32.00%    14 Missing and 3 partials ⚠️
... and 22 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1432      +/-   ##
==========================================
- Coverage   61.31%   61.07%   -0.24%     
==========================================
  Files          72       72              
  Lines       19771    19650     -121     
  Branches     2849     2856       +7     
==========================================
- Hits        12123    12002     -121     
- Misses       5699     5707       +8     
+ Partials     1949     1941       -8     

Replace flat allocatable arrays x_cb/y_cb/z_cb, x_cc/y_cc/z_cc,
and dx/dy/dz with a derived type having .cb, .cc, and .spacing
components. All three executables (pre_process, simulation,
post_process) updated across 47 files.

Key design decisions:
- pre_process keeps scalar dx/dy/dz as minimum cell-width scalars;
  only x_cb and x_cc are folded into x%cb and x%cc
- OpenMP GPU target uses whole-struct declare target (x, y, z) since
  component-level declare target is invalid; OpenACC uses component-level
- 2dHardcodedIC.fpp wraps dx*dy in #ifdef MFC_PRE_PROCESS for the
  scalar vs per-cell context difference

Special variable collisions fixed:
- m_chemistry.fpp: local integer x/y/z -> cx/cy/cz
- m_weno.fpp: local real y(1:4) scratch -> ys
- m_viscous.fpp: local real dx(1:3) scratch -> ds
- m_ibm.fpp: local scalar dx/dy/dz -> dx_loc/dy_loc/dz_loc
- m_cbc.fpp: Fypp template d${XYZ}$ -> ${XYZ}$%spacing
Implements items 7 and 5 from issue #1427:
- x_a/x_b/y_a/y_b/z_a/z_b -> type(bounds_info) :: x_stretch, y_stretch, z_stretch
- bf_x/bf_y/bf_z + k_x/w_x/p_x/g_x (and y/z) -> type(body_force_axis) :: bf_x, bf_y, bf_z
sbryngelson force-pushed the refactor/derived-types branch from 882206f to 3f9f382 (May 12, 2026 20:38)
sbryngelson force-pushed the refactor/derived-types branch from 15f3a64 to eb90ba8 (May 12, 2026 23:24)
Remove module-level dx, dy, dz scalars from pre_process m_global_parameters.
Add min_spacing field to the grid_axis derived type so each axis carries its
own minimum cell width. Update all call sites in m_grid, m_start_up,
m_icpp_patches, m_mpi_common, and 2dHardcodedIC.
After reading grid data from files, compute and store min_spacing on each
axis in both serial and parallel paths. Matches the pre_process pattern
so min_spacing is consistent across all three executables.
Group the three directional boundary condition variables (bc_x, bc_y,
bc_z of type bc_dir_t) into a single bc_xyz_info struct accessed as
bc%x, bc%y, bc%z. Updates all Fortran source, Fypp macros, Python
toolchain, example cases, and documentation.
PR #1432 renamed bc_x%beg -> bc%x%beg. The remove_higher_dimensional_keys
helper only matched the old _y/_z separator style (.+_y, y_.+), so
bc%y%beg and bc%z%beg were not removed for lower-dimensional cases.
Add %{dim}% substring check to cover the new compound key format.
- m_thinc.fpp: take master's extended Fypp for-loop tuple (STENCIL_VAR,
  COORDS, X_BND/Y_BND/Z_BND), update CC_PRI x_cc/y_cc/z_cc -> x%cc/y%cc/z%cc
- m_rhs.fpp: take master's drop of 'dummy' workaround condition, keep bc%y%beg naming
- m_riemann_solvers.fpp: take master's unified Re_avg_rsx_vf indexing (j,k,l)
  for all cylindrical faces, update y_cb/y_cc -> y%cb/y%cc
Comment thread on tests/failed_uuids.txt (outdated): "remove"

Group the three body-force axis structs (bf_x, bf_y, bf_z) into a
single body_force_t container variable bf, matching the bc%x/y/z
compound naming pattern. Updates all Fortran source, Python toolchain,
examples, and docs.
- examples/3D_rayleigh_taylor/case.py: bf_y%* -> bf%y%*
- case_validator.py: check_body_forces uses bf%{dir}%enabled/k/w/p/g
- descriptions.py: regex pattern bf_([xyz]) -> bf%([xyz])%enabled
Replace flat rho_L/rho_R, pres_L/pres_R, E_L/E_R, H_L/H_R, gamma_L/R,
pi_inf_L/R, qv_L/R, c_L/R, T_L/R, MW_L/R, R_gas_L/R, Cp_L/R, Cv_L/R,
Gamm_L/R, G_L/R, Y_L/R, vel_L_rms/R_rms, vel_L_tmp/R_tmp, nbub_L/R,
ptilde_L/R with type(riemann_states) :: rho, pres, E, H, ... accessed
as rho%L / rho%R. Replace vel_L/vel_R with type(riemann_states_vec3) ::
vel accessed as vel%L(i) / vel%R(i).

HLLD already used this pattern; HLL, LF, and HLLC now match.
GPU private lists updated to name the struct once instead of _L and _R.
Cray ftn (CCE) requires globals accessed in GPU accelerator routines to
have declare directives. bc (type bc_xyz_info) was declared for
OpenMP but omitted from the OpenACC block, causing:

  ftn-7066 ftn: ERROR S_SLIP_WALL: Global in accelerator routine
  without declare -- bc

Add $:GPU_DECLARE(create='[bc]') to the MFC_OpenACC branch alongside
the existing OpenMP declaration.
…states structs

Extend the riemann_states refactor to cover wave speeds, intermediate
states, stress tensor, Reynolds numbers, and HLLD state vectors:

  s_L/R         -> s%L/R     (type riemann_states)
  Ms_L/R        -> Ms%L/R
  pres_SL/SR    -> pres_S%L/R
  flux_tau_L/R  -> flux_tau%L/R
  xi_L/R        -> xi%L/R    (HLLC)
  xi_L/R_m1     -> xi_m1%L/R (HLLC)
  pTot_L/R      -> pTot%L/R  (HLLD)
  rhoL/R_star   -> rho_star%L/R (HLLD)
  E_starL/R     -> E_star%L/R (HLLD)
  vL/R_star     -> v_star%L/R (HLLD)
  wL/R_star     -> w_star%L/R (HLLD)
  tau_e_L/R(i)  -> tau_e%L/R(i) (type riemann_states_arr6)
  Re_L/R(i)     -> Re%L/R(i)   (type riemann_states_arr2)
  xi_field_L/R  -> xi_field%L/R (riemann_states_vec3)
  U_L/R etc.    -> U%L/R etc.  (type riemann_states_arr7, HLLD)

Variable-size arrays (alpha_L/R, Ys_L/R, etc.) are left as-is since
they require allocatable components or Fypp-conditional sizing.

Also exclude fp-stability-logs/ and scripts/ from typos spell check.
type(riemann_states) structs are not auto-private under OpenACC (unlike
plain real(wp) scalars). flux_tau_L/flux_tau_R were scalars on master and
worked without explicit listing; after the riemann_states refactor they
became flux_tau (a derived type) which requires explicit private=[flux_tau].
HLLC had it; LF was missed.
This file is auto-generated by the test runner to track failures for CI
retry logic. It should never be committed.
…t struct

Replace 6 module-level vector_field arrays (dqL/R_prim_dx/dy/dz_n) with
2 dq_prim_dir_t struct instances (dqL_prim_n, dqR_prim_n) whose %x/%y/%z
members hold the same data. Subroutine signatures continue to receive
individual allocatable arrays to avoid GPU illegal-address errors that
arise when a non-allocatable struct is passed as a dummy arg to kernels.
…ine struct member access

OpenMP GPU device subroutines cannot reliably access grid_axis struct member
allocatables (x%spacing, x%cc, x%cb) through the device descriptor. This
caused A57E30FE (3D Viscous IGR Jacobi) to fail with CUDA_ERROR_ILLEGAL_ADDRESS.

Add flat allocatable arrays dx/dy/dz, x_cc/y_cc/z_cc, x_cb/y_cb/z_cb as
GPU-accessible aliases in m_global_parameters.fpp. Replace all struct member
accesses in simulation GPU kernel files with these flat arrays.

Two sync points are required:
1. Early HOST sync in s_initialize_modules after s_populate_grid_variables_buffers,
   before WENO/IGR/CBC/Riemann module initializations that use grid values
   (e.g. s_initialize_weno_module uses s_cb => x_cb for WENO polynomial coefficients)
2. GPU_UPDATE sync in s_initialize_gpu_vars to copy flat arrays to device

Regenerate 16 IGR golden files for nvfortran OpenMP GPU build on wingtip-gpu3.
…e access

Change grid_axis cb/cc/spacing from allocatable to pointer components backed
by the flat module arrays (x_cb/x_cc/dx etc.). GPU pointer attachment via
GPU_ENTER_DATA(attach=) updates the device struct's pointer fields to point
to the already-mapped device flat arrays, fixing CUDA_ERROR_ILLEGAL_ADDRESS
in m_igr.fpp inline GPU_PARALLEL_LOOP bodies on NVHPC OpenMP target offload.

Eliminates the early host sync, the duplicate GPU_UPDATE for struct members,
and the OpenACC/OpenMP split in GPU_DECLARE for x/y/z.
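
Roughly (a sketch; the attach argument format here is an assumption modeled on the GPU_DECLARE calls quoted elsewhere in this PR):

    type :: grid_axis   ! components now pointers, not allocatables
        real(wp), pointer :: cb(:) => null(), cc(:) => null(), spacing(:) => null()
    end type grid_axis

    x%cb => x_cb; x%cc => x_cc; x%spacing => dx        ! alias the flat arrays
    $:GPU_ENTER_DATA(attach='[x%cb, x%cc, x%spacing]') ! repoint device struct fields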
Replace flat grid array references (dx/dy/dz, x_cc/y_cc/z_cc, x_cb/y_cb/z_cb)
with struct member access (x%spacing/y%spacing/z%spacing, x%cc/y%cc/z%cc,
x%cb/y%cb/z%cb) throughout simulation kernel files.

m_igr.fpp retains flat array access: NVHPC OpenMP target offload does not
correctly resolve declare-target struct pointer components in that file's
inline GPU_PARALLEL_LOOP bodies, causing CUDA_ERROR_LAUNCH_FAILED.
…inter-in-atomic-region bug

NVHPC cannot correctly dereference declare-target struct pointer components
(x%spacing, y%spacing, z%spacing) inside GPU_ATOMIC blocks in m_igr.fpp.
Introduce module-level inv_dx/inv_dy/inv_dz arrays precomputed from the
flat spacing arrays and GPU-updated once at init. All GPU_ATOMIC and
GPU_PARALLEL_LOOP bodies in this file now use inv_d*(j/k/l) instead of
1._wp/d*(j/k/l), eliminating the pointer indirection that triggers the
compiler bug and also removing repeated divisions from the hot atomic path.

CPU-only alf_igr computation updated to x%spacing(1) / y%spacing(1) /
z%spacing(1) as the struct-member access works correctly on the host.
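
A minimal sketch of that precompute (array allocation and the y/z analogues omitted; the usage line is illustrative):

    inv_dx(:) = 1._wp/dx(:)            ! reciprocals computed once on the host
    $:GPU_UPDATE(device='[inv_dx]')    ! pushed to the device once at init
    ! in kernels: ... * inv_dx(j)      ! instead of ... * (1._wp/dx(j))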
Gamm_L/Gamm_R were consolidated into type(riemann_states) :: Gamm when
the riemann_states refactor landed. The HLLC private list was updated but
the LF solver (s_lf_riemann_solver) private list was missed, causing
ACC find_in_present_table failures for Gamm on Cray OpenACC (Frontier).
…er dereference

x%cc/x%spacing/etc. inside GPU_PARALLEL_LOOP fail on NVHPC and AMD OpenMP target
because map(always,to: x%cc) does not correctly update the device struct pointer to
point to device data, causing CUDA_ERROR_ILLEGAL_ADDRESS (Phoenix gpu-omp UUID
AA49A8BC) and HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION (Frontier AMD gpu-omp
UUID 2ADA983F). Revert to flat declare-target arrays (dx/dy/dz, x_cc/y_cc/z_cc)
which are directly resolved on device.
…ran pointer attachment

map(always,to/from:) for pointer components copies the host descriptor
(with host addresses) to device, leaving device struct pointers invalid.
OpenMP 5.1 map(attach:) correctly looks up the device address of the pointee
and updates the device struct pointer to reference device memory.
map(detach:) is the symmetric reverse.

This fixes CUDA_ERROR_ILLEGAL_ADDRESS (Phoenix gpu-omp) and
HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION (Frontier AMD gpu-omp) caused by
x%cc/x%spacing etc. being dereferenced as host addresses inside GPU kernels.
Development

Successfully merging this pull request may close these issues:

  • Consolidate dqL/R_prim_dx/dy/dz_n triplets into a single derived type argument across m_rhs and m_riemann_solvers