Skip to content

fix: Fracture/3D cell co-location in parallel mesh redistribution#4008

Open
bd713 wants to merge 13 commits intodevelopfrom
fix/bd713/fractureDistribution
Open

fix: Fracture/3D cell co-location in parallel mesh redistribution#4008
bd713 wants to merge 13 commits intodevelopfrom
fix/bd713/fractureDistribution

Conversation

@bd713
Copy link
Copy Markdown
Contributor

@bd713 bd713 commented Mar 30, 2026

Replaces #3985

@bd713 bd713 self-assigned this Mar 30, 2026
@bd713 bd713 added the type: bug Something isn't working label Mar 30, 2026
@bd713 bd713 added ci: run CUDA builds Allows to triggers (costly) CUDA jobs ci: run integrated tests Allows to run the integrated tests in GEOS CI ci: run code coverage enables running of the code coverage CI jobs labels Mar 30, 2026
@OmarDuran OmarDuran requested a review from jafranc as a code owner May 4, 2026 19:03
@codecov
Copy link
Copy Markdown

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 50.19405% with 385 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@506fae7). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...nents/mesh/generators/VTKSuperCellPartitioning.cpp 39.38% 257 Missing ⚠️
...rc/coreComponents/mesh/generators/VTKUtilities.cpp 63.30% 102 Missing ⚠️
...reComponents/mesh/generators/ParMETISInterface.cpp 0.00% 18 Missing ⚠️
...coreComponents/mesh/generators/CollocatedNodes.cpp 38.46% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #4008   +/-   ##
==========================================
  Coverage           ?   64.50%           
==========================================
  Files              ?     1384           
  Lines              ?   123603           
  Branches           ?        0           
==========================================
  Hits               ?    79729           
  Misses             ?    43874           
  Partials           ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@jafranc jafranc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great ! Thanks, some shortening suggestion here and there

for( vtkIdType const * q: quad )
{
fracture->InsertNextCell( VTK_QUAD, numPoints, q );
fracture->InsertNextCell( VTK_QUAD, 4, q );
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it weird to hardcode there and still use numPoints above ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made more explicit

}

array1d< real_t > tpwgts( numParts );
tpwgts.setValues< serialPolicy >( 1.0f / static_cast< real_t >( numParts ) );
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't parmetis dividing in equal weights if NULL is pass as tpwgts ? If so, it might avoid a check for zero-div here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Tried this, but it makes the unit tests crash.
But will protect the division.

Comment on lines +26 to +70
CollocatedNodes::CollocatedNodes( string const & faceBlockName,
vtkSmartPointer< vtkDataSet > faceMesh )
vtkSmartPointer< vtkDataSet > faceMesh,
bool performGlobalCheck )
{
// The vtk field to the collocated nodes for fractures.
string const COLLOCATED_NODES = "collocated_nodes";

vtkIdTypeArray const * collocatedNodes = vtkIdTypeArray::FastDownCast( faceMesh->GetPointData()->GetArray( COLLOCATED_NODES.c_str() ) );

// Depending on the parallel split, the vtk face mesh may be empty on a rank.
// In that case, vtk will not provide any field for the emtpy mesh.
// Therefore, not finding the duplicated nodes field on a rank cannot be interpreted as a globally missing field.
// Converting the address into an integer and exchanging it through the MPI ranks let us find out
// if the field is globally missing or not.
std::uintptr_t const address = MpiWrapper::max( reinterpret_cast< std::uintptr_t >(collocatedNodes) );
if( address == 0 )
if( performGlobalCheck )
{
GEOS_LOG_RANK_0( "Available point data fields in '" << faceBlockName << "':" );
for( int i = 0; i < faceMesh->GetPointData()->GetNumberOfArrays(); ++i )
// Depending on the parallel split, the vtk face mesh may be empty on a rank.
// In that case, vtk will not provide any field for the empty mesh.
// Therefore, not finding the duplicated nodes field on a rank cannot be interpreted as a globally missing field.
// Converting the address into an integer and exchanging it through the MPI ranks let us find out
// if the field is globally missing or not.
std::uintptr_t const address = MpiWrapper::max( reinterpret_cast< std::uintptr_t >(collocatedNodes) );
if( address == 0 )
{
GEOS_LOG_RANK_0( " - " << faceMesh->GetPointData()->GetArrayName( i ) << " of type '" << faceMesh->GetPointData()->GetArray( i )->GetDataTypeAsString() << "'" );
GEOS_LOG_RANK_0( GEOS_FMT( "Available point data fields in '{}':", faceBlockName ) );
for( int i = 0; i < faceMesh->GetPointData()->GetNumberOfArrays(); ++i )
{
GEOS_LOG_RANK_0( GEOS_FMT( " - {} of type '{}'",
faceMesh->GetPointData()->GetArrayName( i ),
faceMesh->GetPointData()->GetArray( i )->GetDataTypeAsString() ) );
}
GEOS_ERROR( GEOS_FMT( "Could not find valid field \"{}\" for fracture \"{}\".",
COLLOCATED_NODES,
faceBlockName ) );
}
}
else
{
if( !collocatedNodes )
{
GEOS_LOG_RANK_0( GEOS_FMT( "Available point data fields in '{}':", faceBlockName ) );
for( int i = 0; i < faceMesh->GetPointData()->GetNumberOfArrays(); ++i )
{
GEOS_LOG_RANK_0( GEOS_FMT( " - {} of type '{}'",
faceMesh->GetPointData()->GetArrayName( i ),
faceMesh->GetPointData()->GetArray( i )->GetDataTypeAsString() ) );
}
GEOS_ERROR( GEOS_FMT( "Could not find valid field \"{}\" for fracture \"{}\" on this rank.",
COLLOCATED_NODES,
faceBlockName ) );
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CollocatedNodes::CollocatedNodes( string const & faceBlockName,
vtkSmartPointer< vtkDataSet > faceMesh )
vtkSmartPointer< vtkDataSet > faceMesh,
bool performGlobalCheck )
{
// The vtk field to the collocated nodes for fractures.
string const COLLOCATED_NODES = "collocated_nodes";
vtkIdTypeArray const * collocatedNodes = vtkIdTypeArray::FastDownCast( faceMesh->GetPointData()->GetArray( COLLOCATED_NODES.c_str() ) );
// Depending on the parallel split, the vtk face mesh may be empty on a rank.
// In that case, vtk will not provide any field for the emtpy mesh.
// Therefore, not finding the duplicated nodes field on a rank cannot be interpreted as a globally missing field.
// Converting the address into an integer and exchanging it through the MPI ranks let us find out
// if the field is globally missing or not.
std::uintptr_t const address = MpiWrapper::max( reinterpret_cast< std::uintptr_t >(collocatedNodes) );
if( address == 0 )
if( performGlobalCheck )
{
GEOS_LOG_RANK_0( "Available point data fields in '" << faceBlockName << "':" );
for( int i = 0; i < faceMesh->GetPointData()->GetNumberOfArrays(); ++i )
// Depending on the parallel split, the vtk face mesh may be empty on a rank.
// In that case, vtk will not provide any field for the empty mesh.
// Therefore, not finding the duplicated nodes field on a rank cannot be interpreted as a globally missing field.
// Converting the address into an integer and exchanging it through the MPI ranks let us find out
// if the field is globally missing or not.
std::uintptr_t const address = MpiWrapper::max( reinterpret_cast< std::uintptr_t >(collocatedNodes) );
if( address == 0 )
{
GEOS_LOG_RANK_0( " - " << faceMesh->GetPointData()->GetArrayName( i ) << " of type '" << faceMesh->GetPointData()->GetArray( i )->GetDataTypeAsString() << "'" );
GEOS_LOG_RANK_0( GEOS_FMT( "Available point data fields in '{}':", faceBlockName ) );
for( int i = 0; i < faceMesh->GetPointData()->GetNumberOfArrays(); ++i )
{
GEOS_LOG_RANK_0( GEOS_FMT( " - {} of type '{}'",
faceMesh->GetPointData()->GetArrayName( i ),
faceMesh->GetPointData()->GetArray( i )->GetDataTypeAsString() ) );
}
GEOS_ERROR( GEOS_FMT( "Could not find valid field \"{}\" for fracture \"{}\".",
COLLOCATED_NODES,
faceBlockName ) );
}
}
else
{
if( !collocatedNodes )
{
GEOS_LOG_RANK_0( GEOS_FMT( "Available point data fields in '{}':", faceBlockName ) );
for( int i = 0; i < faceMesh->GetPointData()->GetNumberOfArrays(); ++i )
{
GEOS_LOG_RANK_0( GEOS_FMT( " - {} of type '{}'",
faceMesh->GetPointData()->GetArrayName( i ),
faceMesh->GetPointData()->GetArray( i )->GetDataTypeAsString() ) );
}
GEOS_ERROR( GEOS_FMT( "Could not find valid field \"{}\" for fracture \"{}\" on this rank.",
COLLOCATED_NODES,
faceBlockName ) );
CollocatedNodes::CollocatedNodes( string const & faceBlockName,
vtkSmartPointer< vtkDataSet > faceMesh,
bool performGlobalCheck )
{
string const COLLOCATED_NODES = "collocated_nodes";
vtkIdTypeArray const * collocatedNodes = vtkIdTypeArray::FastDownCast(
faceMesh->GetPointData()->GetArray( COLLOCATED_NODES.c_str() ) );
// Determine whether the field is missing:
// - if globalCheck: reduce across ranks so an empty-mesh detection is any ranks
bool const isMissing = performGlobalCheck
? ( MpiWrapper::max( reinterpret_cast< std::uintptr_t >( collocatedNodes ) ) == 0 )
: ( collocatedNodes == nullptr );
if( isMissing )
{
GEOS_LOG_RANK_0( GEOS_FMT( "Available point data fields in '{}':", faceBlockName ) );
for( int i = 0; i < faceMesh->GetPointData()->GetNumberOfArrays(); ++i )
{
GEOS_LOG_RANK_0( GEOS_FMT( " - {} of type '{}'",
faceMesh->GetPointData()->GetArrayName( i ),
faceMesh->GetPointData()->GetArray( i )->GetDataTypeAsString() ) );
}
GEOS_ERROR( GEOS_FMT( "Could not find valid field \"{}\" for fracture \"{}\"{}.",
COLLOCATED_NODES,
faceBlockName,
performGlobalCheck ? "" : " on this rank" ) );
}
else
init( collocatedNodes );
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// Options: [use_defaults, log_level, seed, coupling]
// PARMETIS_PSR_UNCOUPLED = 0 (default - uses PartitionSmallGraph)
// PARMETIS_PSR_COUPLED = 1 (forces distributed algorithm)
idx_t options[4] = { 1, 0, 2022, 0 };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
idx_t options[4] = { 1, 0, 2022, 0 };
idx_t options[4] = { 1, 0, 2022, PARMETIS_PSR_UNCOUPLED };

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes


// GetCellAtId() is conditionally thread-safe, use POLICY argument
forAll< POLICY >( num3dCells, [&cells, &globalPointId, elemToNodes = elemToNodes.toView()] ( localIndex const cellIdx )
forAll< POLICY >( numCells, [&cells, &globalPointId, elemToNodes = elemToNodes.toView()] ( localIndex const cellIdx )
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might the need for passing cells through the interface and the IsShareable logic

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that part was refactored.
@jafranc, could you review the changes and confirm that they correctly addressed your comments.

Comment on lines +760 to +772
int myEdgeCount = static_cast< int >( myEdgesSrc.size() );

// Gather counts from all ranks
array1d< int > allEdgeCounts( numRanks );
MpiWrapper::allgather( &myEdgeCount, 1, allEdgeCounts.data(), 1, comm );

// Compute displacements for symmetrization
array1d< int > edgeDispls( numRanks + 1 );
edgeDispls[0] = 0;
for( int r = 0; r < numRanks; ++r )
{
edgeDispls[r + 1] = edgeDispls[r] + allEdgeCounts[r];
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int myEdgeCount = static_cast< int >( myEdgesSrc.size() );
// Gather counts from all ranks
array1d< int > allEdgeCounts( numRanks );
MpiWrapper::allgather( &myEdgeCount, 1, allEdgeCounts.data(), 1, comm );
// Compute displacements for symmetrization
array1d< int > edgeDispls( numRanks + 1 );
edgeDispls[0] = 0;
for( int r = 0; r < numRanks; ++r )
{
edgeDispls[r + 1] = edgeDispls[r] + allEdgeCounts[r];
}
int myEdgeCount = static_cast< int >( myEdgesSrc.size() );
// Gather counts from all ranks
array1d< int > allEdgeCounts( numRanks );
MpiWrapper::allgather( &myEdgeCount, 1, allEdgeCounts.data(), 1, comm );
// Compute displacements for symmetrization
array1d< int > edgeDispls( numRanks + 1 );
edgeDispls[0] = 0;
std::partial_sum(allEdgeCounts.begin(), allEdgeCounts.end(),
edgeDispls.begin() + 1);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Check 6: Graph symmetry for local edges
// -----------------------------------------------------------------------
{
std::unordered_set< uint64_t > localOutgoingNeighbors;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we refactor this duplication into a single locally declared function ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplication removed

// SECTION 5: PARTITION UNPACKING (after ParMETIS)
// =============================================================================================
array1d< int64_t >
unpackSuperCellPartitioning(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as packing/unpacking is rather specific in GEOS context, unless related, I would suggest renaming

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed expandSuperCellPartitioningToCells

// -----------------------------------------------------------------------
// Step 1: Build 3D cell connectivity graph via fractures
// -----------------------------------------------------------------------
std::map< vtkIdType, std::set< vtkIdType > > fractureGraph;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't the use of stdMap rather than std::map a directive ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, changed all of them in favor of StdContainerWrappers

Comment on lines +1007 to +1048
vtkIdType const numCells = cells3D->GetNumberOfCells();
array1d< int64_t > cellPartitioning( numCells );

for( vtkIdType i = 0; i < numCells; ++i )
{
vtkIdType superCellId = superCellIdArray->GetValue( i );
auto it = superCellIdToLocalIdx.find( superCellId );

GEOS_ERROR_IF( it == superCellIdToLocalIdx.end(),
GEOS_FMT( "Rank {}: Cell {} has unknown super-cell ID {}",
rank, i, superCellId ) );

cellPartitioning[i] = superPartitioning[it->second];
}

// -----------------------------------------------------------------------
// Step 3: Validate that super-cells weren't split
// -----------------------------------------------------------------------
stdMap< vtkIdType, std::set< int64_t > > superCellToRanks;

for( vtkIdType i = 0; i < numCells; ++i )
{
vtkIdType const scId = superCellIdArray->GetValue( i );
superCellToRanks.get_inserted( scId ).insert( cellPartitioning[i] );
}

vtkIdType numSplitSuperCells = 0;
for( auto const & [superCellId, ranks] : superCellToRanks )
{
if( ranks.size() > 1 )
{
++numSplitSuperCells;
}
}

vtkIdType totalSplitSuperCells = MpiWrapper::sum( numSplitSuperCells, comm );

GEOS_ERROR_IF( totalSplitSuperCells > 0,
GEOS_FMT( "Partitioning failed: {:L} super-cells were split across ranks!",
totalSplitSuperCells ) );

return cellPartitioning;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we group these do-check architecture into a single loop leveraging set's lookup insertion idiom (like above) and disjunction on the second value of the tuple ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci: run code coverage enables running of the code coverage CI jobs ci: run CUDA builds Allows to triggers (costly) CUDA jobs ci: run integrated tests Allows to run the integrated tests in GEOS CI type: bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants