Skip to content

HYPERFLEET-1238 - fix: paginate API responses to fetch all resources#185

Open
rafabene wants to merge 2 commits into
openshift-hyperfleet:mainfrom
rafabene:HYPERFLEET-1238-paginate-api-results
Open

HYPERFLEET-1238 - fix: paginate API responses to fetch all resources#185
rafabene wants to merge 2 commits into
openshift-hyperfleet:mainfrom
rafabene:HYPERFLEET-1238-paginate-api-results

Conversation

@rafabene

Copy link
Copy Markdown
Contributor

Summary

  • Bug: Sentinel only fetched the first page (20 resources) from the API, leaving clusters/nodepools beyond page 1 unreconciled
  • Fix: fetchClusters() and fetchNodePools() now iterate through all pages using page/pageSize query parameters, collecting every resource before returning
  • Refactor: Extracted repeated test string literals into named constants and helper functions (createMockCondition, createMockNodePool) to resolve all goconst lint warnings

Acceptance Criteria

  • Sentinel iterates through all pages of the API response when polling for resources
  • All clusters are evaluated regardless of page position
  • E2E test covers environments with more than one page of resources — Covered by unit tests: TestFetchResources_Pagination exercises 54 clusters across 3 pages against a real HTTP mock server, validating the full pagination loop. The pagination logic is entirely within the client layer, making E2E unnecessary for this fix.

Test Plan

  • TestFetchResources_Pagination — 54 clusters across 3 pages, verifies all resources collected and correct page requests made
  • TestFetchResources_PaginationSinglePage — verifies no extra requests when total fits in one page
  • TestFetchResources_PaginationErrorOnSecondPage — verifies errors on subsequent pages propagate correctly
  • TestFetchResources_PaginationSendsPageSize — verifies pageSize query parameter is sent
  • All existing tests pass (make test-unit)
  • make lint — 0 issues

@openshift-ci openshift-ci Bot requested review from aredenba-rh and mliptak0 June 19, 2026 16:33
@openshift-ci

openshift-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign aredenba-rh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: efe3bfa8-5b3e-426c-8105-277552e64ad0

📥 Commits

Reviewing files that changed from the base of the PR and between 19c5ad2 and 5059e6a.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • internal/client/client.go
  • internal/client/client_test.go
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • openshift-hyperfleet/architecture (manual)
  • openshift-hyperfleet/hyperfleet-api (manual)
  • openshift-hyperfleet/hyperfleet-sentinel (manual)
  • openshift-hyperfleet/hyperfleet-adapter (manual)
  • openshift-hyperfleet/hyperfleet-broker (manual)
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/client/client.go
  • internal/client/client_test.go

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Fixed API response pagination to fetch all pages, resolving missing reconciliation events in environments with more than 20 clusters or nodepools.
  • Tests

    • Expanded pagination test coverage for resource fetching across multiple API pages.

Walkthrough

The PR fixes a bug where Sentinel only fetched the first page (20 items) of clusters and nodepools from the HyperFleet API. fetchClusters and fetchNodePools are refactored into paginated loops using Page/PageSize query parameters, accumulating results until the API-reported Total is reached or a page returns empty. Error handling is expanded to return APIError with retry metadata for timeouts, nil response bodies, and HTTP status codes. Inline conversion logic is extracted into convertCluster and convertNodePool helpers. An exported DefaultPageSize int32 = 20 constant is introduced. Tests are updated to use shared constants and four new pagination-specific test cases are added.

Sequence Diagram(s)

sequenceDiagram
  participant FetchResources
  participant HyperFleetAPI as HyperFleet API
  participant fetchClusters
  participant convertCluster as convertCluster Helper
  
  FetchResources->>fetchClusters: call with Page=0, PageSize=20
  fetchClusters->>HyperFleetAPI: GET /clusters?page=0&size=20
  HyperFleetAPI-->>fetchClusters: {items:[...], total:65, page:0, size:20}
  fetchClusters->>convertCluster: convert 20 items
  convertCluster-->>fetchClusters: [Resource...]
  fetchClusters->>HyperFleetAPI: GET /clusters?page=1&size=20
  HyperFleetAPI-->>fetchClusters: {items:[...], total:65, page:1, size:20}
  fetchClusters->>convertCluster: convert 20 items
  convertCluster-->>fetchClusters: [Resource...]
  fetchClusters->>HyperFleetAPI: GET /clusters?page=2&size=20
  HyperFleetAPI-->>fetchClusters: {items:[...], total:65, page:2, size:20}
  fetchClusters->>convertCluster: convert 25 items (final page)
  convertCluster-->>fetchClusters: [Resource...]
  fetchClusters-->>FetchResources: accumulated 65 Resources
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 9 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
No Injection Vectors ⚠️ Warning CWE-89 query language injection: Label selector keys in labelSelectorToSearchString() are concatenated directly into TSL queries without escaping, allowing injection via user-controlled YAML config. Escape label selector keys with TSL-appropriate escaping (e.g., quote/backslash handling) before fmt.Sprintf, matching the escaping applied to values.
No Pii Or Sensitive Data In Logs ⚠️ Warning New debug logging at lines 171, 175 logs APIError objects containing raw error messages with embedded URLs. If the API endpoint URL contains embedded credentials (http://user:password@host), networ... Redact sensitive URL info: use url.Error.URL.Hostname() or extract non-credential parts, or log generic "network error" without the full error object's URL string representation.
✅ Passed checks (9 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: implementing pagination in API response fetching to retrieve all resources instead of just the first page.
Description check ✅ Passed The description clearly details the bug, fix, refactoring, and test coverage directly related to the pagination implementation in the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 89.47% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Sec-02: Secrets In Log Output ✅ Passed All log statements in non-test files (client.go) use safe data: pagination metrics (page, size, total), resourceType strings, and sanitized error messages (HTTP status codes, timeout status, networ...
No Hardcoded Secrets ✅ Passed No hardcoded secrets detected. All test constants (timestamps, paths, emails, filter strings) are legitimate fixtures. No API keys, tokens, passwords, private keys, or credential-like variables fou...
No Weak Cryptography ✅ Passed No weak cryptography detected. PR changes pagination logic without introducing crypto/md5, crypto/des, crypto/rc4, SHA1, custom crypto, or unsafe secret comparisons.
No Privileged Containers ✅ Passed PR modifies only Go application code (client.go, client_test.go, CHANGELOG.md) implementing pagination logic. No Kubernetes manifests, Helm templates, or Dockerfiles are modified; check is not appl...

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code

Comment @coderabbitai help to get the list of available commands and usage tips.

@hyperfleet-ci-bot

hyperfleet-ci-bot Bot commented Jun 19, 2026

Copy link
Copy Markdown

Risk Score: 2 — risk/medium

Signal Detail Points
PR size 800 lines (>500) +2
Sensitive paths none +0
Test coverage Tests cover changed packages +0

Computed by hyperfleet-risk-scorer

@rafabene rafabene force-pushed the HYPERFLEET-1238-paginate-api-results branch from 19c5ad2 to f93ef03 Compare June 19, 2026 16:39
…nings

Replace repeated string literals in client test fixtures with named
constants and helper functions (createMockCondition, createMockNodePool).
@rafabene rafabene force-pushed the HYPERFLEET-1238-paginate-api-results branch from f93ef03 to 80e704c Compare June 19, 2026 16:39

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/client/client_test.go (1)

471-505: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

TestFetchResources_NilStatus is not exercising a nil-status path (CWE-398).

Line [475] creates only a valid cluster, so this test currently validates a happy path while claiming degradation behavior for nil status.

Proposed test-fix diff
 func TestFetchResources_NilStatus(t *testing.T) {
 	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		// Note: With v1.0.0 spec, status is required, so this test will fail validation
-		// This test might need to be removed or modified since status is now required
-		cluster2 := createMockCluster("cluster-2")
+		cluster1 := createMockCluster("cluster-1")
+		delete(cluster1, keyStatus) // malformed payload to exercise graceful degradation
+		cluster2 := createMockCluster("cluster-2")
 		response := createMockClusterList([]map[string]interface{}{
+			cluster1,
 			cluster2,
 		})
@@
-	// Should skip cluster-1 with nil status, return only cluster-2
+	// Should skip cluster-1 with nil status, return only cluster-2
 	if len(resources) != 1 {
 		t.Fatalf("Expected 1 resource (nil status skipped), got %d", len(resources))
 	}

As per coding guidelines, “Error paths SHOULD be tested, not just happy paths.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/client/client_test.go` around lines 471 - 505, The
TestFetchResources_NilStatus test is not actually exercising the nil-status path
it claims to test. The test currently creates only a valid cluster using
createMockCluster("cluster-2"), which has a valid status by default. To properly
test graceful degradation, modify the mock response to include two clusters: one
with a valid status and one with an explicitly nil status field. This way the
test will verify that the FetchResources method correctly skips the cluster with
nil status while returning the valid cluster, matching the test's assertion that
only 1 resource should be returned.

Source: Coding guidelines

🧹 Nitpick comments (1)
internal/client/client_test.go (1)

1132-1275: ⚡ Quick win

Pagination coverage is cluster-only; nodepool pagination path is unverified (CWE-398).

These new tests validate only ResourceTypeClusters, but the same PR changes pagination behavior for nodepools too (fetchNodePools loop + page/pageSize contract). Add a mirrored nodepool pagination case (ideally table-driven by resource type) to prevent one-sided regressions.

As per coding guidelines, “New exported functions and critical logic paths SHOULD have tests” and “Table-driven tests with t.Run() for repeated patterns.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/client/client_test.go` around lines 1132 - 1275, The pagination
tests currently only validate ResourceTypeClusters, but the code changes also
affect nodepool pagination with the same page/pageSize contract. Refactor the
existing pagination tests (TestFetchResources_Pagination,
TestFetchResources_PaginationSinglePage,
TestFetchResources_PaginationErrorOnSecondPage, and
TestFetchResources_PaginationSendsPageSize) to use table-driven test patterns
with t.Run() that cover both ResourceTypeClusters and ResourceTypeNodePools to
ensure pagination behavior is properly verified for all affected resource types
and prevent one-sided regressions.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/client/client_test.go`:
- Around line 1218-1248: The test function
TestFetchResources_PaginationErrorOnSecondPage verifies that an error is
returned when pagination fails on the second page, but it does not validate that
no partial results are included in the response. After the FetchResources call
returns with an error, add an assertion to verify that the resources value is
nil or empty, ensuring that partial data is not returned alongside the error.
This hardening prevents the test from passing even if the implementation
incorrectly returns partial results on pagination failure.

In `@internal/client/client.go`:
- Around line 295-309: The code currently marks all network errors as retriable,
but context.Canceled and context.DeadlineExceeded errors should not be retriable
as they indicate intentional shutdown or timeout paths. In the error handling
blocks around lines 295-309 and 423-437, add explicit checks for
context.Canceled and context.DeadlineExceeded using errors.Is before the
existing url.Error checks. For these context errors, return an APIError with
Retriable set to false to prevent retries during shutdown or deadline scenarios.
Keep the existing logic for actual network timeouts as retriable.
- Around line 332-334: The code uses resourceList.Total directly as a capacity
hint in make() calls without validation, which can cause panics if the value is
negative or excessive memory allocation if extremely large. Add validation to
check that resourceList.Total is non-negative and within acceptable bounds
before using it for slice capacity in the make() statements. Apply this
validation at all locations where resourceList.Total is used for slice
allocation: the initial allResources assignment at line 333, the additional
assignments at lines 343-344, and the similar patterns at lines 460-462 and
471-472. If the Total value fails validation, use a safe default capacity (such
as 0) instead of the untrusted value.
- Around line 511-523: Add a nil guard before dereferencing ownerRef since it is
assigned from item.OwnerReferences which is a pointer type that could be nil
from an API response. Wrap the entire block that accesses ownerRef.Id,
ownerRef.Href, and ownerRef.Kind (the three conditional checks that dereference
the pointer fields) inside an if ownerRef != nil check to prevent a panic on
null pointer dereference.

---

Outside diff comments:
In `@internal/client/client_test.go`:
- Around line 471-505: The TestFetchResources_NilStatus test is not actually
exercising the nil-status path it claims to test. The test currently creates
only a valid cluster using createMockCluster("cluster-2"), which has a valid
status by default. To properly test graceful degradation, modify the mock
response to include two clusters: one with a valid status and one with an
explicitly nil status field. This way the test will verify that the
FetchResources method correctly skips the cluster with nil status while
returning the valid cluster, matching the test's assertion that only 1 resource
should be returned.

---

Nitpick comments:
In `@internal/client/client_test.go`:
- Around line 1132-1275: The pagination tests currently only validate
ResourceTypeClusters, but the code changes also affect nodepool pagination with
the same page/pageSize contract. Refactor the existing pagination tests
(TestFetchResources_Pagination, TestFetchResources_PaginationSinglePage,
TestFetchResources_PaginationErrorOnSecondPage, and
TestFetchResources_PaginationSendsPageSize) to use table-driven test patterns
with t.Run() that cover both ResourceTypeClusters and ResourceTypeNodePools to
ensure pagination behavior is properly verified for all affected resource types
and prevent one-sided regressions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 2304dd11-31c7-463a-bc54-e80a70835099

📥 Commits

Reviewing files that changed from the base of the PR and between 9aee48e and 19c5ad2.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • internal/client/client.go
  • internal/client/client_test.go
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • openshift-hyperfleet/architecture (manual)
  • openshift-hyperfleet/hyperfleet-api (manual)
  • openshift-hyperfleet/hyperfleet-sentinel (manual)
  • openshift-hyperfleet/hyperfleet-adapter (manual)
  • openshift-hyperfleet/hyperfleet-broker (manual)

Comment thread internal/client/client_test.go
Comment thread internal/client/client.go
Comment thread internal/client/client.go
Comment thread internal/client/client.go
@rafabene rafabene force-pushed the HYPERFLEET-1238-paginate-api-results branch from 80e704c to 7fe0dd1 Compare June 19, 2026 16:43
Sentinel was only fetching the first page of API results (default
size 20), leaving clusters beyond that window unreconciled. Now
iterates through all pages using the page/pageSize query parameters
and the total field from the response.
@rafabene rafabene force-pushed the HYPERFLEET-1238-paginate-api-results branch from 7fe0dd1 to 5059e6a Compare June 19, 2026 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant