use current node load estimation when placing search jobs by trinity-1686a · Pull Request #6390 · quickwit-oss/quickwit

trinity-1686a · 2026-05-06T10:08:43Z

When placing jobs, first query all nodes for their current load, and bias placement toward less loaded nodes to even out load across the cluster.
This addresses a problem where some nodes are overloaded while others are underloaded, causing queueing even though not every node is at max capacity.
In testing, this slightly improved latency at p50 and above under constant light load, and increased the max QPS reached before latency explodes. Metrics also showed all searchers busy, where previously some were only working part of the time.
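
To make the idea concrete, here is a minimal sketch of load-biased placement. It is not the placer from this PR: `CandidateNode` and `place_jobs` are hypothetical names, split affinity is ignored, and nodes with unknown load are only used when no node reported a load at all (a simplification of the fallback described in the diff comments).

```rust
/// Hypothetical, simplified view of a searcher node during placement.
#[derive(Debug, Clone)]
struct CandidateNode {
    /// Load reported by the node, or `None` if it failed to answer.
    load: Option<usize>,
}

/// Returns, for each job cost, the index of the node chosen for that job.
/// Each job goes to the node with the lowest running load (ties go to the
/// lowest node index), and the job cost is then added to that node's load.
fn place_jobs(nodes: &[CandidateNode], job_costs: &[usize]) -> Vec<usize> {
    // Keep (node index, running load) for nodes that reported a load.
    let mut candidates: Vec<(usize, usize)> = nodes
        .iter()
        .enumerate()
        .filter_map(|(idx, node)| node.load.map(|load| (idx, load)))
        .collect();
    // If nobody answered, fall back to all nodes and treat them as idle.
    if candidates.is_empty() {
        candidates = (0..nodes.len()).map(|idx| (idx, 0)).collect();
    }
    // No node at all: nothing can be placed.
    if candidates.is_empty() {
        return Vec::new();
    }
    let mut assignment = Vec::with_capacity(job_costs.len());
    for &cost in job_costs {
        // Find the candidate with the lowest running load.
        let mut best = 0;
        for slot in 1..candidates.len() {
            if candidates[slot].1 < candidates[best].1 {
                best = slot;
            }
        }
        candidates[best].1 += cost;
        assignment.push(candidates[best].0);
    }
    assignment
}
```

With nodes reporting loads of 10, 0 and "unknown", three jobs of cost 5 first fill the idle node until it catches up with the busier one, and the silent node receives nothing.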

Future improvements:

- We could debounce calls to GetLoad.
- fetch_docs could reuse the same searcher that was used in the leaf_search phase. This guarantees a footer-cache hit and respects pre-existing load without needing more GetLoad calls on the critical path.

PSeitz · 2026-05-07T14:20:15Z

-fn compute_split_cost(split_metadata: &SplitMetadata) -> usize {
+pub(crate) fn compute_split_cost(num_docs: u64) -> usize {
    // TODO this formula could be tuned a lot more. The general idea is that there is a fixed
    // cost to searching a split, plus a somewhat-linear cost depending on the size of the split


This should account for whether we have aggregations, or ideally use a cost derived from the query.

I'm not sure how to factor that in exactly; we don't have a good cost model for aggregations.

A simple count() group by X on a low-cardinality field usually isn't very expensive, but a cardinality(Y) group by Z where both are high-cardinality is much more expensive, though I have no idea by how much. This also depends on the selectivity of the query.

We could probably get a very rough estimate from the query AST. Can you add a TODO to eventually track aggregation cost and estimate query selectivity?
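
As an illustration of the "fixed cost plus roughly linear cost" shape discussed here, a minimal sketch follows. The signature matches the diff above, but the body is an assumption: the fixed cost of 5 is taken from the constant mentioned later in this review, and `DOCS_PER_COST_UNIT` is a hypothetical scaling factor, not a value confirmed by the PR.

```rust
/// Fixed cost of opening and searching any split (the constant 5 mentioned
/// in this review; treated here as an assumption).
const FIXED_SPLIT_COST: usize = 5;

/// Hypothetical scaling factor: one extra cost unit per this many documents.
const DOCS_PER_COST_UNIT: u64 = 100_000;

/// Sketch of a split cost: a fixed part plus a part linear in the split size.
pub(crate) fn compute_split_cost(num_docs: u64) -> usize {
    // TODO: eventually track aggregation cost and estimate query selectivity.
    FIXED_SPLIT_COST + (num_docs / DOCS_PER_COST_UNIT) as usize
}
```

Under these assumed constants, an empty split costs 5 and a split with one million documents costs 15. The cost is independent of the query, which is exactly the limitation raised in this thread.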

PSeitz · 2026-05-11T11:06:13Z

+            // Seed each candidate node with its current load so the placer avoids
+            // routing work to already-loaded nodes. If a node fails to report its
+            // load (error or timeout), `load` stays `None`: we still route work
+            // there if all other nodes are overloaded, but we prefer reachable


Maybe we should return an error instead if every node is either unreachable or overloaded.
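
A minimal sketch of what such a check could look like, purely as an illustration of the suggestion. `PlacementError`, `check_capacity` and the `max_load` threshold are all hypothetical; the PR does not define what "overloaded" means numerically.

```rust
/// Hypothetical error type for a placement that cannot succeed.
#[derive(Debug, PartialEq)]
enum PlacementError {
    /// No node reported its load (errors or timeouts everywhere).
    NoReachableNode,
    /// Every node that answered is at or above the load threshold.
    AllNodesOverloaded,
}

/// Succeeds as soon as one reachable node is below `max_load`.
/// `loads` holds one entry per node, `None` meaning "did not answer".
fn check_capacity(loads: &[Option<usize>], max_load: usize) -> Result<(), PlacementError> {
    let mut any_reachable = false;
    // `flatten` skips the `None` entries, i.e. the unreachable nodes.
    for load in loads.iter().flatten() {
        any_reachable = true;
        if *load < max_load {
            return Ok(());
        }
    }
    if any_reachable {
        Err(PlacementError::AllNodesOverloaded)
    } else {
        Err(PlacementError::NoReachableNode)
    }
}
```

Distinguishing the two failure modes lets a caller report "cluster unreachable" separately from "cluster saturated", which may call for different handling.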

Platane · 2026-05-22T09:59:31Z

    }
+
+    async fn get_load(&self) -> usize {
+        self.searcher_context.search_permit_provider.get_load()


Would it be useful to add a metric to track the load per node?
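
For context, the commit list mentions using an atomic to maintain the pending job cost. The sketch below shows one way such a counter could work, and a per-node metric could simply read the same value. `LoadTracker` and `LoadGuard` are hypothetical names, not the types used by `search_permit_provider`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical tracker for the total cost of jobs pending on this node.
#[derive(Clone, Default)]
struct LoadTracker {
    pending_cost: Arc<AtomicUsize>,
}

/// Guard that removes its job cost from the tracker when dropped.
struct LoadGuard {
    pending_cost: Arc<AtomicUsize>,
    cost: usize,
}

impl LoadTracker {
    /// Registers a job and returns a guard tied to its lifetime.
    fn start_job(&self, cost: usize) -> LoadGuard {
        self.pending_cost.fetch_add(cost, Ordering::Relaxed);
        LoadGuard {
            pending_cost: Arc::clone(&self.pending_cost),
            cost,
        }
    }

    /// Current load; this is the value a gauge metric could export.
    fn get_load(&self) -> usize {
        self.pending_cost.load(Ordering::Relaxed)
    }
}

impl Drop for LoadGuard {
    fn drop(&mut self) {
        // The job is finished or cancelled: give its cost back.
        self.pending_cost.fetch_sub(self.cost, Ordering::Relaxed);
    }
}
```

Tying the decrement to `Drop` means a cancelled or panicking job still releases its cost, so the reported load cannot drift upward over time.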

PSeitz · 2026-05-22T11:59:08Z

+            const GET_LOAD_TIMEOUT: Duration = Duration::from_millis(200);
+            let load_futures = candidate_nodes.iter_mut().map(|node| {
+                let mut client = node.client.clone();
+                async move { tokio::time::timeout(GET_LOAD_TIMEOUT, client.get_load()).await }


I wonder if we should cache the load for 1 second to reduce traffic.
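
A minimal sketch of such a cache, assuming one cached value per remote node. `CachedLoad` is a hypothetical type, and the fetch is modelled as a synchronous closure for brevity, whereas the real `get_load` call is asynchronous. The current time is passed in explicitly so the behaviour is easy to test.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-node cache for the last reported load.
struct CachedLoad {
    ttl: Duration,
    /// Time of the last fetch and the load it returned.
    entry: Option<(Instant, usize)>,
}

impl CachedLoad {
    fn new(ttl: Duration) -> Self {
        CachedLoad { ttl, entry: None }
    }

    /// Returns the cached load if it is younger than `ttl`,
    /// otherwise calls `fetch` and stores the fresh value.
    fn get_or_refresh(&mut self, now: Instant, fetch: impl FnOnce() -> usize) -> usize {
        if let Some((fetched_at, load)) = self.entry {
            if now.duration_since(fetched_at) < self.ttl {
                return load;
            }
        }
        let load = fetch();
        self.entry = Some((now, load));
        load
    }
}
```

The trade-off is staleness: during a burst, several placements within the same second would all see the same load and could pile onto the same node, so the time-to-live would need to stay short relative to typical query duration.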

PSeitz · 2026-05-22T13:06:36Z


        let total_load: usize = jobs.iter().map(|job| job.cost()).sum();

+        // Compute `target_load` using only reachable nodes (those with a known


Braindump:

I think there's a risk with regard to caching.
Assume node A has the highest affinity for split X.
=> We can assume node A has its caches filled for split X.

Scenario: high load.
node A = 110, node B = 100.
=> We end up putting queries for split X on node B. This has a higher cost because node B needs to fill its caches for split X (2x query cost?). The formula we use always adds a constant 5 to fill caches.
This may cause cache evictions on node B, which increases the real cost further.
=> higher total load on the cluster

The query will take longer on node B, which will therefore report the cost for longer, so this has self-correcting properties.

We can expect some splits to be queried more often because they are newer, so we want more nodes serving them (for some time), and this goes in the right direction.
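
The scenario above can be worked through numerically. The sketch below is only an illustration of the reasoning, not something the PR implements: `choose_node` and `Choice` are hypothetical, and the cold-cache multiplier of 2 is the reviewer's own guess ("2x query cost?").

```rust
/// Which node a job ends up on in this two-node illustration.
#[derive(Debug, PartialEq)]
enum Choice {
    AffinityNode,
    OtherNode,
}

/// Compares the resulting load on the node with warm caches against the
/// resulting load on another node, where the job is assumed to cost
/// `cold_cache_multiplier` times more because caches must be filled.
fn choose_node(
    affinity_load: usize,
    other_load: usize,
    job_cost: usize,
    cold_cache_multiplier: usize,
) -> Choice {
    let load_if_affinity = affinity_load + job_cost;
    let load_if_other = other_load + job_cost * cold_cache_multiplier;
    if load_if_affinity <= load_if_other {
        Choice::AffinityNode
    } else {
        Choice::OtherNode
    }
}
```

With node A at 110, node B at 100 and a hypothetical job cost of 20, ignoring the cache penalty (multiplier 1) gives 130 on A versus 120 on B, so the job moves to B. With a multiplier of 2, B would reach 140 and keeping the job on A is cheaper. A large imbalance (A at 200) still moves the job to B under either assumption.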

trinity-1686a added 8 commits May 5, 2026 17:03

estimate current node load

8e6c3c6

expose node load as grpc endpoint

a5ef779

take existing load into account when placing jobs

fc11d4c

ignore node with very high load when computing target load

186765b

handle unimplemented get_load

27b408d

improve test

f32d6b5

use atomic instead of actor to maintain/query pending job cost

c71136e

don't run a GetLoad call for fetch docs jobs

2b01d92

PSeitz reviewed May 7, 2026

PSeitz reviewed May 11, 2026

Platane reviewed May 22, 2026

PSeitz reviewed May 22, 2026

Merge branch 'main' into trinity.pointard/placer-consider-load

03cead4

trinity-1686a requested a review from a team as a code owner May 22, 2026 13:19

complete TODO on compute_split_cost

c16bf3b

trinity-1686a wants to merge 10 commits into main from trinity.pointard/placer-consider-load
