Skip to content

Metrics failing for low volume clusters #800

@geekmush

Description

@geekmush

We have deployed multiple clusters with chart 0.27.0 and noticed some of them were failing to populate the Grafana dashboard. After much digging, we were finding some DB pods returning a 500 http result instead of the expected 200. This was due to the low volume (or not used at all yet), so the query was returning results for db:postgres and db:app and causing a duplicate result collision and freaking out the metrics exporter.

Further digging found:

The pg_stat_statements_top query defined in the Helm chart's monitoringQueriesConfigMap (values.yaml) causes HTTP 500 errors on the metrics endpoint due to duplicate Prometheus label combinations.

charts/cloudnative-pg/values.yaml → monitoringQueriesConfigMap.queries → pg_stat_statements_top

Default values query:

pg_stat_statements_top:
  query: |
    SELECT
      current_database() AS datname,
      queryid::text AS queryid,
      LEFT(query, 100) AS query,
      calls,
      total_exec_time / 1000.0 AS total_time_seconds,
      mean_exec_time / 1000.0 AS mean_time_seconds,
      rows
    FROM pg_stat_statements
    WHERE query NOT LIKE '%pg_stat_statements%'
    ORDER BY total_exec_time DESC
    LIMIT 20
  target_databases:
    - '*'

Two issues combine to cause duplicate Prometheus labels:

1 - Missing dbid filter: With target_databases: ['*'], the query runs against all databases. Since pg_stat_statements is cluster-wide, the same rows are returned regardless of which database is queried, causing duplicate metrics with the same queryid but different datname values.

2 - Missing aggregation: In PostgreSQL, pg_stat_statements uses composite key (userid, dbid, queryid, toplevel). The same query executed by different users creates separate rows with identical queryid. Without GROUP BY, these produce duplicate label combinations.

What we did to fix was change the query in the values file to:

pg_stat_statements_top:
  query: |
    SELECT
      current_database() AS datname,
      queryid::text AS queryid,
      LEFT(query, 100) AS query,
      SUM(calls)::bigint AS calls,
      SUM(total_exec_time) / 1000.0 AS total_time_seconds,
      SUM(total_exec_time) / NULLIF(SUM(calls), 0) / 1000.0 AS mean_time_seconds,
      SUM(rows)::bigint AS rows
    FROM pg_stat_statements
    WHERE dbid = (SELECT oid FROM pg_database WHERE datname = current_database())
      AND query NOT LIKE '%pg_stat_statements%'
    GROUP BY queryid, query
    ORDER BY SUM(total_exec_time) DESC
    LIMIT 20
  target_databases:
    - '*'

Summary of Changes:

  • Added dbid filter to scope results to the current database
  • Added GROUP BY queryid, query with SUM() aggregation for multi-user queries
  • Fixed mean calculation using SUM(total_exec_time) / NULLIF(SUM(calls), 0)
  • Added ::bigint casts to maintain consistent types

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingchart( cluster )Related to the cluster chart

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions