SRE - Weekly report #1755

@tales-aparecida

Description

Summarize usage to indicate how well the Dashboard is behaving.

Basically, take the data from https://mon.kernelci.org/public-dashboards/715f7faddb014b0e99fd025f4ae19a7a?from=now-1h&to=now&timezone=browser&var-summary_mode=range and format it as an email to the Working Group.
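As a rough sketch of the "format as an email" step, assuming the summary lines have already been collected from the dashboard by some other means. The function name `build_report_email`, the subject line, and the example.org addresses are hypothetical placeholders, not anything defined by the project. The sketch only builds the message; sending it is left out.

```python
from email.message import EmailMessage


def build_report_email(sender, recipient, lines):
    """Build a plain-text report email from pre-computed summary lines.

    All arguments are assumptions for illustration: `lines` is a list of
    strings, one per metric to report.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Dashboard weekly report"  # hypothetical subject
    msg.set_content("\n".join(lines) + "\n")
    return msg


# Placeholder addresses and placeholder text, only to show the call shape.
report = build_report_email(
    "sre@example.org",
    "working-group@example.org",
    ["Requests Count by Response Status Code:", "(summary goes here)"],
)
```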

If possible we'd like to see at least "Requests Count" grouped by "Response Status Code", which should give us a rough idea of failed requests, including timeouts.
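The grouping described above could look roughly like this, assuming the per-status request counts are available as a plain mapping. The function name `summarize_status_counts` and the sample numbers are made up for illustration, and counting only 5xx responses as failures is an assumption that may need adjusting.

```python
from collections import Counter


def summarize_status_counts(counts):
    """Group request counts by status class and compute the 5xx share.

    `counts` maps an HTTP status code (int) to the number of requests
    that returned it. This input shape is an assumption.
    """
    by_class = Counter()
    for code, n in counts.items():
        by_class[f"{code // 100}xx"] += n
    total = sum(by_class.values())
    failed = by_class.get("5xx", 0)  # assumption: only 5xx counts as failed
    failure_rate = failed / total if total else 0.0
    return dict(by_class), failure_rate


# Made-up numbers, only to show the call shape.
sample = {200: 950, 404: 30, 500: 15, 504: 5}
by_class, failure_rate = summarize_status_counts(sample)
```

Gateway timeouts (504) fall into the 5xx class here, so they are included in the failure share.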

Ideally, we should also get the uptime, to see how often the server crashed or was otherwise offline during release deployments. That might require a periodic Prometheus exporter hitting the "status" endpoint.
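A minimal sketch of how uptime could be derived from such periodic checks, assuming each probe of the "status" endpoint is recorded as a boolean at a fixed interval. The function names and the sample data are hypothetical; the probing itself is not shown.

```python
def uptime_percent(probes):
    """Return the share of successful probes as a percentage.

    `probes` is a list of booleans, one per periodic check
    (True = the endpoint responded OK). Returns None when empty.
    """
    if not probes:
        return None
    return 100.0 * sum(probes) / len(probes)


def count_outages(probes):
    """Count up-to-down transitions, i.e. distinct outage episodes."""
    outages = 0
    previous_up = True
    for up in probes:
        if previous_up and not up:
            outages += 1
        previous_up = up
    return outages


# Made-up probe history, only to show the call shape.
sample_probes = [True, True, False, False, True, True, True, False, True, True]
```

Counting episodes separately from the percentage helps distinguish one long outage from several short ones, such as brief restarts during deployments.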

If this can be achieved directly by Grafana or Prometheus, it is an acceptable solution.

Metadata

Assignees

No one assigned

Labels

Backend: Most or all of the changes for this issue will be in the backend code.
Metrics: Related to open metrics, measurements or usage data
enhancement: New feature or request
