Skip to content

CCM-14044 Setting prod defaults#439

Open
aidenvaines-cgi wants to merge 9 commits intomainfrom
CCM-14044_AddingAnomalyAlarm
Open

CCM-14044 Setting prod defaults#439
aidenvaines-cgi wants to merge 9 commits intomainfrom
CCM-14044_AddingAnomalyAlarm

Conversation

@aidenvaines-cgi
Copy link
Contributor

Description

Adding Anomaly alarms for Event Subscriptions
Bumping EventPub module to add Anomaly alarms for Event Publishing

Context

Type of changes

  • Refactoring (non-breaking change)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would change existing functionality)
  • Bug fix (non-breaking change which fixes an issue)

Checklist

  • I am familiar with the contributing guidelines
  • I have followed the code style of the project
  • I have added tests to cover my changes
  • I have updated the documentation accordingly
  • This PR is a result of pair or mob programming
  • If I have used the 'skip-trivy-package' label I have done so responsibly and in the knowledge that this is being fixed as part of a separate ticket/PR.

Sensitive Information Declaration

To ensure the utmost confidentiality and protect your and others privacy, we kindly ask you to NOT including PII (Personal Identifiable Information) / PID (Personal Identifiable Data) or any other sensitive data in this PR (Pull Request) and the codebase changes. We will remove any PR that do contain any sensitive information. We really appreciate your cooperation in this matter.

  • I confirm that neither PII/PID nor sensitive data are included in this PR and the codebase changes.

@aidenvaines-cgi aidenvaines-cgi requested a review from a team as a code owner March 3, 2026 16:21
sidnhs
sidnhs previously approved these changes Mar 4, 2026
variable "event_publishing_anomaly_band_width" {
type = number
description = "The width of the anomaly detection band. Higher values (e.g. 4-6) reduce sensitivity and noise, lower values (e.g. 2-3) increase sensitivity. Recommended: 2-4."
default = 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 5 might be too high? From what I understand it's based on standard deviation. 5 standard deviations is like 99.9999% of all data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought that too, I've re-run the numbers for prod and 4 is probably better in production
nhs-main-supapi-eventpub
Total messages: 554124
Average/hour: 4617.70
Max/hour: 57772
Min/hour: 1
Hours with traffic: 120 / 120
Coef Variability (CV): 181.00%

I've changed it to 4, thats probably better for whats there now


alarm_name = "${local.csi}-subscriber-anomaly"
alarm_description = "ANOMALY: Detects anomalous patterns in messages published to the SNS fanout topic"
comparison_operator = "LessThanLowerOrGreaterThanUpperThreshold"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About LessThanLower - do we want to alert on drops too? I think there'll be periods when not in use?

Copy link
Contributor Author

@aidenvaines-cgi aidenvaines-cgi Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe the way the threshold should work would be fine. we do want alerts when we get messages just unespectedly vanish. Something being updated somewhere and events just stopping is the case im most worried with

Comment on lines +109 to +112
validation {
condition = var.event_anomaly_band_width >= 2 && var.event_anomaly_band_width <= 10
error_message = "Band width must be between 2 and 10"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed here? Is the anomaly threshold particularly important here? Again, I think 5 is already big

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe not especially, but ive tried to keep this consistent across the bounded contexts. probably more of a thing for the publisher really.
ageed on the 5 being a bit high

@aidenvaines-cgi aidenvaines-cgi enabled auto-merge (squash) March 5, 2026 10:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants