Conversation
variable "event_publishing_anomaly_band_width" {
  type        = number
  description = "The width of the anomaly detection band. Higher values (e.g. 4-6) reduce sensitivity and noise, lower values (e.g. 2-3) increase sensitivity. Recommended: 2-4."
  default     = 5
I think 5 might be too high? From what I understand, the band width is measured in standard deviations, and 5 standard deviations covers roughly 99.9999% of normally distributed data.
I thought that too. I've re-run the numbers for prod, and 4 is probably better in production:
nhs-main-supapi-eventpub
Total messages: 554124
Average/hour: 4617.70
Max/hour: 57772
Min/hour: 1
Hours with traffic: 120 / 120
Coefficient of variation (CV): 181.00%
I've changed it to 4; that's probably better for what's there now.
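For reference, the coverage figures this thread reasons from can be worked out as below. This assumes hourly counts are normally distributed, which is only a rough guide here: a CV of 181% suggests heavily skewed traffic, and the CloudWatch band is derived from a trained model rather than a plain mean and standard deviation.

```latex
% Fraction of a normal distribution within k standard deviations of the mean
P\bigl(|X - \mu| \le k\sigma\bigr) = \operatorname{erf}\!\left(\frac{k}{\sqrt{2}}\right)

\begin{aligned}
k = 2 &: \quad \approx 95.45\% \\
k = 3 &: \quad \approx 99.73\% \\
k = 4 &: \quad \approx 99.9937\% \\
k = 5 &: \quad \approx 99.99994\%
\end{aligned}
```

Under that assumption, moving from 5 to 4 still leaves a very wide band, so the change trades little extra noise for noticeably more sensitivity.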
alarm_name          = "${local.csi}-subscriber-anomaly"
alarm_description   = "ANOMALY: Detects anomalous patterns in messages published to the SNS fanout topic"
comparison_operator = "LessThanLowerOrGreaterThanUpperThreshold"
About LessThanLower: do we want to alert on drops too? I think there'll be periods when it's not in use?
I believe the way the threshold works would be fine. We do want alerts when messages unexpectedly vanish. Something being updated somewhere and events just stopping is the case I'm most worried about.
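As a rough illustration of how the two-sided comparison and the band width fit together, here is a minimal sketch of an anomaly alarm on an SNS topic. It is not the module code from this PR: the variable names, resource name, alarm name, period, and evaluation periods are all assumptions chosen for the example.

```hcl
# Hypothetical inputs; names are illustrative and not taken from this PR.
variable "topic_name" {
  type        = string
  description = "Name of the SNS topic to watch."
}

variable "band_width" {
  type        = number
  description = "Anomaly band width, in standard deviations."
  default     = 4
}

resource "aws_cloudwatch_metric_alarm" "anomaly_sketch" {
  alarm_name        = "example-subscriber-anomaly" # assumed name
  alarm_description = "Sketch: alarms when publishes leave the expected band in either direction"

  # Fires on both unexpected spikes and unexpected drops.
  comparison_operator = "LessThanLowerOrGreaterThanUpperThreshold"
  evaluation_periods  = 2 # assumption: two consecutive breaching periods
  threshold_metric_id = "ad1"

  # The raw metric being modelled.
  metric_query {
    id          = "m1"
    return_data = true

    metric {
      namespace   = "AWS/SNS"
      metric_name = "NumberOfMessagesPublished"
      period      = 300 # assumption: 5-minute buckets
      stat        = "Sum"

      dimensions = {
        TopicName = var.topic_name
      }
    }
  }

  # The expected band around m1; the second argument is the band width.
  metric_query {
    id          = "ad1"
    expression  = "ANOMALY_DETECTION_BAND(m1, ${var.band_width})"
    label       = "Expected range"
    return_data = true
  }
}
```

For the "events just stopping" case, how missing datapoints are treated (the `treat_missing_data` argument) may matter as much as the lower threshold, since a silent topic might emit no datapoints at all rather than low ones.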
infrastructure/terraform/modules/eventsub/cloudwatch_metric_alarm_subscriber_anomaly.tf
validation {
  condition     = var.event_anomaly_band_width >= 2 && var.event_anomaly_band_width <= 10
  error_message = "Band width must be between 2 and 10"
}
Needed here? Is the anomaly threshold particularly important here? Again, I think 5 is already big
Maybe not especially, but I've tried to keep this consistent across the bounded contexts. It's probably more of a thing for the publisher really.
Agreed on the 5 being a bit high.
Description
Adding Anomaly alarms for Event Subscriptions
Bumping EventPub module to add Anomaly alarms for Event Publishing