Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 10 additions & 6 deletions infrastructure/terraform/components/api/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,9 +20,13 @@ No requirements.
| <a name="input_enable_alarms"></a> [enable\_alarms](#input\_enable\_alarms) | Enable CloudWatch alarms for this deployed environment | `bool` | `true` | no |
| <a name="input_enable_api_data_trace"></a> [enable\_api\_data\_trace](#input\_enable\_api\_data\_trace) | Enable API Gateway data trace logging | `bool` | `false` | no |
| <a name="input_enable_backups"></a> [enable\_backups](#input\_enable\_backups) | Enable backups | `bool` | `false` | no |
| <a name="input_enable_event_anomaly_detection"></a> [enable\_event\_anomaly\_detection](#input\_enable\_event\_anomaly\_detection) | Enable CloudWatch anomaly detection alarm for SNS message Detects abnormal drops or spikes in event publishing volume. | `bool` | `true` | no |
| <a name="input_enable_event_cache"></a> [enable\_event\_cache](#input\_enable\_event\_cache) | Enable caching of events to an S3 bucket | `bool` | `true` | no |
| <a name="input_enable_sns_delivery_logging"></a> [enable\_sns\_delivery\_logging](#input\_enable\_sns\_delivery\_logging) | Enable SNS Delivery Failure Notifications | `bool` | `true` | no |
| <a name="input_environment"></a> [environment](#input\_environment) | The name of the tfscaffold environment | `string` | n/a | yes |
| <a name="input_event_anomaly_band_width"></a> [event\_anomaly\_band\_width](#input\_event\_anomaly\_band\_width) | The width of the anomaly detection band. Higher values (e.g. 4-6) reduce sensitivity and noise, lower values (e.g. 2-3) increase sensitivity. Recommended: 2-4. | `number` | `4` | no |
| <a name="input_event_anomaly_evaluation_periods"></a> [event\_anomaly\_evaluation\_periods](#input\_event\_anomaly\_evaluation\_periods) | Number of evaluation periods for the anomaly alarm. Each period is defined by event\_anomaly\_period. | `number` | `3` | no |
| <a name="input_event_anomaly_period"></a> [event\_anomaly\_period](#input\_event\_anomaly\_period) | The period in seconds over which the specified statistic is applied for anomaly detection. Minimum 300 seconds (5 minutes). Recommended: 300-600. | `number` | `300` | no |
| <a name="input_eventpub_control_plane_bus_arn"></a> [eventpub\_control\_plane\_bus\_arn](#input\_eventpub\_control\_plane\_bus\_arn) | ARN of the EventBridge control plane bus for eventpub | `string` | `""` | no |
| <a name="input_eventpub_data_plane_bus_arn"></a> [eventpub\_data\_plane\_bus\_arn](#input\_eventpub\_data\_plane\_bus\_arn) | ARN of the EventBridge data plane bus for eventpub | `string` | `""` | no |
| <a name="input_force_destroy"></a> [force\_destroy](#input\_force\_destroy) | Flag to force deletion of S3 buckets | `bool` | `false` | no |
Expand All @@ -45,31 +49,31 @@ No requirements.
| Name | Source | Version |
|------|--------|---------|
| <a name="module_amendment_event_transformer"></a> [amendment\_event\_transformer](#module\_amendment\_event\_transformer) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_amendments_queue"></a> [amendments\_queue](#module\_amendments\_queue) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.24/terraform-sqs.zip | n/a |
| <a name="module_amendments_queue"></a> [amendments\_queue](#module\_amendments\_queue) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-sqs.zip | n/a |
| <a name="module_authorizer_lambda"></a> [authorizer\_lambda](#module\_authorizer\_lambda) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_ddb_alarms_letter_queue"></a> [ddb\_alarms\_letter\_queue](#module\_ddb\_alarms\_letter\_queue) | ../../modules/alarms-ddb | n/a |
| <a name="module_ddb_alarms_letters"></a> [ddb\_alarms\_letters](#module\_ddb\_alarms\_letters) | ../../modules/alarms-ddb | n/a |
| <a name="module_ddb_alarms_mi"></a> [ddb\_alarms\_mi](#module\_ddb\_alarms\_mi) | ../../modules/alarms-ddb | n/a |
| <a name="module_ddb_alarms_suppliers"></a> [ddb\_alarms\_suppliers](#module\_ddb\_alarms\_suppliers) | ../../modules/alarms-ddb | n/a |
| <a name="module_domain_truststore"></a> [domain\_truststore](#module\_domain\_truststore) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.4/terraform-s3bucket.zip | n/a |
| <a name="module_eventpub"></a> [eventpub](#module\_eventpub) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.4/terraform-eventpub.zip | n/a |
| <a name="module_domain_truststore"></a> [domain\_truststore](#module\_domain\_truststore) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-s3bucket.zip | n/a |
| <a name="module_eventpub"></a> [eventpub](#module\_eventpub) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-eventpub.zip | n/a |
| <a name="module_eventsub"></a> [eventsub](#module\_eventsub) | ../../modules/eventsub | n/a |
| <a name="module_get_letter"></a> [get\_letter](#module\_get\_letter) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_get_letter_data"></a> [get\_letter\_data](#module\_get\_letter\_data) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_get_letters"></a> [get\_letters](#module\_get\_letters) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_get_status"></a> [get\_status](#module\_get\_status) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_kms"></a> [kms](#module\_kms) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.26/terraform-kms.zip | n/a |
| <a name="module_lambda_alarms"></a> [lambda\_alarms](#module\_lambda\_alarms) | ../../modules/alarms-lambda | n/a |
| <a name="module_letter_status_updates_queue"></a> [letter\_status\_updates\_queue](#module\_letter\_status\_updates\_queue) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.24/terraform-sqs.zip | n/a |
| <a name="module_letter_status_updates_queue"></a> [letter\_status\_updates\_queue](#module\_letter\_status\_updates\_queue) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-sqs.zip | n/a |
| <a name="module_letter_updates_transformer"></a> [letter\_updates\_transformer](#module\_letter\_updates\_transformer) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_mi_updates_transformer"></a> [mi\_updates\_transformer](#module\_mi\_updates\_transformer) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.26/terraform-lambda.zip | n/a |
| <a name="module_patch_letter"></a> [patch\_letter](#module\_patch\_letter) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_post_letters"></a> [post\_letters](#module\_post\_letters) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_post_mi"></a> [post\_mi](#module\_post\_mi) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_s3bucket_test_letters"></a> [s3bucket\_test\_letters](#module\_s3bucket\_test\_letters) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.26/terraform-s3bucket.zip | n/a |
| <a name="module_sqs_alarms"></a> [sqs\_alarms](#module\_sqs\_alarms) | ../../modules/alarms-sqs | n/a |
| <a name="module_sqs_letter_updates"></a> [sqs\_letter\_updates](#module\_sqs\_letter\_updates) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.26/terraform-sqs.zip | n/a |
| <a name="module_sqs_supplier_allocator"></a> [sqs\_supplier\_allocator](#module\_sqs\_supplier\_allocator) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.26/terraform-sqs.zip | n/a |
| <a name="module_sqs_letter_updates"></a> [sqs\_letter\_updates](#module\_sqs\_letter\_updates) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-sqs.zip | n/a |
| <a name="module_sqs_supplier_allocator"></a> [sqs\_supplier\_allocator](#module\_sqs\_supplier\_allocator) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-sqs.zip | n/a |
| <a name="module_supplier_allocator"></a> [supplier\_allocator](#module\_supplier\_allocator) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
| <a name="module_supplier_ssl"></a> [supplier\_ssl](#module\_supplier\_ssl) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.26/terraform-ssl.zip | n/a |
| <a name="module_update_letter_queue"></a> [update\_letter\_queue](#module\_update\_letter\_queue) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/v2.0.29/terraform-lambda.zip | n/a |
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
module "domain_truststore" {
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.4/terraform-s3bucket.zip"
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-s3bucket.zip"

name = "truststore"
aws_account_id = var.aws_account_id
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Queue to transport letter status amendment messages
module "amendments_queue" {
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.5/terraform-sqs.zip"
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-sqs.zip"

name = "amendments_queue"

Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Queue to transport update letter status messages. Now replaced by module.amendments_queue.
# This queue will not be removed just yet, to allow it to be drained following the release in which module.amendments_queue replaces it.
module "letter_status_updates_queue" {
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.5/terraform-sqs.zip"
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-sqs.zip"

name = "letter_status_updates_queue"

Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
module "sqs_letter_updates" {
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.5/terraform-sqs.zip"
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-sqs.zip"

aws_account_id = var.aws_account_id
component = var.component
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
module "sqs_supplier_allocator" {
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.5/terraform-sqs.zip"
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-sqs.zip"

aws_account_id = var.aws_account_id
component = var.component
Expand Down
16 changes: 10 additions & 6 deletions infrastructure/terraform/components/api/modules_eventpub.tf
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
module "eventpub" {
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.4/terraform-eventpub.zip"
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-eventpub.zip"

name = "eventpub"

Expand All @@ -21,19 +21,23 @@ module "eventpub" {
event_cache_buffer_interval = 500
enable_sns_delivery_logging = var.enable_sns_delivery_logging
sns_success_logging_sample_percent = var.sns_success_logging_sample_percent
access_logging_bucket = local.acct.s3_buckets["access_logs"]["id"]

event_cache_expiry_days = 30
enable_event_cache = var.enable_event_cache

data_plane_bus_arn = var.eventpub_data_plane_bus_arn
control_plane_bus_arn = var.eventpub_control_plane_bus_arn

access_logging_bucket = local.acct.s3_buckets["access_logs"]["id"]
data_plane_bus_arn = var.eventpub_data_plane_bus_arn
control_plane_bus_arn = var.eventpub_control_plane_bus_arn

additional_policies_for_event_cache_bucket = [
data.aws_iam_policy_document.eventcache[0].json
]

enable_event_anomaly_detection = var.enable_event_anomaly_detection
event_anomaly_band_width = var.event_anomaly_band_width
event_anomaly_evaluation_periods = var.event_anomaly_evaluation_periods
event_anomaly_period = var.event_anomaly_period
}

data "aws_iam_policy_document" "eventcache" {
count = local.event_cache_bucket_name != null ? 1 : 0
statement {
Expand Down
24 changes: 24 additions & 0 deletions infrastructure/terraform/components/api/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -199,3 +199,27 @@ variable "enable_alarms" {
description = "Enable CloudWatch alarms for this deployed environment"
default = true
}

variable "enable_event_anomaly_detection" {
type = bool
description = "Enable CloudWatch anomaly detection alarm for SNS message Detects abnormal drops or spikes in event publishing volume."
default = true
}

variable "event_anomaly_evaluation_periods" {
type = number
description = "Number of evaluation periods for the anomaly alarm. Each period is defined by event_anomaly_period."
default = 3
}

variable "event_anomaly_period" {
type = number
description = "The period in seconds over which the specified statistic is applied for anomaly detection. Minimum 300 seconds (5 minutes). Recommended: 300-600."
default = 300
}

variable "event_anomaly_band_width" {
type = number
description = "The width of the anomaly detection band. Higher values (e.g. 4-6) reduce sensitivity and noise, lower values (e.g. 2-3) increase sensitivity. Recommended: 2-4."
default = 4
}
6 changes: 5 additions & 1 deletion infrastructure/terraform/modules/eventsub/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,10 +15,14 @@
| <a name="input_aws_account_id"></a> [aws\_account\_id](#input\_aws\_account\_id) | The AWS Account ID (numeric) | `string` | n/a | yes |
| <a name="input_component"></a> [component](#input\_component) | The name of the terraformscaffold component calling this module | `string` | n/a | yes |
| <a name="input_default_tags"></a> [default\_tags](#input\_default\_tags) | Default tag map for application to all taggable resources in the module | `map(string)` | `{}` | no |
| <a name="input_enable_event_anomaly_detection"></a> [enable\_event\_anomaly\_detection](#input\_enable\_event\_anomaly\_detection) | Enable CloudWatch anomaly detection alarm for SNS topic message publishing | `bool` | `true` | no |
| <a name="input_enable_event_cache"></a> [enable\_event\_cache](#input\_enable\_event\_cache) | Enable caching of events to an S3 bucket | `bool` | `true` | no |
| <a name="input_enable_firehose_raw_message_delivery"></a> [enable\_firehose\_raw\_message\_delivery](#input\_enable\_firehose\_raw\_message\_delivery) | Enables raw message delivery on firehose subscription | `bool` | `false` | no |
| <a name="input_enable_sns_delivery_logging"></a> [enable\_sns\_delivery\_logging](#input\_enable\_sns\_delivery\_logging) | Enable SNS Delivery Failure Notifications | `bool` | `true` | no |
| <a name="input_environment"></a> [environment](#input\_environment) | The name of the terraformscaffold environment the module is called for | `string` | n/a | yes |
| <a name="input_event_anomaly_band_width"></a> [event\_anomaly\_band\_width](#input\_event\_anomaly\_band\_width) | The width of the anomaly detection band. Higher values (e.g. 4-6) reduce sensitivity and noise, lower values (e.g. 2-3) increase sensitivity. Recommended: 2-4. | `number` | `3` | no |
| <a name="input_event_anomaly_evaluation_periods"></a> [event\_anomaly\_evaluation\_periods](#input\_event\_anomaly\_evaluation\_periods) | Number of evaluation periods for the anomaly alarm. Each period is defined by event\_anomaly\_period. | `number` | `2` | no |
| <a name="input_event_anomaly_period"></a> [event\_anomaly\_period](#input\_event\_anomaly\_period) | The period in seconds over which the specified statistic is applied for anomaly detection. Minimum 300 seconds (5 minutes). Recommended: 300-600. | `number` | `300` | no |
| <a name="input_event_cache_buffer_interval"></a> [event\_cache\_buffer\_interval](#input\_event\_cache\_buffer\_interval) | The buffer interval for data firehose | `number` | `500` | no |
| <a name="input_event_cache_expiry_days"></a> [event\_cache\_expiry\_days](#input\_event\_cache\_expiry\_days) | s3 archiving expiry in days | `number` | `30` | no |
| <a name="input_force_destroy"></a> [force\_destroy](#input\_force\_destroy) | When enabled will force destroy event-cache S3 bucket | `bool` | `false` | no |
Expand All @@ -36,7 +40,7 @@

| Name | Source | Version |
|------|--------|---------|
| <a name="module_s3bucket_event_cache"></a> [s3bucket\_event\_cache](#module\_s3bucket\_event\_cache) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.4/terraform-s3bucket.zip | n/a |
| <a name="module_s3bucket_event_cache"></a> [s3bucket\_event\_cache](#module\_s3bucket\_event\_cache) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-s3bucket.zip | n/a |
## Outputs

| Name | Description |
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
resource "aws_cloudwatch_metric_alarm" "subscriber_anomaly" {
count = var.enable_event_anomaly_detection ? 1 : 0

alarm_name = "${local.csi}-subscriber-anomaly"
alarm_description = "ANOMALY: Detects anomalous patterns in messages published to the SNS fanout topic"
comparison_operator = "LessThanLowerOrGreaterThanUpperThreshold"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About LessThanLower - do we want to alert on drops too? I think there'll be periods when not in use?

Copy link
Contributor Author

@aidenvaines-cgi aidenvaines-cgi Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe the way the threshold should work would be fine. we do want alerts when we get messages just unespectedly vanish. Something being updated somewhere and events just stopping is the case im most worried with

evaluation_periods = var.event_anomaly_evaluation_periods
threshold_metric_id = "ad1"
treat_missing_data = "notBreaching"

metric_query {
id = "m1"
return_data = true

metric {
metric_name = "NumberOfNotificationsDelivered"
namespace = "AWS/SNS"
period = var.event_anomaly_period
stat = "Sum"

dimensions = {
TopicName = aws_sns_topic.main.name
}
}
}

metric_query {
id = "ad1"
expression = "ANOMALY_DETECTION_BAND(m1, ${var.event_anomaly_band_width})"
label = "NumberOfNotificationsDelivered (expected)"
return_data = true
}

tags = merge(
var.default_tags,
{
Name = "${local.csi}-subscriber-anomaly"
}
)
}
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
module "s3bucket_event_cache" {
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.4/terraform-s3bucket.zip"
source = "https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.6/terraform-s3bucket.zip"

count = var.enable_event_cache ? 1 : 0

Expand Down
33 changes: 33 additions & 0 deletions infrastructure/terraform/modules/eventsub/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,39 @@ variable "sns_success_logging_sample_percent" {
default = 0
}

##
# CloudWatch Anomaly Detection Variables
##

variable "enable_event_anomaly_detection" {
type = bool
description = "Enable CloudWatch anomaly detection alarm for SNS topic message publishing"
default = true
}

variable "event_anomaly_evaluation_periods" {
type = number
description = "Number of evaluation periods for the anomaly alarm. Each period is defined by event_anomaly_period."
default = 2
}

variable "event_anomaly_period" {
type = number
description = "The period in seconds over which the specified statistic is applied for anomaly detection. Minimum 300 seconds (5 minutes). Recommended: 300-600."
default = 300
}

variable "event_anomaly_band_width" {
type = number
description = "The width of the anomaly detection band. Higher values (e.g. 4-6) reduce sensitivity and noise, lower values (e.g. 2-3) increase sensitivity. Recommended: 2-4."
default = 3

validation {
condition = var.event_anomaly_band_width >= 2 && var.event_anomaly_band_width <= 10
error_message = "Band width must be between 2 and 10"
}
Comment on lines +109 to +112
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed here? Is the anomaly threshold particularly important here? Again, I think 5 is already big

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe not especially, but ive tried to keep this consistent across the bounded contexts. probably more of a thing for the publisher really.
ageed on the 5 being a bit high

}

variable "log_level" {
type = string
description = "The log level to be used in lambda functions within the component. Any log with a lower severity than the configured value will not be logged: https://docs.python.org/3/library/logging.html#levels"
Expand Down
Loading
Loading