CloudWatch

Amazon Web Services
Security
Amazon CloudWatch is a monitoring service for AWS resources and the applications you run on AWS. It tracks performance and health metrics for resources such as EC2 instances and RDS databases, as well as custom metrics generated by your applications and services. With CloudWatch, you can collect and analyze metrics and set alarms on them. Alarms can trigger automated responses to issues such as a spike in error rates or a drop in available disk space. CloudWatch also provides detailed logs and events for troubleshooting issues and identifying trends. It is designed to be highly scalable and can collect, process and store millions of metrics per minute from various sources, which makes it a valuable tool for monitoring your AWS infrastructure and applications.
Terraform Name
aws_cloudwatch_metric_alarm
CloudWatch attributes:

The following arguments are supported:

  • alarm_name - (Required) The descriptive name for the alarm. This name must be unique within the user's AWS account.
  • comparison_operator - (Required) The arithmetic operation to use when comparing the specified Statistic and Threshold. The specified Statistic value is used as the first operand. One of the following is supported: GreaterThanOrEqualToThreshold, GreaterThanThreshold, LessThanThreshold, LessThanOrEqualToThreshold. Additionally, the values LessThanLowerOrGreaterThanUpperThreshold, LessThanLowerThreshold, and GreaterThanUpperThreshold are used only for alarms based on anomaly detection models.
  • evaluation_periods - (Required) The number of periods over which data is compared to the specified threshold.
  • metric_name - (Optional) The name for the alarm's associated metric. See docs for supported metrics.
  • namespace - (Optional) The namespace for the alarm's associated metric. See docs for the list of namespaces and supported metrics.
  • period - (Optional) The period in seconds over which the specified statistic is applied.
  • statistic - (Optional) The statistic to apply to the alarm's associated metric. One of the following is supported: SampleCount, Average, Sum, Minimum, Maximum.
  • threshold - (Optional) The value against which the specified statistic is compared. This parameter is required for alarms based on static thresholds, but should not be used for alarms based on anomaly detection models.
  • threshold_metric_id - (Optional) If this is an alarm based on an anomaly detection model, make this value match the ID of the ANOMALY_DETECTION_BAND function.
  • actions_enabled - (Optional) Indicates whether or not actions should be executed during any changes to the alarm's state. Defaults to true.
  • alarm_actions - (Optional) The list of actions to execute when this alarm transitions into an ALARM state from any other state. Each action is specified as an Amazon Resource Name (ARN).
  • alarm_description - (Optional) The description for the alarm.
  • datapoints_to_alarm - (Optional) The number of datapoints that must be breaching to trigger the alarm.
  • dimensions - (Optional) The dimensions for the alarm's associated metric. For the list of available dimensions, see the AWS documentation.
  • insufficient_data_actions - (Optional) The list of actions to execute when this alarm transitions into an INSUFFICIENT_DATA state from any other state. Each action is specified as an Amazon Resource Name (ARN).
  • ok_actions - (Optional) The list of actions to execute when this alarm transitions into an OK state from any other state. Each action is specified as an Amazon Resource Name (ARN).
  • unit - (Optional) The unit for the alarm's associated metric.
  • extended_statistic - (Optional) The percentile statistic for the metric associated with the alarm. Specify a value between p0.0 and p100.
  • treat_missing_data - (Optional) Sets how this alarm is to handle missing data points. The following values are supported: missing, ignore, breaching and notBreaching. Defaults to missing.
  • evaluate_low_sample_count_percentiles - (Optional) Used only for alarms based on percentiles. If you specify ignore, the alarm state will not change during periods with too few data points to be statistically significant. If you specify evaluate or omit this parameter, the alarm will always be evaluated and may change state no matter how many data points are available. The following values are supported: ignore and evaluate.
  • metric_query - (Optional) Enables you to create an alarm based on a metric math expression. You may specify at most 20 metric_query blocks.
  • tags - (Optional) A map of tags to assign to the resource. If configured with a provider default_tags configuration block present, tags with matching keys will overwrite those defined at the provider-level.

NOTE:

If you specify at least one metric_query, you may not specify a metric_name, namespace, period or statistic. If you do not specify a metric_query, you must specify each of these (although you may use extended_statistic instead of statistic).
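
To illustrate how several of the optional arguments above combine, here is a minimal sketch of a static-threshold alarm that notifies an SNS topic. The resource names, instance ID, tag values, and threshold are placeholders chosen for this example, not values from a real environment.

```hcl
# Hypothetical SNS topic that receives alarm notifications.
resource "aws_sns_topic" "alerts" {
  name = "example-alerts"
}

resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  alarm_name          = "example-cpu-high"
  alarm_description   = "Average CPU at or above 80% in 2 of the last 3 periods"
  comparison_operator = "GreaterThanOrEqualToThreshold"

  # No metric_query is used, so metric_name, namespace, period
  # and statistic are all specified.
  metric_name = "CPUUtilization"
  namespace   = "AWS/EC2"
  statistic   = "Average"
  period      = 300

  # "2 out of 3": alarm when 2 of the 3 most recent periods breach.
  evaluation_periods  = 3
  datapoints_to_alarm = 2
  threshold           = 80

  # Treat missing data points as within the threshold.
  treat_missing_data = "notBreaching"

  dimensions = {
    InstanceId = "i-12345678" # placeholder instance ID
  }

  # Actions are given as ARNs; here both transitions notify the same topic.
  alarm_actions = [aws_sns_topic.alerts.arn]
  ok_actions    = [aws_sns_topic.alerts.arn]

  tags = {
    Environment = "example"
  }
}
```

Setting datapoints_to_alarm lower than evaluation_periods is one way to tolerate a single noisy period without firing the alarm.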

Nested fields

metric_query

  • id - (Required) A short name used to tie this object to the results in the response. If you are performing math expressions on this set of data, this name represents that data and can serve as a variable in the mathematical expression. The valid characters are letters, numbers, and underscore. The first character must be a lowercase letter.
  • account_id - (Optional) The ID of the account where the metrics are located, if this is a cross-account alarm.
  • expression - (Optional) The math expression to be performed on the returned data, if this object is performing a math expression. This expression can use the id of the other metrics to refer to those metrics, and can also use the id of other expressions to use the result of those expressions. For more information about metric math expressions, see Metric Math Syntax and Functions in the Amazon CloudWatch User Guide.
  • label - (Optional) A human-readable label for this metric or expression. This is especially useful if this is an expression, so that you know what the value represents.
  • return_data - (Optional) Set this to true on exactly one metric_query to use that metric_query result as the alarm.
  • metric - (Optional) The metric to be returned, along with statistics, period, and units. Use this parameter only if this object is retrieving a metric and not performing a math expression on returned data.

NOTE:

You must specify either metric or expression, but not both.
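
The following sketch shows how metric_query blocks might fit together in a metric math alarm: two queries retrieve metrics, and a third combines them with an expression and is the only one that returns data. The alarm name, the load balancer dimension value, and the 10% threshold are assumptions made for illustration.

```hcl
resource "aws_cloudwatch_metric_alarm" "error_rate" {
  alarm_name          = "example-alb-error-rate"
  alarm_description   = "Request error rate has reached 10%"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 2
  threshold           = 10

  # Expression query: refers to the other queries by their id.
  # This is the one query whose result drives the alarm.
  metric_query {
    id          = "e1"
    expression  = "m2/m1*100"
    label       = "Error Rate"
    return_data = true
  }

  # Metric query: total requests (uses metric, not expression).
  metric_query {
    id = "m1"

    metric {
      metric_name = "RequestCount"
      namespace   = "AWS/ApplicationELB"
      period      = 120
      stat        = "Sum"
      unit        = "Count"

      dimensions = {
        LoadBalancer = "app/web" # placeholder load balancer
      }
    }
  }

  # Metric query: 5XX responses generated by the load balancer.
  metric_query {
    id = "m2"

    metric {
      metric_name = "HTTPCode_ELB_5XX_Count"
      namespace   = "AWS/ApplicationELB"
      period      = 120
      stat        = "Sum"
      unit        = "Count"

      dimensions = {
        LoadBalancer = "app/web" # placeholder load balancer
      }
    }
  }
}
```

Because metric_query is present, the top-level metric_name, namespace, period and statistic arguments are left out, as the earlier note requires.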

metric

  • dimensions - (Optional) The dimensions for this metric. For the list of available dimensions, see the AWS documentation.
  • metric_name - (Required) The name for this metric. See docs for supported metrics.
  • namespace - (Required) The namespace for this metric. See docs for the list of namespaces and supported metrics.
  • period - (Required) The period in seconds over which the specified stat is applied.
  • stat - (Required) The statistic to apply to this metric. See docs for supported statistics.
  • unit - (Optional) The unit for this metric.
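
For alarms based on anomaly detection, threshold is omitted and threshold_metric_id points at the query that holds the ANOMALY_DETECTION_BAND expression. The sketch below follows the pattern used in the Terraform provider's examples as best recalled here, in which both queries set return_data to true. The alarm name and instance ID are placeholders.

```hcl
resource "aws_cloudwatch_metric_alarm" "cpu_anomaly" {
  alarm_name          = "example-cpu-anomaly"
  alarm_description   = "CPU utilization is above the expected band"
  comparison_operator = "GreaterThanUpperThreshold"
  evaluation_periods  = 2

  # Matches the id of the ANOMALY_DETECTION_BAND query below.
  # No static threshold is set for this kind of alarm.
  threshold_metric_id = "e1"

  # Expression query that computes the expected band for m1.
  metric_query {
    id          = "e1"
    expression  = "ANOMALY_DETECTION_BAND(m1)"
    label       = "CPUUtilization (Expected)"
    return_data = true
  }

  # Metric query that retrieves the metric being modelled.
  metric_query {
    id          = "m1"
    return_data = true

    metric {
      metric_name = "CPUUtilization"
      namespace   = "AWS/EC2"
      period      = 120
      stat        = "Average"

      dimensions = {
        InstanceId = "i-12345678" # placeholder instance ID
      }
    }
  }
}
```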

Associating resources with a CloudWatch alarm
Resources do not "belong" to a CloudWatch alarm. Rather, an alarm is associated with a resource through the namespace, metric name, and dimensions of the metric it watches.
Create CloudWatch via Terraform:
The following HCL creates a CloudWatch Metric Alarm resource.
Syntax:

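resource "aws_cloudwatch_metric_alarm" "foobar" {
  alarm_name          = "terraform-test-foobar5"
  alarm_description   = "This metric monitors ec2 cpu utilization"
  comparison_operator = "GreaterThanOrEqualToThreshold"

  # Metric to watch: average EC2 CPU utilization over 120-second periods.
  metric_name = "CPUUtilization"
  namespace   = "AWS/EC2"
  period      = 120
  statistic   = "Average"

  # Compare the statistic against 80 over 2 evaluation periods.
  evaluation_periods = 2
  threshold          = 80

  # No actions when the alarm enters the INSUFFICIENT_DATA state.
  insufficient_data_actions = []
}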
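
As a variation on the alarm above, a percentile-based alarm replaces statistic with extended_statistic. This is a minimal sketch: the Lambda function name, the alarm name, and the 3000 ms threshold are hypothetical values picked for the example.

```hcl
resource "aws_cloudwatch_metric_alarm" "latency_p99" {
  alarm_name          = "example-lambda-duration-p99"
  alarm_description   = "p99 duration above 3 seconds"
  comparison_operator = "GreaterThanThreshold"

  metric_name = "Duration"
  namespace   = "AWS/Lambda"
  period      = 300

  # Percentile statistic, used instead of statistic.
  extended_statistic = "p99"

  # Do not change state during periods with too few data points.
  evaluate_low_sample_count_percentiles = "ignore"

  evaluation_periods = 3
  threshold          = 3000 # milliseconds

  dimensions = {
    FunctionName = "example-function" # placeholder function name
  }
}
```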

Create CloudWatch via CLI:
Parameters:

put-metric-alarm
--alarm-name <value>
[--alarm-description <value>]
[--actions-enabled | --no-actions-enabled]
[--ok-actions <value>]
[--alarm-actions <value>]
[--insufficient-data-actions <value>]
[--metric-name <value>]
[--namespace <value>]
[--statistic <value>]
[--extended-statistic <value>]
[--dimensions <value>]
[--period <value>]
[--unit <value>]
--evaluation-periods <value>
[--datapoints-to-alarm <value>]
[--threshold <value>]
--comparison-operator <value>
[--treat-missing-data <value>]
[--evaluate-low-sample-count-percentile <value>]
[--metrics <value>]
[--tags <value>]
[--threshold-metric-id <value>]
[--cli-input-json <value>]
[--generate-cli-skeleton <value>]
[--debug]
[--endpoint-url <value>]
[--no-verify-ssl]
[--no-paginate]
[--output <value>]
[--query <value>]
[--profile <value>]
[--region <value>]
[--version <value>]
[--color <value>]
[--no-sign-request]
[--ca-bundle <value>]
[--cli-read-timeout <value>]
[--cli-connect-timeout <value>]

Example:

aws cloudwatch put-metric-alarm \
  --alarm-name cpu-mon \
  --alarm-description "Alarm when CPU exceeds 70 percent" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --threshold 70 \
  --comparison-operator GreaterThanThreshold \
  --dimensions "Name=InstanceId,Value=i-12345678" \
  --evaluation-periods 2 \
  --alarm-actions arn:aws:sns:us-east-1:111122223333:MyTopic \
  --unit Percent

Costs
CloudWatch has a pay-as-you-go pricing model, where you only pay for what you use. The cost of CloudWatch depends on the number of metrics, alarms, and logs you collect and store. Some features, such as custom metrics and alarms, incur additional fees, while basic monitoring and alarms for most AWS services are included in the cost of those services.
Direct Cost

$3.00 per Dashboard per Month

$ per alarm metric month (standard resolution) - <Region>

$ per metric-month for the first 10,000 metrics - <Region>

$ per 1,000 metrics requested using GetMetricData API - <Region>

Best Practices for CloudWatch

Categorized by Availability, Security & Compliance and Cost

  • Low - Access allowed from VPN
  • Low - Auto Scaling Group not in use
  • Medium - Connections towards DynamoDB should be via VPC endpoints
  • Medium - Container in CrashLoopBackOff state
  • Low - EC2 with GPU capabilities
  • Medium - EC2 with high privileged policies
  • Medium - ECS cluster delete alarm
  • Critical - ECS task with Admin access (*:*)
  • Medium - ECS task with high privileged policies
  • Critical - EKS cluster delete alarm
  • Medium - ElastiCache cluster delete alarm
  • Medium - Ensure Container liveness probe is configured
  • Medium - Ensure ECS task definition has memory limit
  • Critical - Ensure EMR cluster master nodes are not publicly accessible