In this hands-on guide, we’ll review what AWS NAT Gateway is, when to use it, why its costs can be so high, and how to lower your AWS bill by following a couple of simple steps to optimize your AWS NAT Gateway.
AWS NAT Gateway is a Network Address Translation (NAT) service that enables instances in a private subnet of an Amazon Virtual Private Cloud (VPC) to connect to the internet or to other AWS services outside your VPC. Neither the internet nor those external services can initiate a connection with the instances: the NAT gateway allows outbound (egress) connections and blocks unsolicited inbound (ingress) connections.
AWS NAT Gateway is commonly used in the following scenarios:
There are two Availability Zones, with two subnets in each Availability Zone. The route table for each subnet determines how traffic is routed. In Availability Zone A, the instances in the public subnet can reach the internet through a route to the internet gateway, while the instances in the private subnet have no route to the internet. In Availability Zone B, the public subnet contains a NAT gateway, and the instances in the private subnet can reach the internet through a route to the NAT gateway in the public subnet. The NAT gateway sends the traffic to the internet gateway, using its Elastic IP address as the source IP address.
AWS NAT Gateway Pricing
The cost of an AWS NAT Gateway varies by region and is determined by 3 elements:
For additional information, see Amazon VPC Pricing.
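As a rough sketch of how such a bill adds up, the function below assumes the three elements are an hourly charge, a per-GB data processing charge, and a per-GB data transfer charge. That breakdown is an assumption, and every rate in the example is a placeholder rather than a real AWS price, so substitute the figures for your region from the pricing page.

```python
def estimate_nat_gateway_cost(hours, gb_processed, gb_transferred,
                              hourly_rate, processing_rate_per_gb,
                              transfer_rate_per_gb):
    """Split an estimated NAT gateway bill into its assumed cost elements."""
    hourly = hours * hourly_rate
    processing = gb_processed * processing_rate_per_gb
    transfer = gb_transferred * transfer_rate_per_gb
    return {
        "hourly": hourly,
        "processing": processing,
        "transfer": transfer,
        "total": hourly + processing + transfer,
    }


# Placeholder rates for illustration only; these are NOT actual AWS prices.
estimate = estimate_nat_gateway_cost(
    hours=720,               # one gateway running for a 30-day month
    gb_processed=1000,       # data that passed through the gateway
    gb_transferred=200,      # data that also incurred transfer charges
    hourly_rate=0.05,
    processing_rate_per_gb=0.05,
    transfer_rate_per_gb=0.10,
)
```

With these placeholder inputs the per-GB elements outweigh the hourly charge, which is why the rest of this guide focuses on traffic volume.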
The most effective way to reduce NAT Gateway costs is usually to manage data transfer rigorously. To lower these costs, you need to understand what kind of data is being transferred, from which source, and to which destination.
So, first, identify the main contributors of traffic through your NAT gateway.
To find the top contributors to traffic through the NAT gateway in your VPC, follow these steps:
Note:
In each of the following commands, replace x.x.x.x with the private IP of your NAT gateway. Replace y.y. with the first two octets of the VPC CIDR range.
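If you prefer to derive the y.y. placeholder rather than type it by hand, a minimal sketch using Python's standard ipaddress module could look like this. The function name vpc_prefix and the sample CIDR ranges are hypothetical.

```python
import ipaddress


def vpc_prefix(cidr):
    """Return the first two octets of an IPv4 VPC CIDR as a 'y.y.' prefix."""
    network = ipaddress.ip_network(cidr, strict=True)
    if network.version != 4:
        raise ValueError("expected an IPv4 CIDR range")
    if network.prefixlen < 16:
        # With a prefix shorter than /16 the second octet is not fixed.
        raise ValueError("first two octets are not fixed for this range")
    octets = str(network.network_address).split(".")
    return f"{octets[0]}.{octets[1]}."


# Hypothetical VPC CIDR ranges.
prefix_a = vpc_prefix("10.0.0.0/16")     # "10.0."
prefix_b = vpc_prefix("172.31.16.0/20")  # "172.31."
```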
First, confirm that VPC Flow Logs are turned on for your VPC or for the NAT gateway's elastic network interface. You can publish flow log data to Amazon CloudWatch Logs or Amazon Simple Storage Service (Amazon S3).
Now you can run the appropriate queries, either in CloudWatch Logs Insights or against an S3 bucket using Athena. We’ll cover both methods:
1. To query in CloudWatch Logs
1. Open the CloudWatch console.
2. In the navigation pane, choose Logs Insights.
3. From the dropdown list, select the log group for your NAT gateway.
4. To find the instances that are sending the most traffic through your NAT gateway, run the following query:
filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.')
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10
5. To find traffic going to and from the instances, run the following query:
filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.') or (srcAddr like 'x.x.x.x' and dstAddr like 'y.y.')
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10
6. To find the internet destinations that the instances in your VPC communicate with most often, run the following queries.
For uploads:
filter (srcAddr like 'x.x.x.x' and dstAddr not like 'y.y.')
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10
For downloads:
filter (dstAddr like 'x.x.x.x' and srcAddr not like 'y.y.')
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10
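To make the logic of the first query (step 4) concrete, here is a minimal offline sketch in Python that performs the same filter, sum, sort, and limit on raw flow log lines. It assumes the records use the default VPC Flow Logs field order; the sample lines, IP addresses, and function names are made up for illustration.

```python
from collections import Counter

# Field order of the default VPC Flow Logs record format.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()


def parse_record(line):
    """Map one space-separated flow log line onto the default field names."""
    return dict(zip(FIELDS, line.split()))


def top_talkers(lines, nat_ip, vpc_prefix, limit=10):
    """Sum bytes sent from VPC addresses to the NAT gateway, largest first."""
    totals = Counter()
    for line in lines:
        rec = parse_record(line)
        # Same condition as the query: destination is the NAT gateway and
        # the source is inside the VPC range.
        if rec.get("dstaddr") != nat_ip:
            continue
        if not rec.get("srcaddr", "").startswith(vpc_prefix):
            continue
        if not rec.get("bytes", "").isdigit():
            continue  # NODATA and SKIPDATA records carry '-' instead of numbers
        totals[(rec["srcaddr"], rec["dstaddr"])] += int(rec["bytes"])
    return totals.most_common(limit)


# Hypothetical sample records; 10.0.0.5 stands in for the NAT gateway's private IP.
SAMPLE_LINES = [
    "2 123456789012 eni-0abc 10.0.1.15 10.0.0.5 44321 443 6 10 5000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.1.15 10.0.0.5 44322 443 6 4 1500 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.2.20 10.0.0.5 51000 443 6 8 3000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 203.0.113.9 10.0.0.5 443 44321 6 20 90000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc - - - - - - - 1700000000 1700000060 - NODATA",
]

result = top_talkers(SAMPLE_LINES, nat_ip="10.0.0.5", vpc_prefix="10.0.")
```

The download from 203.0.113.9 is excluded because its source is outside the VPC range, which mirrors the srcAddr condition in the query.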
2. To query logs in an S3 bucket using Athena
Use either the Amazon VPC console or the Amazon Athena console to create a table. In this example, default is the database and vpc_flow_logs is the table.
1. To find the instances that are sending the most traffic through your NAT gateway, run the following query:
SELECT srcaddr,dstaddr,sum(bytes) FROM "default"."vpc_flow_logs"
WHERE srcaddr like 'y.y.%' AND dstaddr like 'x.x.x.x' group by 1,2 order by 3 desc
limit 10;
2. To find traffic going to and from the instances, run the following query:
SELECT srcaddr,dstaddr,sum(bytes) FROM "default"."vpc_flow_logs"
WHERE (srcaddr like 'y.y.%' AND dstaddr like 'x.x.x.x') or (srcaddr like 'x.x.x.x' AND dstaddr like 'y.y.%') group by 1,2 order by 3 desc
limit 10;
3. To find the internet destinations that the instances in your VPC communicate with most often, run the following queries.
For uploads:
SELECT srcaddr,dstaddr,sum(bytes) FROM "default"."vpc_flow_logs"
WHERE (srcaddr like 'x.x.x.x' AND dstaddr not like 'y.y.%') group by 1,2 order by 3 desc
limit 10;
For downloads:
SELECT srcaddr,dstaddr,sum(bytes) FROM "default"."vpc_flow_logs"
WHERE (srcaddr not like 'y.y.%' AND dstaddr like 'x.x.x.x') group by 1,2 order by 3 desc
limit 10;
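To try the shape of these queries without an Athena table, the sketch below runs the first one (top senders) against an in-memory SQLite database from Python. SQLite stands in for Athena here, so dialect details may differ; the table holds only three of the flow log columns, and the rows and the function name are hypothetical.

```python
import sqlite3

# Hypothetical rows: (srcaddr, dstaddr, bytes). 10.0.0.5 plays the NAT gateway.
SAMPLE_ROWS = [
    ("10.0.1.15", "10.0.0.5", 5000),
    ("10.0.1.15", "10.0.0.5", 1500),
    ("10.0.2.20", "10.0.0.5", 3000),
    ("10.0.0.5", "198.51.100.7", 6400),
    ("203.0.113.9", "10.0.0.5", 90000),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vpc_flow_logs (srcaddr TEXT, dstaddr TEXT, bytes INTEGER)"
)
conn.executemany("INSERT INTO vpc_flow_logs VALUES (?, ?, ?)", SAMPLE_ROWS)


def top_senders(connection, nat_ip, vpc_prefix, limit=10):
    """Return (srcaddr, dstaddr, total_bytes) for VPC hosts sending to the NAT."""
    query = """
        SELECT srcaddr, dstaddr, SUM(bytes) AS total_bytes
        FROM vpc_flow_logs
        WHERE srcaddr LIKE ? AND dstaddr = ?
        GROUP BY srcaddr, dstaddr
        ORDER BY total_bytes DESC
        LIMIT ?
    """
    # Bound parameters replace the x.x.x.x and y.y. placeholders.
    params = (vpc_prefix + "%", nat_ip, limit)
    return connection.execute(query, params).fetchall()


rows = top_senders(conn, nat_ip="10.0.0.5", vpc_prefix="10.0.")
```

Passing the NAT gateway IP and VPC prefix as bound parameters avoids editing the SQL text by hand for each environment.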
Here are some of the best practices to reduce your AWS NAT Gateway costs using various AWS services and tools:
With Stream.Security Cost, you can easily and fully understand your current cloud costs and trends across your cloud environments and accounts.
Stream.Security CloudTwin technology helps you get a complete picture of your cloud costs, with the full context of your real-time configuration, traffic flow, and even event logs.
Below, you can see a complete view of a NAT Gateway's costs, including the total cost, trend, and direct cost.
The following image shows the current-month cost breakdown for a specific NAT Gateway by usage cost.
Once you have a detailed view of your AWS NAT Gateway costs, including data transfer costs, you can investigate further using Stream.Security Network Traffic Activity logs. These give you a full picture of your NAT Gateway traffic volume, which source and destination caused an increase (or decrease), which application was involved, and more.
Stream.Security enriches VPC Flow Logs, allowing for the capture of information regarding IP traffic between network interfaces in your VPC. This information can include details such as source and destination IP addresses, port numbers, protocol, number of bytes and packets, and the flow's status (accepted or rejected).
This helps in monitoring the traffic passing through your NAT Gateway by gathering information on the network traffic flow in your VPC.
For more details on how to enable VPC Flow Logs logging, review the documentation on the AWS website.
The following image shows a diagram of cross-account traffic going through a NAT Gateway to the internet.
Now that you have a complete picture of your NAT Gateway costs, including your traffic volume, sources, destinations, and so on, you can take proactive steps to minimize costs. For example, you can temporarily reduce the number of instances generating traffic, or move instances that generate a high volume of traffic into the same Availability Zone as the NAT gateway to minimize cross-AZ data transfer costs.
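One way to shortlist candidates for the same-AZ placement mentioned above is to compare each heavy sender's Availability Zone with the NAT gateway's. The sketch below assumes you already have per-instance byte totals and an IP-to-AZ mapping; both dictionaries, the AZ names, and the function name are hypothetical.

```python
def cross_az_talkers(bytes_by_instance, instance_az, nat_az):
    """List (ip, bytes) for instances outside the NAT gateway's AZ, largest first.

    Instances missing from instance_az are also listed, since their
    placement is unknown.
    """
    flagged = [
        (ip, total)
        for ip, total in bytes_by_instance.items()
        if instance_az.get(ip) != nat_az
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical inputs for illustration.
bytes_by_instance = {"10.0.1.15": 6500, "10.0.2.20": 3000, "10.0.3.30": 12000}
instance_az = {
    "10.0.1.15": "us-east-1a",
    "10.0.2.20": "us-east-1b",
    "10.0.3.30": "us-east-1b",
}

candidates = cross_az_talkers(bytes_by_instance, instance_az, nat_az="us-east-1a")
```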
In conclusion, reducing the cost of AWS NAT Gateways is essential for optimizing the budget for your cloud infrastructure. AWS NAT Gateways play a crucial role in enabling communication between instances in private subnets and the internet, but the cost of using them can add up quickly.
In this hands-on guide, we’ve covered several best practices for reducing the cost of your AWS NAT Gateways. By following them, you can lower your NAT Gateway bill and optimize your cloud infrastructure budget.
Read Tal's previous blog posts in this series:
Reach out to Tal on LinkedIn if you'd like to suggest other topics, tips & tricks to reduce AWS cost.
Stream.Security delivers the only cloud detection and response solution that SecOps teams can trust. Born in the cloud, Stream’s Cloud Twin solution enables real-time cloud threat and exposure modeling to accelerate response in today’s highly dynamic cloud enterprise environments. By using the Stream Security platform, SecOps teams gain unparalleled visibility and can pinpoint exposures and threats by understanding the past, present, and future of their cloud infrastructure. The AI-assisted platform helps to determine attack paths and blast radius across all elements of the cloud infrastructure to eliminate gaps and accelerate MTTR by streamlining investigations and reducing knowledge gaps, while maximizing team productivity and limiting burnout.