Why Your AWS NAT Gateway Bill Spikes Without Extra Traffic (And How to Fix It)

June 01, 2026

Your AWS bill arrived and the NAT Gateway line item doubled. You check your application metrics — request counts are flat, user traffic is unchanged. Nothing looks different. So why is the bill higher?

NAT Gateways charge you in two separate ways, and most engineers only think about one of them. Once you understand the billing model and know where to look, you can usually identify the source of a spike within an hour and reduce it significantly without touching your application logic.

What You'll Learn

  • How NAT Gateway pricing actually works (both dimensions)
  • The most common hidden sources of data processing charges
  • How cross-AZ traffic silently inflates your bill
  • How to use VPC Flow Logs and Cost Explorer to find the culprit
  • Concrete fixes you can apply today

How NAT Gateway Pricing Works

NAT Gateway billing has two components. The first is an hourly rate for each gateway you have running, regardless of whether it carries any traffic. That part is predictable. The second — and the one that causes surprise spikes — is a per-gigabyte data processing charge applied to every byte that passes through the gateway in either direction.

The critical detail is that this charge applies to all traffic processed, not just traffic that leaves your VPC to the internet. If a service in a private subnet makes a request to an S3 bucket and the traffic routes through the NAT Gateway instead of a VPC endpoint, you pay data processing fees on every byte of that response. For a service pulling large files or making frequent API calls, that adds up fast.
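To make the two billing dimensions concrete, here is a minimal Python sketch that splits a month's NAT Gateway cost into its hourly and data processing parts. The rates are assumptions for illustration, not quoted prices, and `NatRates` and `monthly_nat_cost` are hypothetical names. Check the current pricing page for your region before relying on the numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NatRates:
    # Assumed rates for illustration only; real rates vary by region and over time.
    hourly_usd: float = 0.045   # per gateway, per hour
    per_gb_usd: float = 0.045   # per GB processed, either direction


def monthly_nat_cost(gateways: int, gb_processed: float,
                     rates: NatRates = NatRates(), hours: int = 730) -> dict:
    """Split one month's NAT Gateway cost into its two billing components."""
    hourly = gateways * hours * rates.hourly_usd
    processing = gb_processed * rates.per_gb_usd
    return {
        "hourly": round(hourly, 2),
        "processing": round(processing, 2),
        "total": round(hourly + processing, 2),
    }


if __name__ == "__main__":
    # Same single gateway, but traffic through it grows from 500 GB to 5 TB.
    print(monthly_nat_cost(1, 500))
    print(monthly_nat_cost(1, 5000))
```

Under these assumed rates, the hourly component stays fixed while the total rises from roughly $55 to roughly $258. That is the pattern behind a bill that spikes with no change in gateway count.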

The Most Common Culprit: Internal AWS Traffic Routed Through NAT

Traffic to AWS's own services is responsible for a large share of unexpected NAT Gateway cost spikes. AWS services like S3, DynamoDB, CloudWatch, ECR, Secrets Manager, and SSM all have VPC endpoint options. If you haven't configured those endpoints, every call from a private subnet goes out through the NAT Gateway.

A common scenario: a container workload starts pulling images from ECR more often because deployments have become more frequent or autoscaling is launching more tasks. ECR image layers can be hundreds of megabytes. Each pull routes through NAT if no endpoint exists, and you pay data processing fees on every byte.

Check your current VPC endpoints with the AWS CLI:

aws ec2 describe-vpc-endpoints \
  --filters "Name=vpc-id,Values=vpc-xxxxxxxx" \
  --query 'VpcEndpoints[*].{Service:ServiceName,Type:VpcEndpointType,State:State}' \
  --output table

If the output has no entries for S3, ECR, or DynamoDB, calls to those services from your private subnets are going through your NAT Gateway.
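If you'd rather not eyeball the table, the same check can be scripted. The sketch below assumes you ran the command above with `--output json` instead of `--output table`, so the input is a list of objects with the `Service`, `Type`, and `State` keys defined by that `--query`. The `EXPECTED` list and the `missing_endpoints` helper are hypothetical names chosen for this example.

```python
import json

# Service suffixes to look for; adjust to the services your workloads call.
EXPECTED = ("s3", "dynamodb", "ecr.api", "ecr.dkr")


def missing_endpoints(cli_json: str, expected=EXPECTED) -> list:
    """Return the expected services that have no endpoint in the available state."""
    present = set()
    for row in json.loads(cli_json):
        if str(row.get("State", "")).lower() != "available":
            continue  # pending, deleting, or failed endpoints carry no traffic
        # Service names look like com.amazonaws.<region>.<service>
        parts = row["Service"].split(".", 3)
        if len(parts) == 4:
            present.add(parts[3])
    return [svc for svc in expected if svc not in present]


if __name__ == "__main__":
    # Made-up sample in the shape produced by the --query above.
    sample = """[
      {"Service": "com.amazonaws.us-east-1.s3", "Type": "Gateway", "State": "available"},
      {"Service": "com.amazonaws.us-east-1.ecr.api", "Type": "Interface", "State": "pending"}
    ]"""
    print(missing_endpoints(sample))
```

Anything the function returns is a service whose traffic from private subnets is probably still going through NAT.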

Cross-AZ Traffic: The Bill Multiplier Nobody Talks About

AWS charges for data that crosses Availability Zone boundaries. If you have a single NAT Gateway in us-east-1a and instances in us-east-1b and us-east-1c routing through it, traffic from those two zones incurs the cross-AZ data transfer charge on top of the NAT data processing charge on every byte. That surcharge scales with every gigabyte the other zones send through the gateway, so it compounds any growth in NAT traffic.

The fix is to deploy one NAT Gateway per AZ and update each subnet's route table to use the gateway in its own AZ. Yes, this means paying more hourly charges for the additional gateways, but for workloads with meaningful traffic volume, eliminating the cross-AZ transfer fees typically results in a lower total bill.
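Whether an extra gateway pays for itself depends on how much traffic currently crosses zones. Here is a rough break-even sketch. Both rates are assumptions for illustration, and the function names are hypothetical. Cross-AZ transfer can be billed in each direction, so treat the per-gigabyte figure as whatever combined rate applies to your account.

```python
def breakeven_gb_per_month(hourly_usd: float = 0.045,
                           cross_az_usd_per_gb: float = 0.01,
                           hours: int = 730) -> float:
    """GB/month of cross-AZ NAT traffic at which one extra gateway pays for itself.

    Both default rates are assumed values, not quoted prices.
    """
    return hours * hourly_usd / cross_az_usd_per_gb


def extra_gateway_pays_off(cross_az_gb: float, **rates) -> bool:
    """True when the cross-AZ fees avoided exceed the added hourly charge."""
    return cross_az_gb > breakeven_gb_per_month(**rates)


if __name__ == "__main__":
    print(round(breakeven_gb_per_month()))   # break-even volume in GB/month
    print(extra_gateway_pays_off(500))       # light traffic from the other AZ
    print(extra_gateway_pays_off(10_000))    # heavy traffic from the other AZ
```

Run it once per AZ that currently routes through a gateway in another zone. Below the break-even volume, the per-AZ layout is a resilience decision rather than a cost saving.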

Here's the routing pattern to aim for:

  • Private subnets in us-east-1a → NAT Gateway in us-east-1a
  • Private subnets in us-east-1b → NAT Gateway in us-east-1b
  • Private subnets in us-east-1c → NAT Gateway in us-east-1c

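You can check which route table a subnet uses and update it via Terraform, CloudFormation, or the console. Repointing a route to a different NAT Gateway drops connections that were established through the old one, so clients have to reconnect. Make the change during a quiet period.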

Finding the Source with VPC Flow Logs

Before you change anything, you need to know what traffic is actually flowing through your NAT Gateway. VPC Flow Logs are the right tool for this.

Enable flow logs on your NAT Gateway's subnet (or the entire VPC) and send them to CloudWatch Logs or S3. Once you have data, you can query it with CloudWatch Logs Insights or Athena.

A useful CloudWatch Logs Insights query to find top destination IPs by bytes transferred:

fields @timestamp, srcAddr, dstAddr, bytes, action
| filter action = "ACCEPT"
| stats sum(bytes) as totalBytes by dstAddr
| sort totalBytes desc
| limit 20
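If your flow logs land in S3 rather than CloudWatch Logs, the same aggregation can be sketched in a few lines of Python. This assumes records in the default (version 2) flow log format, whose fields are listed in `FIELDS` below. Custom formats need a different field list. The `top_destinations` helper is a hypothetical name, and the sample records are made up.

```python
from collections import Counter

# Field order of the default (version 2) VPC Flow Log record format.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()


def top_destinations(lines, limit=20):
    """Sum accepted bytes per destination address, largest first."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # blank line or a custom-format record
        rec = dict(zip(FIELDS, parts))
        # Skips the header row and NODATA/SKIPDATA records, which use "-".
        if rec["action"] != "ACCEPT" or not rec["bytes"].isdigit():
            continue
        totals[rec["dstaddr"]] += int(rec["bytes"])
    return totals.most_common(limit)


if __name__ == "__main__":
    # Made-up records using documentation IP ranges.
    sample = [
        "2 123456789012 eni-0abc 10.0.1.5 198.51.100.7 44321 443 6 10 5000 1700000000 1700000060 ACCEPT OK",
        "2 123456789012 eni-0abc 10.0.1.6 198.51.100.7 44322 443 6 4 1500 1700000000 1700000060 ACCEPT OK",
        "2 123456789012 eni-0abc 10.0.1.5 203.0.113.9 44323 443 6 2 300 1700000000 1700000060 REJECT OK",
        "2 123456789012 eni-0abc - - - - - - - 1700000000 1700000060 - NODATA",
    ]
    for dst, total in top_destinations(sample):
        print(dst, total)
```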

Look at the top destinations. If you see IP ranges belonging to AWS services (you can look these up in the AWS IP ranges JSON file), those are candidates for VPC endpoint migration. If you see unexpected external IPs with high byte counts, you may have a service making calls you didn't know about — a telemetry agent, a license check, or a dependency phoning home.
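Looking up each address by hand gets tedious, so here is a sketch that labels destination IPs using the AWS IP ranges file. It assumes you have already downloaded that JSON and pass its contents in as a string. It relies on the file's `prefixes` list and the `ip_prefix`, `service`, and `region` keys. The sample data uses documentation address ranges, not real AWS prefixes, and `build_index` and `classify` are hypothetical helper names.

```python
import ipaddress
import json


def build_index(ip_ranges_json: str):
    """Parse the IPv4 prefixes from the AWS IP ranges JSON into network objects."""
    data = json.loads(ip_ranges_json)
    return [(ipaddress.ip_network(p["ip_prefix"]), p["service"], p["region"])
            for p in data["prefixes"]]


def classify(ip: str, index):
    """Return (service, region) for the best-matching prefix, or None if not AWS."""
    addr = ipaddress.ip_address(ip)
    matches = [m for m in index if addr in m[0]]
    if not matches:
        return None
    # Prefer the most specific prefix, and a named service over the
    # catch-all "AMAZON" entry that often lists the same range.
    _, service, region = max(matches,
                             key=lambda m: (m[0].prefixlen, m[1] != "AMAZON"))
    return service, region


if __name__ == "__main__":
    # Made-up sample in the file's shape; these are NOT real AWS ranges.
    sample = json.dumps({"prefixes": [
        {"ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "AMAZON"},
        {"ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "S3"},
        {"ip_prefix": "203.0.113.0/24", "region": "us-east-1", "service": "AMAZON"},
    ]})
    index = build_index(sample)
    for ip in ("198.51.100.7", "203.0.113.9", "192.0.2.1"):
        print(ip, classify(ip, index))
```

Destinations that resolve to a named service are the endpoint candidates. Destinations that return `None` are the external hosts worth a closer look.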

Using Cost Explorer to Narrow Down the Time Window

Before diving into flow logs, use Cost Explorer to identify when the spike started. Filter by service (EC2 - Other is where NAT Gateway charges appear), set the granularity to daily, and look for the inflection point.

Cross-reference that date with your deployment history. Did a new service launch? Did autoscaling add capacity? Did a cron job start running more frequently? Often the cost spike correlates directly with a specific change, and finding that change tells you which workload to investigate first.
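If you export the daily figures from Cost Explorer, the inflection point can also be found programmatically. This is one simple heuristic, not the only reasonable one: flag the first day whose cost exceeds a multiple of the median of the preceding week. The `spike_start` name, the `factor` threshold, and the sample numbers are all assumptions for illustration.

```python
from statistics import median


def spike_start(daily_costs, factor=1.5, baseline_days=7):
    """Return the first date whose cost exceeds factor x the trailing median.

    daily_costs is a list of (date_string, cost) tuples in date order.
    Returns None when no day crosses the threshold.
    """
    for i in range(baseline_days, len(daily_costs)):
        window = [cost for _, cost in daily_costs[i - baseline_days:i]]
        baseline = median(window)
        date, cost = daily_costs[i]
        if baseline > 0 and cost > factor * baseline:
            return date
    return None


if __name__ == "__main__":
    # Made-up daily costs: flat for eight days, then a jump.
    costs = [10.0, 10.5, 9.8, 10.1, 10.0, 9.9, 10.2, 10.2, 18.0, 19.0]
    series = [(f"2026-05-{day:02d}", c) for day, c in enumerate(costs, start=1)]
    print(spike_start(series))
```

A median baseline keeps a single noisy day from shifting the threshold. Tune `factor` to how variable your normal spend is.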

You can also break down the EC2-Other costs by usage type. Look for entries containing NatGateway-Bytes — these are the data processing charges. If they're growing faster than your NatGateway-Hours charges, you have a data volume problem, not a gateway count problem.
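That comparison is easy to script once you have costs grouped by usage type for two periods. The sketch below assumes a plain mapping of usage type to cost for each period, exported however you prefer. It matches on the `NatGateway-Bytes` and `NatGateway-Hours` substrings so region-prefixed usage types are included. `nat_growth` is a hypothetical helper and the sample figures are made up.

```python
def nat_growth(before: dict, after: dict) -> dict:
    """Compare growth of NAT data processing and hourly charges across two periods.

    Each argument maps usage type -> cost. Returns the after/before ratio per
    component, or None when the earlier period had no such charge.
    """
    def total(rows, marker):
        return sum(cost for usage_type, cost in rows.items()
                   if marker in usage_type)

    ratios = {}
    for label, marker in (("bytes", "NatGateway-Bytes"),
                          ("hours", "NatGateway-Hours")):
        earlier, later = total(before, marker), total(after, marker)
        ratios[label] = (later / earlier) if earlier else None
    return ratios


if __name__ == "__main__":
    # Made-up monthly costs by usage type.
    last_month = {"NatGateway-Bytes": 30.0, "USW2-NatGateway-Bytes": 10.0,
                  "NatGateway-Hours": 32.0, "EBS:VolumeUsage.gp3": 12.0}
    this_month = {"NatGateway-Bytes": 100.0, "USW2-NatGateway-Bytes": 20.0,
                  "NatGateway-Hours": 32.0, "EBS:VolumeUsage.gp3": 12.0}
    print(nat_growth(last_month, this_month))
```

A bytes ratio well above the hours ratio points at data volume rather than gateway count.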

Setting Up VPC Endpoints to Eliminate NAT Traffic

Gateway endpoints for S3 and DynamoDB are free. Interface endpoints for other services cost an hourly fee per AZ plus their own per-gigabyte data processing charge, but that per-gigabyte rate is typically well below the NAT rate, and traffic that stays within its own AZ avoids cross-AZ fees. That usually makes them cheaper at any meaningful scale.
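To put a number on "meaningful scale" for interface endpoints, here is a rough break-even sketch. All three rates are assumed values for illustration, and `endpoint_breakeven_gb` is a hypothetical helper. It ignores cross-AZ savings, so it errs on the conservative side.

```python
def endpoint_breakeven_gb(azs: int,
                          endpoint_hourly_usd: float = 0.01,
                          endpoint_per_gb_usd: float = 0.01,
                          nat_per_gb_usd: float = 0.045,
                          hours: int = 730) -> float:
    """GB/month to one service at which an interface endpoint beats NAT.

    All default rates are assumptions, not quoted prices.
    """
    fixed_cost = azs * hours * endpoint_hourly_usd
    saving_per_gb = nat_per_gb_usd - endpoint_per_gb_usd
    if saving_per_gb <= 0:
        raise ValueError("endpoint per-GB rate must be below the NAT rate")
    return fixed_cost / saving_per_gb


if __name__ == "__main__":
    # One endpoint deployed in three AZs, under the assumed rates.
    print(round(endpoint_breakeven_gb(3)))
```

Services that move less than the break-even volume each month may be cheaper left on NAT, which is why the flow log data is worth gathering before you create an endpoint for every service.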

To create a gateway endpoint for S3 using the AWS CLI:

aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxxxxxx \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-xxxxxxxx rtb-yyyyyyyy \
  --vpc-endpoint-type Gateway

For ECR (which uses two endpoints), you need both com.amazonaws.REGION.ecr.api and com.amazonaws.REGION.ecr.dkr. ECR also requires an S3 gateway endpoint for image layer storage, so set that up first.

After creating endpoints, re-run your flow log query. Traffic to those AWS service IP ranges should drop significantly or disappear from your NAT Gateway logs within minutes.

Common Pitfalls When Diagnosing NAT Costs

Flow logs have a delay

VPC Flow Logs don't appear in real time. Depending on your configuration, logs may be 10 to 15 minutes behind. Don't assume an endpoint change had no effect just because you don't see it immediately in the logs.

Costs appear under EC2, not NAT Gateway

In Cost Explorer and on your bill, NAT Gateway charges show up under the EC2 service category, not a separate NAT Gateway category. Filter by usage type containing
