Fixing AWS VPC Endpoint Routing Failures That Silently Break S3 Access

June 10, 2026 7 min read 30 views

Your application is deployed, your EC2 instance has internet access, but calls to S3 are timing out: no 403, no DNS error, just silence. This is one of the most frustrating failure modes in AWS networking because the symptom gives you almost nothing to work with.

VPC endpoint routing failures are silent by design. S3 doesn't know the packet never arrived, your application just waits, and CloudWatch shows nothing obviously wrong. This guide walks you through the full diagnostic and fix process.

What you'll learn

  • How Gateway VPC endpoints for S3 work at the routing level
  • Why a correctly configured endpoint can still silently drop traffic
  • How to inspect route tables, endpoint policies, and bucket policies together
  • Specific CLI commands to verify each layer of the configuration
  • The most common misconfigurations and how to resolve them

Prerequisites

You'll need the AWS CLI configured with sufficient IAM permissions (ec2:Describe*, s3:GetBucketPolicy, and read access to VPC resources). The examples below assume a Linux shell. Basic familiarity with VPC concepts (subnets, route tables, security groups) is assumed.

How Gateway Endpoints for S3 Actually Work

A Gateway VPC endpoint for S3 is not a network interface. It's a route table entry that redirects traffic destined for S3's IP ranges through AWS's internal network instead of out to the internet gateway. When you create the endpoint, AWS injects a route with a destination of the S3 managed prefix list and a target of vpce-xxxxxxxx.
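To see what that injected route actually points at, you can look up the S3 prefix list for a region. The following is a minimal sketch: the function name s3_prefix_list is hypothetical, and the default region is only an assumption carried over from the examples in this article.

```shell
# Hypothetical helper: print the ID and CIDR ranges of the AWS-managed
# S3 prefix list for a region (defaults to us-east-1 as an assumption).
s3_prefix_list() {
  local region="${1:-us-east-1}"
  aws ec2 describe-prefix-lists \
    --region "$region" \
    --filters "Name=prefix-list-name,Values=com.amazonaws.${region}.s3" \
    --query "PrefixLists[0].{ID:PrefixListId,Cidrs:Cidrs}" \
    --output json
}

# Usage:
#   s3_prefix_list us-east-1
```

The prefix list ID it prints (pl-...) is the destination you should later find in the route table next to the vpce- target.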

Because the mechanism is a route, anything that interferes with route evaluation will break S3 access, and it will break it quietly. The packet simply goes nowhere, or worse, gets routed out through the NAT gateway and fails for a different reason.

Interface endpoints (used for services like STS, ECR, or SSM) work differently: they create an ENI in your subnet and use DNS to redirect traffic. The troubleshooting approach differs enough that this article focuses on the Gateway endpoint pattern, which is the most common setup for S3.

Step 1: Confirm the Endpoint Exists and Is Available

Start with the obvious. List all VPC endpoints in the region and check their state:

aws ec2 describe-vpc-endpoints \
  --filters "Name=service-name,Values=com.amazonaws.us-east-1.s3" \
  --query "VpcEndpoints[*].{ID:VpcEndpointId,State:State,VpcId:VpcId,RouteTables:RouteTableIds}" \
  --output table

The State field should read available. If it reads pending for more than a few minutes, something went wrong during creation. If it reads deleted, someone removed it and it needs to be recreated. Note the VPC ID and the list of associated route table IDs; you'll need both shortly.

Step 2: Check Whether Your Subnet's Route Table Is Associated

This is the single most common source of silent failures. The endpoint exists, but the subnet your EC2 instance lives in uses a route table that was never associated with the endpoint.

Find your instance's subnet and its route table:

# Get the subnet ID for your instance
aws ec2 describe-instances \
  --instance-ids i-0123456789abcdef0 \
  --query "Reservations[0].Instances[0].SubnetId" \
  --output text

# Get the route table for that subnet
aws ec2 describe-route-tables \
  --filters "Name=association.subnet-id,Values=subnet-0abc123456789def0" \
  --query "RouteTables[*].RouteTableId" \
  --output text
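One caveat with the second command: a subnet that has no explicit route table association uses the VPC's main route table, and the association.subnet-id filter returns nothing for it. A rough sketch of a lookup that falls back to the main table is shown here; the function name route_table_for_subnet is hypothetical, and you need to pass the VPC ID from Step 1.

```shell
# Hypothetical helper: print the route table that applies to a subnet.
# Falls back to the VPC's main route table when the subnet has no
# explicit association (the CLI prints "None" for an empty result).
route_table_for_subnet() {
  local subnet_id="$1" vpc_id="$2" rtb
  rtb=$(aws ec2 describe-route-tables \
    --filters "Name=association.subnet-id,Values=${subnet_id}" \
    --query "RouteTables[0].RouteTableId" \
    --output text)
  if [ -z "$rtb" ] || [ "$rtb" = "None" ]; then
    rtb=$(aws ec2 describe-route-tables \
      --filters "Name=vpc-id,Values=${vpc_id}" "Name=association.main,Values=true" \
      --query "RouteTables[0].RouteTableId" \
      --output text)
  fi
  printf '%s\n' "$rtb"
}

# Usage:
#   route_table_for_subnet subnet-0abc123456789def0 vpc-0123456789abcdef0
```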

Now check whether that route table ID appears in the endpoint's route table list from Step 1. If it doesn't, that's your problem. Associate the route table with the endpoint:

aws ec2 modify-vpc-endpoint \
  --vpc-endpoint-id vpce-0123456789abcdef0 \
  --add-route-table-ids rtb-0abc123456789def0

After running this, re-describe the route table and look for a route with a target starting with vpce- and a destination matching the S3 prefix list. If you see it, the route is in place.

Step 3: Inspect the Routes in the Route Table

Even if the route table is associated, the route may be in a bad state. Inspect all routes in the relevant table:

aws ec2 describe-route-tables \
  --route-table-ids rtb-0abc123456789def0 \
  --query "RouteTables[0].Routes" \
  --output json

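Look for an entry where the GatewayId starts with vpce-. The State of that route should be active. If the state is blackhole, the route's target is no longer usable, typically because the endpoint was deleted or is not in a healthy state. Endpoint routes are managed through the endpoint rather than edited in the route table directly, so the fix is to recreate the endpoint (or re-associate the route table with it, as in Step 2) and then confirm the route returns to active.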
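If you would rather not read the JSON by eye, the same check can be sketched as a small function. The name endpoint_route_state is hypothetical, and the sketch only looks at the first vpce- route it finds; if the table also carries a DynamoDB gateway endpoint, compare the DestinationPrefixListId as well.

```shell
# Hypothetical helper: report the state of the first gateway-endpoint
# route (GatewayId starting with vpce-) in a route table.
endpoint_route_state() {
  local rtb_id="$1" state
  state=$(aws ec2 describe-route-tables \
    --route-table-ids "$rtb_id" \
    --query "RouteTables[0].Routes[?GatewayId && starts_with(GatewayId, 'vpce-')].State | [0]" \
    --output text)
  case "$state" in
    active)    echo "endpoint route present and active" ;;
    blackhole) echo "endpoint route is a blackhole" ;;
    *)         echo "no endpoint route found" ;;
  esac
}

# Usage:
#   endpoint_route_state rtb-0abc123456789def0
```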

Also confirm there is no more-specific route sending S3 traffic elsewhere. If someone added an explicit route for an S3 IP range to the internet gateway or a NAT gateway, that more-specific route will win over the prefix list route. Remove the conflicting entry if you find one.
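Spotting a conflicting route means checking whether a static route's CIDR falls inside one of the S3 prefix list ranges. Here is a minimal sketch of that containment test in plain shell arithmetic. Both function names are hypothetical, it handles IPv4 only, and it flags equal-length matches too on the assumption that they deserve a look.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeeds if CIDR $1 (for example, a static route destination) lies
# inside CIDR $2 (for example, one entry of the S3 prefix list).
cidr_within() {
  local inner_ip="${1%/*}" inner_len="${1#*/}"
  local outer_ip="${2%/*}" outer_len="${2#*/}"
  # A shorter prefix cannot be contained in a longer one.
  [ "$inner_len" -ge "$outer_len" ] || return 1
  local mask=$(( (0xFFFFFFFF << (32 - outer_len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$inner_ip") & mask )) -eq $(( $(ip_to_int "$outer_ip") & mask )) ]
}

# Usage (the CIDRs here are illustrative, not real prefix list entries):
#   cidr_within 52.216.10.0/24 52.216.0.0/15 && echo "possible conflict"
```

Run it for each static CIDR route in the table against each CIDR in the region's S3 prefix list. A 0.0.0.0/0 default route never matches, which is the intended result because it is less specific than the endpoint route.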

Step 4: Review the Endpoint Policy

Gateway endpoints support resource-based policies. By default the policy allows full access, but it's common for teams to tighten this down and accidentally block the principal or action they need.

Retrieve the current endpoint policy:

aws ec2 describe-vpc-endpoints \
  --vpc-endpoint-ids vpce-0123456789abcdef0 \
  --query "VpcEndpoints[0].PolicyDocument" \
  --output text | python3 -m json.tool

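Check two things. First, does the policy cover the IAM role or user your application assumes? AWS documents that Gateway endpoint policies take "*" as the Principal and narrow callers through condition keys such as aws:PrincipalArn, so read any Condition block as carefully as the Principal field. A policy that only allows one role will deny everything else, and unlike a routing failure, that denial comes back as a 403 AccessDenied that looks identical to an IAM or bucket policy denial, which makes the endpoint policy easy to overlook. Second, does the Action list include everything your application needs? A policy that only allows s3:GetObject will break any code that calls s3:ListBucket or s3:PutObject.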

A minimal open policy that restores default behavior looks like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "*"
    }
  ]
}

Apply it with:

aws ec2 modify-vpc-endpoint \
  --vpc-endpoint-id vpce-0123456789abcdef0 \
  --policy-document file://open-policy.json
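If a fully open policy is broader than you want to leave in place, a scoped variant is an option. The sketch below writes one to a file; the function name write_scoped_policy, the file name, the action list, and the bucket name your-bucket-name are all placeholders to adapt, not a recommended baseline.

```shell
# Hypothetical helper: write a scoped endpoint policy to a file
# (defaults to scoped-policy.json). Bucket name and actions are
# placeholders for whatever your application actually needs.
write_scoped_policy() {
  local out="${1:-scoped-policy.json}"
  cat > "$out" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
EOF
}

# Usage:
#   write_scoped_policy scoped-policy.json
#   aws ec2 modify-vpc-endpoint \
#     --vpc-endpoint-id vpce-0123456789abcdef0 \
#     --policy-document file://scoped-policy.json
```

Keep in mind that a scoped policy also blocks any other bucket reached through the endpoint, including AWS-owned buckets that some tooling depends on, so test after applying it.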

Step 5: Check the S3 Bucket Policy for VPC Conditions

A bucket policy can restrict access to requests that originate from a specific VPC or VPC endpoint. If someone added a Condition block using aws:SourceVpc or aws:SourceVpce and you're hitting the bucket from a different VPC or endpoint, every request will be denied.

aws s3api get-bucket-policy \
  --bucket your-bucket-name \
  --query Policy \
  --output text | python3 -m json.tool

Look for any Condition using StringEquals on aws:SourceVpce or aws:SourceVpc. If the value doesn't match your current endpoint ID or VPC ID, the bucket will reject requests with a 403. Fix the condition to reference the correct endpoint ID, or add your endpoint to the allowed list if multiple VPCs need access.
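A quick way to see which endpoints a bucket policy mentions is to pull every vpce- ID out of the policy text. This is a deliberately crude sketch with hypothetical function names: it matches text only, so it does not distinguish Allow from Deny or understand condition operators.

```shell
# Hypothetical helper: read policy JSON on stdin and print each
# distinct VPC endpoint ID it mentions, one per line.
policy_endpoint_ids() {
  grep -oE 'vpce-[0-9a-f]+' | sort -u
}

# Hypothetical helper: succeed if endpoint ID $1 appears in the
# policy JSON supplied on stdin.
policy_mentions_endpoint() {
  policy_endpoint_ids | grep -qx "$1"
}

# Usage:
#   aws s3api get-bucket-policy --bucket your-bucket-name \
#     --query Policy --output text | policy_endpoint_ids
```

If the endpoint ID from Step 1 is missing from the output while other IDs are present, the condition is the likely cause of the 403.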

Step 6: Validate with a Quick Test From the Instance

Once you've made changes, verify they work from inside the VPC rather than from your laptop. SSH into the affected instance and run:

# Check basic HTTPS reachability of S3 (a fast 403 still proves the network path works)
curl -v --max-time 10 https://s3.amazonaws.com/your-bucket-name/ 2>&1 | head -30

# Or use the AWS CLI directly
aws s3 ls s3://your-bucket-name/ --region us-east-1
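Because the failure mode here is a hang, it helps to bound the wait and separate a network timeout from a policy denial. The sketch below does that with the CLI's timeout options. The function name probe_s3 and the 5 and 10 second limits are arbitrary choices, and the error strings it matches are assumptions about the CLI's wording that may vary between versions.

```shell
# Hypothetical helper: classify an S3 listing attempt as
# ok / denied / timeout / error without waiting for default retries.
probe_s3() {
  local bucket="$1" region="${2:-us-east-1}" err
  if err=$(AWS_MAX_ATTEMPTS=1 aws s3 ls "s3://${bucket}/" \
        --region "$region" \
        --cli-connect-timeout 5 \
        --cli-read-timeout 10 \
        2>&1 >/dev/null); then
    echo "ok"
  elif printf '%s' "$err" | grep -q "AccessDenied"; then
    # The request reached S3: look at policies (Steps 4 and 5).
    echo "denied"
  elif printf '%s' "$err" | grep -qiE "timeout|timed out"; then
    # Nothing answered: look at routing (Steps 1 to 3).
    echo "timeout"
  else
    echo "error"
  fi
}

# Usage:
#   probe_s3 your-bucket-name us-east-1
```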

To confirm traffic is actually going through the endpoint rather than out to the internet, check the VPC Flow Logs for the subnet. Requests routed through a Gateway endpoint appear on the instance's network interface with a destination in the S3 prefix list, and with no matching flow on the NAT gateway's network interface. (An internet gateway has no network interface of its own, so it never appears in Flow Logs directly.)

If you don't have Flow Logs enabled, enable them temporarily on the subnet and reproduce the failure. The logs will tell you exactly where the traffic goes:

aws ec2 create-flow-logs \
  --resource-type Subnet \
  --resource-ids subnet-0abc123456789def0 \
  --traffic-type ALL \
  --log-destination-type cloud-watch-logs \
  --log-group-name /vpc/flowlogs/debug \
  --deliver-logs-permission-arn arn:aws:iam::123456789012:role/FlowLogsRole

Common Pitfalls

Multiple route tables in the same VPC

Large VPCs often have separate route tables for public subnets, private subnets, and isolated subnets. The endpoint association UI makes it easy to associate only the main route table. Audit all route tables in the VPC and associate every one that serves subnets running workloads that need S3.
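That audit can be sketched as a comparison between every route table in the VPC and the ones listed on the endpoint. The function name unassociated_route_tables is hypothetical; not every table it prints necessarily needs the endpoint, so treat the output as a review list.

```shell
# Hypothetical helper: print route tables in a VPC that are NOT
# associated with the given gateway endpoint.
unassociated_route_tables() {
  local vpc_id="$1" endpoint_id="$2" all assoc rtb
  all=$(aws ec2 describe-route-tables \
    --filters "Name=vpc-id,Values=${vpc_id}" \
    --query "RouteTables[*].RouteTableId" \
    --output text)
  assoc=$(aws ec2 describe-vpc-endpoints \
    --vpc-endpoint-ids "$endpoint_id" \
    --query "VpcEndpoints[0].RouteTableIds" \
    --output text)
  # Normalize tabs to single spaces and pad for whole-word matching.
  assoc=" $(printf '%s ' $assoc)"
  for rtb in $all; do
    case "$assoc" in
      *" $rtb "*) ;;
      *) echo "$rtb" ;;
    esac
  done
}

# Usage:
#   unassociated_route_tables vpc-0123456789abcdef0 vpce-0123456789abcdef0
```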

Endpoint and bucket in different regions

A Gateway endpoint in us-east-1 only covers S3 in us-east-1. If your application calls a bucket in eu-west-1, traffic won't go through the endpoint; it will try to reach the internet and fail if there's no internet gateway or NAT. A Gateway endpoint can only target S3 in the VPC's own region, so either give the cross-region traffic an explicit path (a NAT gateway or internet gateway), or keep the bucket and the workload in the same region.

DNS resolution returning public IPs

Unlike Interface endpoints, Gateway endpoints do not need private DNS. S3 DNS always returns public IPs, but the route table ensures those IPs are reachable via the endpoint without touching the internet. The instance still has to resolve S3 hostnames, though: if you've disabled DNS resolution on the VPC and don't run your own resolver, lookups fail before routing is ever involved. The DNS hostnames setting does not affect Gateway endpoints.

Restrictive endpoint policies inherited from a template

Infrastructure-as-code templates (Terraform modules, CloudFormation stacks) sometimes ship with a locked-down endpoint policy as a security default. If you've deployed an endpoint through a module without reading its defaults, the policy may be restricting access in ways the original developer didn't document clearly. Always print and read the current policy before assuming it's open.

Next Steps

After you've restored access, put these practices in place to avoid repeating the diagnosis next time:

  • Tag your endpoint with the VPCs and route tables it covers so future engineers can audit coverage without running CLI commands.
  • Enable VPC Flow Logs on your private subnets permanently and route them to S3 or CloudWatch Logs. The cost is low and the debugging value is high.
  • Add a custom AWS Config rule, optionally packaged in a conformance pack, that alerts when a subnet's route table is not associated with the S3 endpoint.
  • Document the endpoint policy in your infrastructure repo and add it to your code review checklist. Unexplained policy changes are a common source of these failures.
  • Test access from within the VPC as part of your deployment pipeline, not just from CI runners that sit outside it. A simple aws s3 ls call from a post-deploy Lambda or ECS task will catch routing failures before they reach production traffic.
