How to Debug AWS Fargate Containers with ECS Exec?
You moved to Fargate. No more SSH. No more docker exec. Your container is failing and you can't get inside. ECS Exec, AWS's answer to docker exec for Fargate, has been available since 2021. This guide covers setup, the 5 errors that catch everyone (most of them IAM or configuration), and the commands that work.
- ECS Exec uses SSM Session Manager: it bind-mounts an agent into your container, no sidecar needed
- Requires 3 things: --enable-execute-command on the service, an IAM task role with SSM permissions, and the SSM Session Manager plugin on your local CLI
- The #1 failure point is IAM: the task role needs the ssmmessages permissions; ecs:ExecuteCommand on the caller is not enough
- 20-minute idle timeout, 1 session per container, root user: know the limits before you rely on it in production
- CloudTrail logs every ExecuteCommand call. S3 and CloudWatch can capture command output for compliance
Why ECS Exec exists — the Fargate debugging gap
Fargate has no hosts to SSH into. ECS Exec bind-mounts the SSM agent at runtime, giving you an interactive shell without ports, keys, or changing your task definition. Launched in March 2021, it supports both Linux and Windows containers.
Before ECS Exec (launched March 2021), debugging a Fargate container meant you couldn't get a shell at all — there are no EC2 instances to SSH into. Fargate runs your tasks on AWS-managed infrastructure, which has real implications for operating Fargate at scale. ECS Exec was the #1 most requested feature on the AWS Containers Roadmap for good reason.
The 5 errors that catch everyone
Every team hits these. The error messages are cryptic but the fixes are specific — missing enable flag, wrong IAM role, missing SSM plugin, no VPC route, or read-only root filesystem. Each has a one-line resolution.
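Before working through the fixes, it may help to confirm which condition applies. The sketch below is one possible check, assuming the AWS CLI is configured for the account; check_exec_ready is a hypothetical helper name, not part of any AWS tooling. It reads two fields from describe-tasks: the task-level enableExecuteCommand flag and the lastStatus of the ExecuteCommandAgent managed agent on the first container.

```shell
# check_exec_ready CLUSTER TASK
# Hypothetical helper: prints "ready", "flag-missing", or "agent-not-running".
check_exec_ready() {
  cluster="$1"
  task="$2"

  # Task-level flag, lowercased so the comparison does not depend on
  # how the CLI renders booleans in text output.
  enabled=$(aws ecs describe-tasks --cluster "$cluster" --tasks "$task" \
    --query 'tasks[0].enableExecuteCommand' --output text |
    tr '[:upper:]' '[:lower:]')

  # Status of the exec agent on the first container in the task.
  agent=$(aws ecs describe-tasks --cluster "$cluster" --tasks "$task" \
    --query "tasks[0].containers[0].managedAgents[?name=='ExecuteCommandAgent'].lastStatus | [0]" \
    --output text)

  if [ "$enabled" != "true" ]; then
    echo "flag-missing"
  elif [ "$agent" != "RUNNING" ]; then
    echo "agent-not-running"
  else
    echo "ready"
  fi
}

# Example: check_exec_ready your-cluster YOUR_TASK_ID
```

A result of flag-missing points at the enable flag; agent-not-running usually points at the task role, the network route, or the read-only filesystem. The sketch only inspects the first container, so multi-container tasks would need a loop.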
1. Missing enable flag

# Update the service to enable ECS Exec
aws ecs update-service \
  --cluster your-cluster \
  --service your-service \
  --enable-execute-command \
  --force-new-deployment

# Or for a standalone task:
aws ecs run-task \
  --cluster your-cluster \
  --task-definition your-task \
  --enable-execute-command

2. Wrong IAM role (the task role is missing the ssmmessages permissions)

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "ssmmessages:CreateControlChannel",
      "ssmmessages:CreateDataChannel",
      "ssmmessages:OpenControlChannel",
      "ssmmessages:OpenDataChannel"
    ],
    "Resource": "*"
  }]
}

3. Missing Session Manager plugin

# macOS
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/mac/sessionmanager-bundle.zip" -o "session.zip" && \
  unzip session.zip && sudo ./sessionmanager-bundle/install -i /usr/local/sessionmanagerplugin -b /usr/local/bin/session-manager-plugin

# Linux
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/linux_64bit/session-manager-plugin.rpm" -o "plugin.rpm" && \
  sudo yum install -y plugin.rpm

4. No VPC route to SSM

# Option A: Add NAT Gateway to route traffic to internet
# Option B: Create VPC endpoints for SSM (recommended for private subnets)
# ssmmessages is an interface endpoint, so the type must be set explicitly
# (the CLI defaults to Gateway). Replace "region" with your region.
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxx \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.region.ssmmessages \
  --subnet-ids subnet-xxx

5. Read-only root filesystem

# In your task definition, set:
"linuxParameters": {
  "initProcessEnabled": true
}
# And remove or set to false:
"readonlyRootFilesystem": false

The happy path: step by step
Step 1: Install the Session Manager plugin

# macOS
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/mac/sessionmanager-bundle.zip" -o "session.zip"
unzip session.zip
sudo ./sessionmanager-bundle/install -i /usr/local/sessionmanagerplugin -b /usr/local/bin/session-manager-plugin
# Verify
session-manager-plugin --version

Step 2: Add the SSM permissions to the task role

This is the policy the container needs to call SSM. Attach it to your ECS task role (NOT the execution role, which is for pulling images and writing logs).

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    }
  ]
}

Step 3: Enable ECS Exec on the service

aws ecs update-service \
  --cluster my-cluster \
  --service my-service \
  --enable-execute-command \
  --force-new-deployment

Step 4: Verify that the agent is running

aws ecs describe-tasks \
  --cluster my-cluster \
  --tasks $(aws ecs list-tasks --cluster my-cluster --service-name my-service --query 'taskArns[0]' --output text)
# Look for:
# "enableExecuteCommand": true
# "lastStatus": "RUNNING" under ExecuteCommandAgent

Step 5: Run commands

# Interactive shell
aws ecs execute-command \
  --cluster my-cluster \
  --task YOUR_TASK_ID \
  --container nginx \
  --command "/bin/bash" \
  --interactive

# Single command (pipes need a shell, so wrap the command in sh -c)
aws ecs execute-command \
  --cluster my-cluster \
  --task YOUR_TASK_ID \
  --container nginx \
  --command "sh -c 'env | grep DATABASE'" \
  --interactive

Production setup: logging, audit, security
Three layers control ECS Exec in production: S3/CloudWatch logging captures every command, CloudTrail audits who ran what, and IAM conditions restrict exec by container name, cluster, and tags — including denying production access entirely.
ECS Exec is powerful — and you need controls around it in production. Three layers: logging (what commands ran), auditing (who ran them), and access control (who CAN run them). If you enable CloudWatch logging, set a retention policy — unbounded CloudWatch log groups accumulate real cost on active clusters.
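Setting retention is a single CLI call. The following is a minimal sketch, assuming the exec log group is named /aws/ecs/my-cluster-exec as in the cluster configuration in this guide; set_exec_log_retention is a hypothetical wrapper name and the 30-day default is only an illustration. CloudWatch Logs accepts a fixed set of retention values (for example 7, 14, 30, 90), so pick one of those.

```shell
# set_exec_log_retention LOG_GROUP [DAYS]
# Hypothetical wrapper around "aws logs put-retention-policy".
# DAYS defaults to 30 (an illustrative value, not a recommendation).
set_exec_log_retention() {
  log_group="$1"
  days="${2:-30}"
  aws logs put-retention-policy \
    --log-group-name "$log_group" \
    --retention-in-days "$days"
}

# Example: set_exec_log_retention /aws/ecs/my-cluster-exec 30
```

The log group has to exist before the call succeeds, so run it after the first exec session has written output or after creating the group yourself.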
Configure at the cluster level. Two destinations: S3 for durable retention, CloudWatch for real-time search. CloudTrail separately logs the ExecuteCommand API call (who and when). Together they give you full visibility: CloudTrail = who executed. S3/CloudWatch = what they ran.
# The whole --configuration value is passed as JSON so the nested keys parse cleanly
aws ecs update-cluster \
  --cluster my-cluster \
  --configuration '{
    "executeCommandConfiguration": {
      "logging": "OVERRIDE",
      "logConfiguration": {
        "cloudWatchLogGroupName": "/aws/ecs/my-cluster-exec",
        "s3BucketName": "my-exec-logs",
        "s3KeyPrefix": "exec-output"
      }
    }
  }'

Use IAM condition keys on ecs:ExecuteCommand. This policy allows exec only on tasks tagged environment=development in a specific cluster. Production tasks are not matched by this Allow, so exec on them is refused unless another policy grants it.
{
  "Effect": "Allow",
  "Action": "ecs:ExecuteCommand",
  "Resource": [
    "arn:aws:ecs:us-east-1:123456789012:cluster/my-cluster",
    "arn:aws:ecs:us-east-1:123456789012:task/my-cluster/*"
  ],
  "Condition": {
    "StringEquals": {
      "ecs:ResourceTag/environment": "development"
    }
  }
}

Add a Deny policy that blocks exec on any container named production-app, regardless of IAM role. This is the safety net: even if someone tags a task wrong, the container name catches it.
{
  "Effect": "Deny",
  "Action": "ecs:ExecuteCommand",
  "Resource": "*",
  "Condition": {
    "StringEquals": {
      "ecs:container-name": "production-app"
    }
  }
}

What ECS Exec can't do
Seven hard limits: 20-minute idle timeout, one session per PID namespace, must be enabled at launch, read-only root FS blocks the agent, commands run as root, no AWS Console support, only tools in the image are available.
"ECS Exec sessions drop after 20 minutes of idle time — this timeout is not configurable. Only one session per container PID namespace is supported, and sessions always run as root regardless of the container USER directive."
— AWS ECS Exec documentation, verified June 2026
FAQ
If you read this, you might also want to know
Can I use ECS Exec on EC2 launch type?
Yes. It works the same way on EC2 and Fargate. On EC2, the instance needs a recent ECS container agent, which the latest ECS-optimized AMI includes. If you're using a custom AMI, you need to keep the agent up to date yourself. The session-manager-plugin goes on the machine where you run the AWS CLI, not on the instance. The IAM setup is the same for both launch types.
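On EC2 with a custom AMI, a quick way to rule out an outdated agent is to compare its version against the minimum ECS Exec needs. The sketch below assumes that minimum is 1.50.2, which is my recollection of the AWS documentation and should be confirmed there; version_ge is a hypothetical helper and relies on sort -V (GNU coreutils and recent macOS).

```shell
# version_ge A B: succeeds when dotted version A is >= version B.
# Hypothetical helper; sorts the pair and checks that B comes first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum agent version for ECS Exec; confirm against the AWS docs.
MIN_AGENT_VERSION="1.50.2"

# Example usage (INSTANCE_ARN is a placeholder for a container instance ARN):
#   v=$(aws ecs describe-container-instances --cluster my-cluster \
#         --container-instances "$INSTANCE_ARN" \
#         --query 'containerInstances[0].versionInfo.agentVersion' --output text)
#   version_ge "$v" "$MIN_AGENT_VERSION" || echo "agent $v is too old for ECS Exec"
```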
How do I log ECS Exec sessions for compliance?
Three pieces: (1) CloudTrail captures the ExecuteCommand API call: who, when, which task; (2) S3 or CloudWatch Logs capture command output, configured at cluster level with logging=OVERRIDE; (3) KMS encryption for the data channel, enabled by adding kmsKeyId to your executeCommandConfiguration. Together they provide the kind of evidence a SOC 2 audit typically asks for; confirm the exact requirements with your auditor.
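For the CloudTrail piece, recent exec calls can be pulled straight from the event history. This is a rough sketch, assuming the caller has cloudtrail:LookupEvents and that the calls fall inside the event history window; recent_exec_calls is a hypothetical helper name.

```shell
# recent_exec_calls [MAX]
# Hypothetical helper: prints the time and caller of recent
# ExecuteCommand API calls recorded in CloudTrail event history.
recent_exec_calls() {
  max="${1:-20}"
  aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=ExecuteCommand \
    --max-results "$max" \
    --query 'Events[].[EventTime,Username]' \
    --output text
}

# Example: recent_exec_calls 10
```

This answers who and when. The commands themselves still come from the S3 or CloudWatch logs configured on the cluster.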
Is there a way to debug without ECS Exec — like a sidecar approach?
Some teams run a debug sidecar container (e.g. alpine with curl/netcat) in the same task for diagnostics. This works without enabling ECS Exec but requires modifying your task definition. ECS Exec is simpler because it doesn't change your task definition — it's a runtime feature, not a deployment change.
Does ECS Exec work with ECS Anywhere?
Yes — ECS Exec is supported on external instances (ECS Anywhere) for both Linux and Windows containers. The SSM agent must be installed on the external instance alongside the ECS agent. The IAM and setup requirements are the same.
Operating the fleet is the rest.
ECS Exec solves one problem — getting a shell. Fleet scheduling, cost visibility, environment cloning, and developer self-service are the next ones. Fortem handles the fleet so you don't have to debug at 2am.