Optimizing AWS EKS: Karpenter Autoscaling & IRSA Pod Identity

Deploy EKS with Karpenter autoscaling and IRSA pod identity. Achieve efficient, secure, and scalable Kubernetes on AWS. A complete guide.

Overview

Kubernetes has become the de facto orchestrator for containerized applications, and Amazon Elastic Kubernetes Service (EKS) provides a managed Kubernetes experience that removes much of the operational burden of the control plane. Managing the underlying worker nodes for performance, cost-efficiency, and resilience remains a challenge, however. Node-group-based autoscaling works, but it often struggles with diverse workloads and rapid scaling requirements. Karpenter, an open-source Kubernetes node autoscaler built by AWS, is designed to address this.

Karpenter fundamentally rethinks node provisioning. Unlike the Kubernetes Cluster Autoscaler, which scales existing node groups, Karpenter directly interfaces with the cloud provider's compute service (EC2 in AWS's case) to launch the optimal compute resources in response to unschedulable pods. This direct integration allows Karpenter to provision nodes rapidly, select the most appropriate instance types (including spot instances), and consolidate existing nodes to optimize costs, all while adhering to pod scheduling requirements like resource requests, node selectors, taints, and tolerations.

Beyond efficient resource management, security is paramount in any cloud environment. Granting Kubernetes pods access to AWS services securely and with the principle of least privilege is a non-negotiable requirement. Traditionally, this involved managing AWS credentials within pods, a practice fraught with security risks. IAM Roles for Service Accounts (IRSA) revolutionizes this by allowing you to associate an AWS IAM role with a Kubernetes service account. This enables pods using that service account to seamlessly assume the IAM role and inherit its permissions, without needing to store or manage AWS credentials directly. This significantly enhances the security posture and simplifies credential management for your applications running on EKS.

This article, penned for senior technologists and cloud architects, will guide you through the process of setting up an AWS EKS cluster, integrating it with Karpenter for intelligent, demand-driven node autoscaling, and securing your applications' interactions with AWS services using IRSA. We will delve into practical, step-by-step implementation, complete with real-world commands and configuration examples, ensuring you have a robust, secure, and cost-optimized EKS environment.

Prerequisites

Before we embark on this journey, ensure you have the following tools and configurations in place:

AWS Account: An active AWS account with administrative access to create EKS clusters, IAM roles, and EC2 instances.
AWS CLI: Version 2.x installed and configured with appropriate credentials.
```
aws --version
aws configure
```
kubectl: The Kubernetes command-line tool, installed and configured to interact with your EKS cluster.
```
kubectl version --client
```
eksctl: The official CLI for Amazon EKS. This tool simplifies EKS cluster creation and management significantly.
```
eksctl version
```
Helm: The package manager for Kubernetes, essential for installing Karpenter.
```
helm version
```
jq: A lightweight and flexible command-line JSON processor, useful for parsing AWS CLI output.
```
jq --version
```
Basic Understanding: Familiarity with Kubernetes concepts (pods, deployments, service accounts), AWS IAM, and fundamental networking concepts (VPC, subnets, security groups).

Step-by-step Implementation

1. Create an EKS Cluster with `eksctl`

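We'll start by creating an EKS cluster using eksctl. The goal is for Karpenter to provision the worker nodes that run your workloads, so the config defines no node groups for application capacity. Keep in mind that the Karpenter controller pods themselves need somewhere to run before Karpenter can launch anything: on a cluster with no nodes at all, the Helm install in step 3 will wait on Pending pods. The config therefore includes a small bootstrap node group, commented out; uncomment it (or add a Fargate profile for the karpenter namespace) unless you already have compute for the controller.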

Create a file named eks-karpenter-cluster.yaml with the following content. Remember to choose a unique cluster name and your desired AWS region.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: tech-news-karpenter-cluster
  region: us-east-1
  version: "1.28"

vpc:
  id: "vpc-0123456789abcdef0" # Replace with your existing VPC ID or omit to create a new one
  subnets:
    public:
      us-east-1a: { id: "subnet-0abcdef1234567890" } # Replace with your public subnet IDs
      us-east-1b: { id: "subnet-0fedcba9876543210" }
    private:
      us-east-1a: { id: "subnet-0123456789abcdef1" } # Replace with your private subnet IDs
      us-east-1b: { id: "subnet-0fedcba9876543211" }

iam:
  withOIDC: true # Required for IRSA

# No node groups are defined by default, so Karpenter provisions all workload capacity.
# The Karpenter controller still needs somewhere to run: uncomment the small node group
# below (or add a Fargate profile for the karpenter namespace) before installing Karpenter.
# nodeGroups:
#   - name: karpenter-initial-ng
#     instanceType: t3.medium
#     minSize: 1
#     maxSize: 1
#     desiredCapacity: 1
#     volumeSize: 20
#     privateNetworking: true
#     labels: { role: initial-karpenter-node }

cloudWatch:
  clusterLogging:
    enableTypes: ["api", "audit", "authenticator", "controllerManager", "scheduler"]

Note: If you omit the vpc section, eksctl will create a new VPC for you. If you provide existing VPC and subnet IDs, ensure they are correctly configured for EKS (e.g., sufficient IP addresses, route tables, and NAT gateways for private subnets). For a production setup, using private subnets for worker nodes is highly recommended.

Now, create the cluster:

eksctl create cluster -f eks-karpenter-cluster.yaml

This command will take 15-25 minutes to complete. Once finished, verify your cluster context:

kubectl get svc

2. Configure IAM for Karpenter

Karpenter needs specific IAM permissions to launch and terminate EC2 instances, discover subnets and security groups, and interact with other AWS services on your behalf. We will create an IAM role for Karpenter and associate it with a Kubernetes Service Account using IRSA.

2.1. Retrieve Cluster Details

We need the cluster's OIDC issuer URL and, derived from it, the OIDC provider ID (the last path segment of that URL).

CLUSTER_NAME="tech-news-karpenter-cluster"
AWS_REGION="us-east-1"

# Get OIDC Issuer URL
OIDC_ISSUER=$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.identity.oidc.issuer" --output text --region ${AWS_REGION})
echo "OIDC Issuer: ${OIDC_ISSUER}"

# Get OIDC Provider ID
OIDC_ID=$(echo "${OIDC_ISSUER}" | cut -d '/' -f 5)
echo "OIDC ID: ${OIDC_ID}"
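
The `cut` call above relies on the fixed shape of the issuer URL. As a minimal sketch, with a made-up issuer value standing in for real `describe-cluster` output, the same ID can be extracted either with `cut` or with shell parameter expansion:

```shell
# Sample issuer URL (made up for illustration; real values come from `aws eks describe-cluster`).
SAMPLE_ISSUER="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

# Splitting on '/' yields: "https:", "", host, "id", <ID>, so the ID is field 5.
ID_FROM_CUT=$(echo "${SAMPLE_ISSUER}" | cut -d '/' -f 5)

# Equivalent without a subprocess: strip everything up to the last '/'.
ID_FROM_EXPANSION="${SAMPLE_ISSUER##*/}"

# The IAM OIDC provider is addressed by the issuer URL without its scheme.
PROVIDER_PATH="${SAMPLE_ISSUER#https://}"

echo "cut:       ${ID_FROM_CUT}"
echo "expansion: ${ID_FROM_EXPANSION}"
echo "provider:  ${PROVIDER_PATH}"
```

The scheme-less provider path is the form that appears in IAM trust policies for IRSA roles, which is why it is worth keeping alongside the bare ID.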

2.2. Create Karpenter IAM Role and Service Account

We'll use eksctl to simplify the creation of the IAM role, policy, and Kubernetes Service Account with IRSA.

eksctl create iamserviceaccount \
  --cluster ${CLUSTER_NAME} \
  --name karpenter \
  --namespace karpenter \
  --role-name karpenter-controller-${CLUSTER_NAME} \
  --attach-policy-arn "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy" \
  --attach-policy-arn "arn:aws:iam::aws:policy/AmazonEKSVPCResourceController" \
  --override-existing-serviceaccounts \
  --approve \
  --region ${AWS_REGION}

The above command creates an IAM role named karpenter-controller-tech-news-karpenter-cluster and a Kubernetes Service Account named karpenter in the karpenter namespace. eksctl annotates the service account with eks.amazonaws.com/role-arn itself, so no manual tagging or annotation is needed. The two AWS managed policies attached here serve only as placeholders at creation time: neither grants what Karpenter needs, so a dedicated policy follows in step 2.3, and you can detach the placeholders afterwards for stricter least privilege. Note that AWSLoadBalancerControllerIAMPolicy is not an AWS managed policy (it is one you create in your own account for the AWS Load Balancer Controller), so it is not referenced here.

Let's get the ARN of the IAM role created by eksctl:

KARPENTER_IAM_ROLE_ARN=$(aws iam get-role --role-name karpenter-controller-${CLUSTER_NAME} --query "Role.Arn" --output text)
echo "Karpenter IAM Role ARN: ${KARPENTER_IAM_ROLE_ARN}"

2.3. Create and Attach Karpenter Controller Policy

Karpenter requires a specific policy to manage EC2 instances. Save the following JSON as karpenter-controller-policy.json:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateLaunchTemplate",
                "ec2:CreateFleet",
                "ec2:RunInstances",
                "ec2:CreateTags",
                "ec2:TerminateInstances",
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeInstances",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeInstanceTypes",
                "ec2:DescribeInstanceTypeOfferings",
                "ec2:DescribeAvailabilityZones",
                "ec2:DeleteLaunchTemplate",
                "ec2:DescribeImages",
                "ec2:DescribeSpotPriceHistory"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "ssm:GetParameter",
            "Resource": "arn:aws:ssm:*:*:parameter/aws/service/*"
        },
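        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "${KARPENTER_NODE_ROLE_ARN}",
            "Condition": {
                "StringEquals": {
                    "iam:PassedToService": "ec2.amazonaws.com"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": "iam:CreateServiceLinkedRole",
            "Resource": "arn:aws:iam::*:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot",
            "Condition": {
                "StringLike": {
                    "iam:AWSServiceName": "spot.amazonaws.com"
                }
            }
        }
    ]
}

Note: `aws iam create-policy` reads this file verbatim, so shell variables inside it are not expanded. Replace ${KARPENTER_NODE_ROLE_ARN} with the ARN of the node role created in step 3.4 (arn:aws:iam::<ACCOUNT_ID>:role/KarpenterNodeRole-<CLUSTER_NAME>). iam:PassRole must cover the role that Karpenter hands to the EC2 instances it launches, not the controller's own role. IAM does not require the referenced role to exist when the policy is created, so the order of the steps can stay as it is.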
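
One way to handle the placeholder is to render the file with `sed` before passing it to `aws iam create-policy`. The following is a minimal sketch rather than a required step: the template, the placeholder name, the account ID, and the role name are all made up for illustration, and it works in a scratch directory so the real policy file is untouched.

```shell
# Work in a scratch directory so the real policy file is untouched.
WORKDIR=$(mktemp -d)

# A cut-down template with a shell-style placeholder, as in the policy above.
cat > "${WORKDIR}/policy.tpl.json" <<'EOF'
{
    "Effect": "Allow",
    "Action": "iam:PassRole",
    "Resource": "${KARPENTER_ROLE_ARN}"
}
EOF

# Sample ARN (made-up account ID); substitute the ARN your PassRole statement needs.
ROLE_ARN="arn:aws:iam::111122223333:role/example-role"

# Replace any ${...ROLE_ARN} placeholder. '|' is the sed delimiter because ARNs contain '/'.
sed 's|\${[A-Z_]*ROLE_ARN}|'"${ROLE_ARN}"'|g' \
  "${WORKDIR}/policy.tpl.json" > "${WORKDIR}/policy.json"

# Report if any placeholder survived the substitution; otherwise show the result.
if grep -q '\${' "${WORKDIR}/policy.json"; then
  echo "unresolved placeholder remains" >&2
else
  cat "${WORKDIR}/policy.json"
fi
```

The final check is the useful part: a leftover `${...}` in a policy document is otherwise easy to miss until IAM rejects the file or the permission silently fails to match.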

Create the IAM policy:

aws iam create-policy \
    --policy-name KarpenterControllerPolicy-${CLUSTER_NAME} \
    --policy-document file://karpenter-controller-policy.json \
    --region ${AWS_REGION}

Attach this policy to the Karpenter IAM role:

KARPENTER_POLICY_ARN=$(aws iam list-policies --scope Local --query "Policies[?PolicyName=='KarpenterControllerPolicy-${CLUSTER_NAME}'].Arn" --output text --region ${AWS_REGION})

aws iam attach-role-policy \
    --role-name karpenter-controller-${CLUSTER_NAME} \
    --policy-arn ${KARPENTER_POLICY_ARN} \
    --region ${AWS_REGION}

3. Install Karpenter

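Now we'll install Karpenter using Helm. The chart repository and values used below correspond to the older Karpenter releases that work with the Provisioner and AWSNodeTemplate resources defined in step 3.5. Newer releases are distributed as an OCI chart and use different values, so pin a chart version that matches both your cluster's Kubernetes version and the manifests you apply.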

3.1. Add Helm Repository

helm repo add karpenter https://charts.karpenter.sh/
helm repo update

3.2. Get Cluster Endpoint and Certificates

Karpenter needs to interact with the EKS control plane.

CLUSTER_ENDPOINT=$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.endpoint" --output text --region ${AWS_REGION})
echo "Cluster Endpoint: ${CLUSTER_ENDPOINT}"

# Get the cluster CA (for reference only; the Helm command below does not use it)
CLUSTER_CA=$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.certificateAuthority.data" --output text --region ${AWS_REGION})
echo "Cluster CA (truncated): ${CLUSTER_CA:0:30}..."

3.3. Install Karpenter Helm Chart

Install Karpenter, configuring it to use the IAM role and service account we created.

helm install karpenter karpenter/karpenter \
  --namespace karpenter --create-namespace \
  --set serviceAccount.create=false \
  --set serviceAccount.name=karpenter \
  --set clusterName=${CLUSTER_NAME} \
  --set clusterEndpoint=${CLUSTER_ENDPOINT} \
  --set aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
  --wait

Note: The --set aws.defaultInstanceProfile refers to an EC2 Instance Profile that Karpenter will use for newly provisioned nodes. We need to create this profile.

3.4. Create Karpenter Node Instance Profile

Karpenter-provisioned nodes need an IAM role to join the EKS cluster and interact with AWS services like ECR.

First, create an IAM role for the nodes:

aws iam create-role \
  --role-name KarpenterNodeRole-${CLUSTER_NAME} \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ec2.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
  }' \
  --region ${AWS_REGION}

Attach necessary policies for EKS worker nodes:

aws iam attach-role-policy \
    --role-name KarpenterNodeRole-${CLUSTER_NAME} \
    --policy-arn arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy \
    --region ${AWS_REGION}

aws iam attach-role-policy \
    --role-name KarpenterNodeRole-${CLUSTER_NAME} \
    --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly \
    --region ${AWS_REGION}

aws iam attach-role-policy \
    --role-name KarpenterNodeRole-${CLUSTER_NAME} \
    --policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy \
    --region ${AWS_REGION}

Finally, create the Instance Profile:

aws iam create-instance-profile \
    --instance-profile-name KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
    --region ${AWS_REGION}

aws iam add-role-to-instance-profile \
    --instance-profile-name KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
    --role-name KarpenterNodeRole-${CLUSTER_NAME} \
    --region ${AWS_REGION}

3.5. Define Karpenter Provisioner and AWSNodeTemplate

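These custom resources tell Karpenter how to provision nodes. Provisioner (karpenter.sh/v1alpha5) and AWSNodeTemplate (karpenter.k8s.aws/v1alpha1) belong to Karpenter's alpha APIs; releases from v0.32 onward replace them with NodePool and EC2NodeClass, so match the manifests to the version you installed.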

Create karpenter-provisioner.yaml:

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  amiFamily: AL2 # Other supported families include Bottlerocket and Ubuntu
  instanceProfile: KarpenterNodeInstanceProfile-${CLUSTER_NAME} # Must match the instance profile created above
  securityGroupSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME} # Matches security groups that carry this tag; tag them beforehand
  subnetSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME} # Matches subnets that carry this tag; tag them beforehand
  tags:
    karpenter.sh/cluster-name: ${CLUSTER_NAME}
    karpenter.sh/cluster-provisioner: "true" # Tag for Karpenter to recognize its own nodes
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: "20Gi"
        volumeType: gp3
        encrypted: true
---
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # Nodes launch into the subnets matched by the AWSNodeTemplate's subnetSelector.
  # Those subnets (and the security groups) must carry the tag karpenter.sh/discovery: ${CLUSTER_NAME}.
  # With an existing VPC, as in the cluster config above, you need to add that tag yourself.
  # Load balancer subnet tags are a separate concern: kubernetes.io/role/internal-elb: 1 (private) and kubernetes.io/role/elb: 1 (public).
  # Private subnets need a route to ECR and the EKS endpoint (e.g., via VPC endpoints or a NAT gateway).
  requirements:
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["t3.medium", "t3.large", "m5.large", "m5.xlarge", "c5.large", "c5.xlarge"]
    - key: topology.kubernetes.io/zone
      operator: In
      values: ["us-east-1a", "us-east-1b"] # Or your desired availability zones
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]
    - key: karpenter.sh/capacity-type # Can be On-Demand or Spot
      operator: In
      values: ["on-demand", "spot"]
  limits:
    resources:
      cpu: "100" # Max 100 CPU cores for this provisioner
  providerRef:
    name: default # Refers to the AWSNodeTemplate named 'default'
  ttlSecondsAfterEmpty: 30 # Nodes will be deprovisioned 30 seconds after they become empty
  # ttlSecondsAfterEmpty and consolidation are mutually exclusive on a Provisioner.
  # To use consolidation instead, remove ttlSecondsAfterEmpty and uncomment:
  # consolidation:
  #   enabled: true
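
The subnet and security group selectors above only match resources that carry the `karpenter.sh/discovery` tag. The following sketch shows one way to add it with `aws ec2 create-tags`. The subnet IDs are the placeholder values from the cluster config, the `DRY_RUN` switch and the `tag_for_discovery` helper are inventions of this example, and by default the commands are only printed so you can review them first.

```shell
CLUSTER_NAME="${CLUSTER_NAME:-tech-news-karpenter-cluster}"
# Placeholder IDs copied from the cluster config above; replace with your own.
PRIVATE_SUBNET_IDS="subnet-0123456789abcdef1 subnet-0fedcba9876543211"
# Set DRY_RUN=0 to actually call AWS; by default the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: tags every resource ID passed as an argument.
tag_for_discovery() {
  for resource_id in "$@"; do
    if [ "${DRY_RUN}" = "1" ]; then
      echo "aws ec2 create-tags --resources ${resource_id} --tags Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}"
    else
      aws ec2 create-tags --resources "${resource_id}" \
        --tags "Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}"
    fi
  done
}

# Word splitting on the space-separated list is intentional here.
tag_for_discovery ${PRIVATE_SUBNET_IDS}
```

The same helper can be pointed at a security group ID, for example the cluster security group reported under `cluster.resourcesVpcConfig.clusterSecurityGroupId` by `aws eks describe-cluster`.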

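Apply the configuration. kubectl does not expand shell variables inside the manifest, so render ${CLUSTER_NAME} first (shown here with envsubst, which only sees exported variables):

export CLUSTER_NAME
envsubst '${CLUSTER_NAME}' < karpenter-provisioner.yaml | kubectl apply -f -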

4. Deploy a Sample Application to Test Autoscaling

Let's deploy a demanding application that will trigger Karpenter to provision new nodes.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 10
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
      # Node Selector example, if you want to target specific nodes provisioned by Karpenter
      # nodeSelector:
      #   karpenter.sh/capacity-type: on-demand

Save this as inflate-deployment.yaml and apply it:

kubectl apply -f inflate-deployment.yaml

Monitor the pods and nodes:

kubectl get pods -w
kubectl get nodes -w

You should see some pods stuck in a Pending state initially. Within a minute or two, Karpenter should detect the unschedulable pods and provision new EC2 instances. Once the instances are ready and join the cluster, the pending pods will be scheduled.
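
To get a feel for how much capacity Karpenter has to find, it helps to total the deployment's resource requests. This is plain arithmetic on the values in the manifest above, not something Karpenter exposes; the actual number of nodes also depends on which instance types are chosen and on how much of each node is reserved for system components and daemonsets.

```shell
REPLICAS=10
CPU_REQUEST_MILLI=500     # 500m per pod
MEM_REQUEST_MI=512        # 512Mi per pod

# Sum of requests across all replicas of the inflate deployment.
TOTAL_CPU_MILLI=$((REPLICAS * CPU_REQUEST_MILLI))
TOTAL_MEM_MI=$((REPLICAS * MEM_REQUEST_MI))

echo "Total CPU requested:    ${TOTAL_CPU_MILLI}m ($((TOTAL_CPU_MILLI / 1000)) vCPU)"
echo "Total memory requested: ${TOTAL_MEM_MI}Mi"
```

Changing `REPLICAS` and rerunning the sketch gives a quick estimate of how close a scale-up will come to the `limits.resources.cpu` value set on the Provisioner.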

To observe Karpenter logs:

kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter

After scaling up, you can scale down the deployment to see Karpenter consolidate or terminate nodes:

kubectl scale deployment inflate --replicas=0
kubectl delete -f inflate-deployment.yaml

Wait for ttlSecondsAfterEmpty (30 seconds in our config) and observe Karpenter terminating the provisioned nodes.

5. Implement IRSA for Pod Identity

Now, let's demonstrate IRSA by creating a pod that needs access to AWS S3 buckets. We will grant it read-only access to S3.

5.1. Create an IAM Policy for S3 Read-Only Access

Save the following JSON as s3-read-policy.json:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": "*"
        }
    ]
}

Create the IAM policy:

aws iam create-policy \
    --policy-name S3ReadOnlyPolicyForApp \
    --policy-document file://s3-read-policy.json \
    --region ${AWS_REGION}

Get the ARN of the created policy:

S3_READ_POLICY_ARN=$(aws iam list-policies --scope Local --query "Policies[?PolicyName=='S3ReadOnlyPolicyForApp'].Arn" --output text --region ${AWS_REGION})
echo "S3 Read Policy ARN: ${S3_READ_POLICY_ARN}"

5.2. Create IAM Role and Kubernetes Service Account with IRSA

We'll use eksctl to create the IAM role and associate it with a Kubernetes Service Account.

eksctl create iamserviceaccount \
  --cluster ${CLUSTER_NAME} \
  --name s3-reader-sa \
  --namespace default \
  --role-name s3-reader-role-${CLUSTER_NAME} \
  --attach-policy-arn ${S3_READ_POLICY_ARN} \
  --override-existing-serviceaccounts \
  --approve \
  --region ${AWS_REGION}

This command creates an IAM role s3-reader-role-tech-news-karpenter-cluster and a Kubernetes Service Account s3-reader-sa in the default namespace, linking them via IRSA.
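
The link between the role and the service account lives in the role's trust policy. The sketch below assembles what such a trust policy generally looks like for IRSA. The account ID and OIDC ID are sample values, and the document eksctl actually generates may differ in detail, so treat this as an illustration of the mechanism rather than the exact output.

```shell
# All values below are samples; substitute your own account, region, and OIDC ID.
ACCOUNT_ID="111122223333"
OIDC_PROVIDER="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
NAMESPACE="default"
SERVICE_ACCOUNT="s3-reader-sa"

# The role trusts the cluster's OIDC provider, but only for tokens whose subject
# is this exact service account and whose audience is STS.
TRUST_POLICY=$(cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:aud": "sts.amazonaws.com",
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"
        }
      }
    }
  ]
}
EOF
)

echo "${TRUST_POLICY}"
```

You can compare this with the real document by running `aws iam get-role --role-name s3-reader-role-${CLUSTER_NAME} --query Role.AssumeRolePolicyDocument`. The `sub` condition is what stops other service accounts in the cluster from assuming the role.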

5.3. Deploy a Sample Application Using the IRSA Service Account

Now, deploy a simple pod that uses the s3-reader-sa service account. This pod will have the S3 read-only permissions.

apiVersion: v1
kind: Pod
metadata:
  name: s3-reader-pod
spec:
  serviceAccountName: s3-reader-sa # This is the key for IRSA
  containers:
  - name: awscli
    image: amazon/aws-cli:latest
    command: ["tail", "-f", "/dev/null"] # Keep the container running

Save this as s3-reader-pod.yaml and apply it:

kubectl apply -f s3-reader-pod.yaml

6. Verify IRSA

Once the s3-reader-pod is running, exec into it and try to list S3 buckets.

kubectl exec -it s3-reader-pod -- bash

# Inside the pod:
aws s3 ls

If IRSA is correctly configured, you should see a list of your S3 buckets (or an empty list if you have none, but no permission denied error).
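
If the listing fails, a useful first check is whether the pod received the two environment variables that IRSA relies on, `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`, which EKS injects into pods that use an annotated service account. The helper below is a hypothetical convenience function for this article, meant to be pasted into the shell inside the pod:

```shell
# Returns success only when both variables that IRSA relies on are present.
check_irsa_env() {
  if [ -z "${AWS_ROLE_ARN:-}" ]; then
    echo "AWS_ROLE_ARN is not set"
    return 1
  fi
  if [ -z "${AWS_WEB_IDENTITY_TOKEN_FILE:-}" ]; then
    echo "AWS_WEB_IDENTITY_TOKEN_FILE is not set"
    return 1
  fi
  echo "role:  ${AWS_ROLE_ARN}"
  echo "token: ${AWS_WEB_IDENTITY_TOKEN_FILE}"
}

check_irsa_env || echo "IRSA environment incomplete; check the service account annotation"
```

When both variables are present, `aws sts get-caller-identity` should report an assumed-role ARN based on s3-reader-role-tech-news-karpenter-cluster rather than the node's role.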

Now, try to create a bucket (which should be denied as we only granted read-only access):
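
A minimal sketch of that negative test follows, still inside the pod. The bucket name and the `expect_denied` helper are made up for this example; the attempt should fail with an access-denied error because the policy grants only `s3:Get*` and `s3:List*`.

```shell
# Runs a command and treats failure as the expected outcome.
expect_denied() {
  if "$@"; then
    echo "UNEXPECTED: command succeeded"
    return 1
  fi
  echo "OK: command was rejected"
}

# Hypothetical bucket name; bucket names are global, so choose your own.
BUCKET_NAME="irsa-demo-bucket-$(date +%s)"

# Only attempt the real call where the AWS CLI exists (for example inside the pod).
if command -v aws >/dev/null 2>&1; then
  expect_denied aws s3 mb "s3://${BUCKET_NAME}"
fi
```

If the command unexpectedly succeeds, the pod is probably picking up broader credentials than intended, such as the node's instance role, and the bucket should be removed again.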
