Deploying production workloads on AWS requires careful architectural planning to ensure high availability, security, and cost efficiency. In this guide, I will walk you through the complete architecture for deploying a production web application using EC2 instances behind an Application Load Balancer (ALB) with AWS WAF protection. This is the same architecture I have deployed for multiple enterprise clients handling millions of requests per day.
Architecture Overview
The architecture follows AWS Well-Architected Framework principles across five of the framework's six pillars: operational excellence, security, reliability, performance efficiency, and cost optimization (sustainability, the sixth pillar, is not covered here). Here is the high-level architecture:
┌─────────────────────────────────────────────────────────────────┐
│                            INTERNET                             │
└─────────────────────────┬───────────────────────────────────────┘
                          │
                ┌─────────▼─────────┐
                │      AWS WAF      │ ← Layer 7 Protection
                │  (Web ACL Rules)  │   OWASP Top 10, Rate Limiting
                └─────────┬─────────┘
                          │
                ┌─────────▼─────────┐
                │ Application Load  │ ← SSL Termination
                │  Balancer (ALB)   │   Health Checks, Routing
                └────┬─────────┬────┘
                     │         │
         ┌───────────▼──┐   ┌──▼───────────┐
         │   AZ-1 (a)   │   │   AZ-2 (b)   │
         │              │   │              │
         │ ┌──────────┐ │   │ ┌──────────┐ │
         │ │ EC2 (App)│ │   │ │ EC2 (App)│ │ ← Auto Scaling Group
         │ │ t3.large │ │   │ │ t3.large │ │   Min:2, Max:10
         │ └──────────┘ │   │ └──────────┘ │
         │              │   │              │
         │ ┌──────────┐ │   │ ┌──────────┐ │
         │ │ EC2 (App)│ │   │ │ EC2 (App)│ │
         │ │ t3.large │ │   │ │ t3.large │ │
         │ └──────────┘ │   │ └──────────┘ │
         └──────┬───────┘   └──────┬───────┘
                │                  │
         ┌──────▼──────────────────▼──────┐
         │         Private Subnet         │
         │ ┌────────────┐  ┌────────────┐ │
         │ │ RDS Multi  │  │ ElastiCache│ │
         │ │ AZ Primary │  │ Redis      │ │
         │ └────────────┘  └────────────┘ │
         └────────────────────────────────┘
Component 1: AWS WAF Configuration
AWS WAF sits at the outermost layer, inspecting every HTTP/HTTPS request before it reaches your application. I configure WAF with the following rule groups for production deployments:
Managed Rules:
Web ACL Configuration:
├── AWS-AWSManagedRulesCommonRuleSet (OWASP Core)
├── AWS-AWSManagedRulesSQLiRuleSet (SQL Injection)
├── AWS-AWSManagedRulesKnownBadInputsRuleSet
├── AWS-AWSManagedRulesLinuxRuleSet
├── AWS-AWSManagedRulesAmazonIpReputationList
├── Custom Rate Limiting Rule (2000 req/5min per IP)
└── Custom Geo-Blocking Rule (block high-risk countries)
The rate limiting rule is critical for mitigating application-layer (HTTP flood) attacks. I set it to 2000 requests per 5-minute window per IP address, which is aggressive enough to stop most automated attacks while still permitting legitimate high-traffic users. The geo-blocking rule blocks traffic from countries where we have no legitimate users, which shrinks the attack surface.
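To make the rule list above concrete, here is a minimal sketch that builds it as plain Python dictionaries in the shape the WAFv2 `CreateWebACL` API expects for its `Rules` parameter. The rule names `RateLimitPerIp` and `GeoBlock`, the priority ordering, and the `"ZZ"` country code are illustrative assumptions, not values from my deployments; substitute your own country list. No evaluation window is set, so the rate-based rule relies on the WAF default of five minutes.

```python
import json

# Managed rule groups from the Web ACL list above (vendor "AWS").
MANAGED_GROUPS = [
    "AWSManagedRulesCommonRuleSet",
    "AWSManagedRulesSQLiRuleSet",
    "AWSManagedRulesKnownBadInputsRuleSet",
    "AWSManagedRulesLinuxRuleSet",
    "AWSManagedRulesAmazonIpReputationList",
]


def visibility(metric_name):
    """CloudWatch visibility settings required on every rule."""
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


def managed_rule(name, priority):
    """Reference an AWS managed rule group without overriding its actions."""
    return {
        "Name": f"AWS-{name}",
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {"VendorName": "AWS", "Name": name}
        },
        "OverrideAction": {"None": {}},
        "VisibilityConfig": visibility(name),
    }


def rate_limit_rule(priority, limit=2000):
    """Block any single IP that exceeds `limit` requests per window."""
    return {
        "Name": "RateLimitPerIp",  # hypothetical rule name
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {"Limit": limit, "AggregateKeyType": "IP"}
        },
        "Action": {"Block": {}},
        "VisibilityConfig": visibility("RateLimitPerIp"),
    }


def geo_block_rule(priority, country_codes):
    """Block requests originating from the given ISO 3166 country codes."""
    return {
        "Name": "GeoBlock",  # hypothetical rule name
        "Priority": priority,
        "Statement": {"GeoMatchStatement": {"CountryCodes": list(country_codes)}},
        "Action": {"Block": {}},
        "VisibilityConfig": visibility("GeoBlock"),
    }


def build_rules(blocked_countries):
    """Managed groups first, then the two custom rules, with unique priorities."""
    rules = [managed_rule(name, i) for i, name in enumerate(MANAGED_GROUPS)]
    rules.append(rate_limit_rule(len(rules)))
    rules.append(geo_block_rule(len(rules), blocked_countries))
    return rules


# "ZZ" is a placeholder, not a real country code.
rules = build_rules(["ZZ"])
print(json.dumps(rules[5]["Statement"], indent=2))
```

The resulting list could be passed as `Rules` to a WAFv2 client call or serialized for an infrastructure-as-code template; the sketch stops short of calling AWS so that it runs without credentials.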
Component 2: Application Load Balancer
The ALB handles SSL termination, health checking, and intelligent traffic distribution across EC2 instances. My standard ALB configuration includes:
ALB Configuration:
├── Listener: HTTPS (443) → Target Group
├── Listener: HTTP (80) → Redirect to HTTPS
├── SSL Certificate: ACM (auto-renewed)
├── Security Policy: ELBSecurityPolicy-TLS13-1-2-2021-06
├── Health Check:
│   ├── Path: /health
│   ├── Interval: 15 seconds
│   ├── Healthy threshold: 2
│   ├── Unhealthy threshold: 3
│   └── Timeout: 5 seconds
├── Sticky Sessions: Disabled (stateless app)
├── Cross-Zone Load Balancing: Enabled
└── Deletion Protection: Enabled
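I always enable TLS 1.3 for production workloads. Note that ELBSecurityPolicy-TLS13-1-2-2021-06 prefers TLS 1.3 but still accepts TLS 1.2, so TLS 1.2 is the effective minimum; if TLS 1.3 must be the strict minimum, use ELBSecurityPolicy-TLS13-1-3-2021-06 instead. The health check path /health is a lightweight endpoint in the application that verifies database connectivity and returns a 200 status code only when the application is fully operational.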
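As an illustration of the /health endpoint described above, here is a minimal sketch using only the Python standard library. The `check_db` callable is a hypothetical stand-in for whatever connectivity probe your application uses, and returning 503 on failure is my assumption; the only hard requirement is that anything other than 200 counts as a failed check. Port 8080 matches the application port used in the security group chain.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def health_status(check_db):
    """Return (status_code, body): 200 only if the database probe succeeds."""
    try:
        healthy = bool(check_db())
    except Exception:
        # Any error while probing the database counts as unhealthy.
        healthy = False
    if healthy:
        return 200, {"status": "ok"}
    return 503, {"status": "unavailable"}


def make_handler(check_db):
    """Build a request handler class bound to the given database probe."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":
                self.send_error(404)
                return
            code, body = health_status(check_db)
            payload = json.dumps(body).encode()
            self.send_response(code)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    return HealthHandler


def make_server(check_db, port=8080):
    """Create (but do not start) a server exposing /health on the app port."""
    return ThreadingHTTPServer(("0.0.0.0", port), make_handler(check_db))


# Usage (blocks forever, so it is left commented out):
# make_server(lambda: True).serve_forever()
```

With a 15-second interval and an unhealthy threshold of 3, an instance whose probe starts failing is taken out of rotation after roughly 45 seconds, so the probe itself should finish well inside the 5-second timeout.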
Component 3: EC2 Auto Scaling Group
The Auto Scaling Group ensures that the application can handle traffic spikes automatically while minimizing costs during low-traffic periods:
Auto Scaling Configuration:
├── Instance Type: t3.large (2 vCPU, 8GB RAM)
├── AMI: Custom Golden AMI (pre-configured)
├── Min Instances: 2 (one per AZ)
├── Max Instances: 10
├── Desired: 4
├── Scaling Policies:
│   ├── Scale Out: CPU > 70% for 3 minutes → Add 2 instances
│   ├── Scale In: CPU < 30% for 10 minutes → Remove 1 instance
│   └── Target Tracking: ALBRequestCountPerTarget = 1000
├── Health Check Type: ELB (not EC2)
├── Cooldown: 300 seconds
└── Instance Refresh: Rolling update, 25% at a time
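The scaling policies above can be reasoned about with a small simulation. The sketch below is a simplified model, not the Auto Scaling service's actual algorithm: it assumes one average CPU sample per minute, ignores the 300-second cooldown, and approximates target tracking as proportional scaling toward 1000 requests per target. The function names are hypothetical.

```python
from math import ceil

MIN_SIZE, MAX_SIZE = 2, 10  # group bounds from the configuration above


def clamp(capacity):
    """Keep a proposed capacity inside the group's min/max bounds."""
    return max(MIN_SIZE, min(MAX_SIZE, capacity))


def step_policy(current, cpu_per_minute):
    """Apply the CPU step rules to per-minute averages (most recent last)."""
    last3 = cpu_per_minute[-3:]
    last10 = cpu_per_minute[-10:]
    if len(last3) == 3 and all(cpu > 70 for cpu in last3):
        return clamp(current + 2)  # scale out: add 2 instances
    if len(last10) == 10 and all(cpu < 30 for cpu in last10):
        return clamp(current - 1)  # scale in: remove 1 instance
    return current


def target_tracking(current, requests_per_target, target=1000):
    """Approximate capacity needed to bring the metric back to the target."""
    return clamp(ceil(current * requests_per_target / target))


print(step_policy(4, [75, 80, 90]))   # sustained high CPU
print(step_policy(4, [10] * 10))      # sustained low CPU
print(target_tracking(4, 1500))       # 50% over the request target
```

Asymmetric steps (add 2, remove 1) are deliberate: scaling out quickly protects latency during a spike, while scaling in slowly avoids flapping when traffic dips briefly.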
Component 4: Security Groups (Instance-Level Firewall)
Security Group Chain:
┌─────────────────────────────────────────┐
│  SG-ALB (Load Balancer)                 │
│  Inbound: 443 from 0.0.0.0/0            │
│  Inbound: 80 from 0.0.0.0/0             │
│  Outbound: All to SG-App                │
└─────────────────────────────────────────┘
         │
┌────────▼────────────────────────────────┐
│  SG-App (EC2 Instances)                 │
│  Inbound: 8080 from SG-ALB ONLY         │
│  Inbound: 22 from Bastion SG ONLY       │
│  Outbound: 443 to 0.0.0.0/0 (APIs)      │
│  Outbound: 5432 to SG-DB                │
│  Outbound: 6379 to SG-Cache             │
└─────────────────────────────────────────┘
         │
┌────────▼────────────────────────────────┐
│  SG-DB (RDS Database)                   │
│  Inbound: 5432 from SG-App ONLY         │
│  Outbound: None                         │
└─────────────────────────────────────────┘
The key principle is that no security group allows inbound traffic from the internet to application or database instances. Inbound traffic must flow through the ALB, which is the only component with a public-facing security group. Because each tier's inbound rules reference the security group of the tier in front of it rather than IP ranges, no tier can be reached by skipping a layer.
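One way to sanity-check a chain like this before creating it is to model the inbound rules as data and ask which paths are open. The sketch below is a toy model under stated assumptions: it only covers inbound rules, it treats `"internet"` as any source that is not a security group, and `SG-Bastion` is an illustrative name for the bastion group. SG-Cache is left out because its inbound rules are not listed above.

```python
ANYWHERE = "0.0.0.0/0"

# Inbound rules per security group: (port, allowed source).
INBOUND = {
    "SG-ALB": [(443, ANYWHERE), (80, ANYWHERE)],
    "SG-App": [(8080, "SG-ALB"), (22, "SG-Bastion")],
    "SG-DB": [(5432, "SG-App")],
}


def is_allowed(source, destination, port):
    """True if `destination` has an inbound rule admitting `source` on `port`."""
    for rule_port, rule_source in INBOUND.get(destination, []):
        if rule_port == port and rule_source in (source, ANYWHERE):
            return True
    return False


# The intended path is open, and every shortcut around it is closed.
print(is_allowed("internet", "SG-ALB", 443))   # ALB is public
print(is_allowed("SG-ALB", "SG-App", 8080))    # ALB reaches the app tier
print(is_allowed("internet", "SG-App", 8080))  # no direct access to the app
print(is_allowed("SG-ALB", "SG-DB", 5432))     # ALB cannot reach the database
```

The same table can double as a checklist when reviewing the real security groups: any inbound rule that is not in it deserves a second look.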
Deployment Steps
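Step 1: Create a VPC with 2 public subnets (for the ALB) and 2 private subnets (for EC2 and RDS) across 2 AZs. The private subnets need a NAT gateway (or VPC endpoints) for the outbound HTTPS calls that SG-App allows.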
Step 2: Create security groups with the chain described above.
Step 3: Launch RDS Multi-AZ in private subnets with encryption enabled.
Step 4: Create a Launch Template with your application AMI, user data script, and IAM instance profile.
Step 5: Create the Auto Scaling Group with the Launch Template, targeting private subnets.
Step 6: Create the ALB in public subnets, create target group, and register the ASG.
Step 7: Create WAF Web ACL with managed rules and associate it with the ALB.
Step 8: Configure Route 53 with an alias record pointing to the ALB.
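For Step 1, it helps to plan the subnet layout before touching the console. The sketch below carves four equal subnets out of a VPC range with the standard `ipaddress` module. The `10.0.0.0/16` range and the `/24` subnet size are assumptions chosen for illustration; the AZ names follow the two-AZ diagram and the Mumbai region used in the cost estimate.

```python
from ipaddress import ip_network
from itertools import islice


def plan_subnets(vpc_cidr="10.0.0.0/16",
                 azs=("ap-south-1a", "ap-south-1b"),
                 new_prefix=24):
    """Assign one public and one private subnet per AZ from the VPC range."""
    # Take only as many subnets as needed: two tiers per availability zone.
    blocks = islice(ip_network(vpc_cidr).subnets(new_prefix=new_prefix),
                    2 * len(azs))
    plan = []
    for tier in ("public", "private"):
        for az in azs:
            plan.append({"tier": tier, "az": az, "cidr": str(next(blocks))})
    return plan


for subnet in plan_subnets():
    print(f'{subnet["tier"]:<8} {subnet["az"]:<12} {subnet["cidr"]}')
```

The public subnets hold only the ALB, so they can be small; the private subnets should be sized for the Auto Scaling maximum of 10 instances plus the database and cache nodes.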
Cost Estimate
Monthly Cost (Mumbai Region, ap-south-1):
├── EC2 (4x t3.large, On-Demand): $240
├── ALB (fixed + LCU): $25
├── WAF (Web ACL + rules + requests): $15
├── RDS (db.t3.large, Multi-AZ): $180
├── ElastiCache (cache.t3.medium): $45
├── Data Transfer (500GB): $45
├── CloudWatch + Logs: $20
└── Total: ~$570/month
With Reserved Instances for EC2 and RDS, this can be reduced to approximately $380/month, a saving of about 33% for a 1-year commitment.
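The arithmetic behind the estimate is easy to reproduce and adapt to your own line items. This short sketch sums the monthly figures quoted above and derives the Reserved Instance saving; the dictionary keys are just labels I chose for readability.

```python
# Monthly line items in USD, as quoted in the estimate above.
MONTHLY_COSTS_USD = {
    "ec2_4x_t3_large_on_demand": 240,
    "alb": 25,
    "waf": 15,
    "rds_db_t3_large_multi_az": 180,
    "elasticache_cache_t3_medium": 45,
    "data_transfer_500gb": 45,
    "cloudwatch_and_logs": 20,
}

on_demand_total = sum(MONTHLY_COSTS_USD.values())
reserved_total = 380  # approximate figure with 1-year Reserved Instances
saving_pct = (on_demand_total - reserved_total) / on_demand_total * 100

print(f"On-demand total: ${on_demand_total}/month")
print(f"Reserved total:  ${reserved_total}/month")
print(f"Saving:          {saving_pct:.1f}%")
```

EC2 and RDS together account for $420 of the $570, which is why committing on those two services moves the total so much.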
Monitoring and Alerting
I configure CloudWatch alarms for: ALB 5xx error rate > 1%, EC2 CPU > 80% sustained, RDS connections > 80% of max, WAF blocked requests spike > 10x normal, and disk usage > 85%. All alerts go to a PagerDuty integration for immediate response.
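The thresholds above can be kept in one table and checked against a metrics snapshot, which is handy for testing alert logic offline. This is a simplified sketch: the metric names are hypothetical, it evaluates a single sample rather than a sustained period, and the WAF figure is expressed as a ratio of blocked requests to an assumed baseline.

```python
# Alarm thresholds from the list above; a value strictly above the limit fires.
THRESHOLDS = {
    "alb_5xx_rate_pct": 1.0,
    "ec2_cpu_pct": 80.0,
    "rds_connections_pct_of_max": 80.0,
    "waf_blocked_vs_baseline": 10.0,
    "disk_used_pct": 85.0,
}


def breached_alarms(sample):
    """Return the sorted names of alarms whose threshold the sample exceeds."""
    return sorted(
        name
        for name, limit in THRESHOLDS.items()
        if sample.get(name, 0) > limit
    )


snapshot = {
    "alb_5xx_rate_pct": 2.4,
    "ec2_cpu_pct": 65.0,
    "rds_connections_pct_of_max": 91.0,
    "waf_blocked_vs_baseline": 3.0,
    "disk_used_pct": 85.0,
}
print(breached_alarms(snapshot))
```

In a real setup each entry would map to a CloudWatch alarm with its own evaluation period, and the list of breached alarms would be what gets routed to PagerDuty.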
