Overview
APIs are the backbone of modern applications, carrying data between services, mobile apps, and third-party integrations. They also present a significant attack surface. Brute-force login attempts, credential stuffing, volumetric DDoS attacks, and sophisticated bot activity against unprotected APIs can lead to data breaches, service disruptions, and reputational damage.
AWS WAF (Web Application Firewall) v2 is a robust, cloud-native security service that helps protect web applications and APIs from common web exploits and automated bots that may affect availability, compromise security, or consume excessive resources. Operating at Layer 7 of the OSI model, WAF inspects incoming HTTP/S requests, allowing you to define custom rules to filter out unwanted traffic based on various criteria such as IP addresses, HTTP headers, URI strings, query parameters, and request body content.
This article covers two AWS WAF v2 capabilities for API protection: rate limiting and bot control. We'll explore how to configure them to defend APIs hosted on Amazon API Gateway, Application Load Balancers (ALB), or Amazon CloudFront distributions. A well-planned WAF configuration significantly reduces the risk of common API attacks, safeguards your data, and helps keep your services available.
Prerequisites
Before we dive into the implementation, ensure you have the following in place:
- AWS Account: An active AWS account with administrative access or IAM permissions sufficient to create and manage AWS WAFv2, Amazon API Gateway, and CloudWatch resources.
- AWS CLI: The AWS Command Line Interface (CLI) installed and configured with appropriate credentials. We will be using CLI commands extensively. Ensure your AWS CLI is updated to a recent version (v2 is recommended).
- jq: A lightweight and flexible command-line JSON processor. The commands in this guide use it to extract ARNs, IDs, and lock tokens from AWS CLI output and to build rule lists, so install it if you want to run them as written.
- Existing API Gateway Endpoint: For this guide, we will primarily focus on protecting an API Gateway endpoint. You should have an existing REST API and at least one stage (e.g., `prod`, `dev`) deployed.
- Basic WAF Knowledge: A fundamental understanding of AWS WAF concepts, including Web ACLs, rules, rule groups, and actions (Allow, Block, Count).
To ensure your AWS CLI is configured correctly, run a simple command:
aws sts get-caller-identity
This command should return details about the IAM entity making the call, confirming your CLI setup.
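Before running the later steps, it can help to confirm that both command-line tools are actually on your PATH. The following is a minimal sketch; the function name `require_tools` is a hypothetical helper, not part of the AWS CLI.

```shell
# Report any required command-line tool that is not on PATH.
# Returns 0 when every tool is present, 1 otherwise.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "Missing required tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: this walkthrough relies on the AWS CLI and jq.
# require_tools aws jq || exit 1
```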
Step-by-step implementation
This section will guide you through the process of setting up AWS WAF v2 for API protection, focusing on rate limiting and bot control. We'll assume you have an existing API Gateway REST API with a deployed stage that you wish to protect.
1. Define the Scope: API Gateway Integration
AWS WAF v2 can protect resources at two scopes: REGIONAL (for Application Load Balancers, API Gateway, and AppSync) and CLOUDFRONT (for Amazon CloudFront distributions). For API Gateway, we will use the REGIONAL scope. First, we need to identify the ARN of the API Gateway stage we intend to protect.
Let's list your existing API Gateways and their stages:
# List all REST APIs
aws apigateway get-rest-apis
# Example output (trimmed for brevity):
# {
#   "items": [
#     {
#       "id": "abcdefg123",
#       "name": "MySecureAPI",
#       "description": "API for secure operations",
#       "createdDate": 1678886400,
#       "apiKeySource": "HEADER",
#       "endpointConfiguration": {
#         "types": [
#           "REGIONAL"
#         ]
#       }
#     }
#   ]
# }
# Assuming "MySecureAPI" has ID "abcdefg123", list its stages
API_ID="abcdefg123"
aws apigateway get-stages --rest-api-id "${API_ID}"
# Example output (trimmed for brevity):
# {
#   "item": [
#     {
#       "deploymentId": "xyz789",
#       "stageName": "prod",
#       "description": "Production stage",
#       "cacheClusterEnabled": false,
#       "cacheClusterStatus": "NOT_AVAILABLE",
#       "methodSettings": {},
#       "variables": {},
#       "createdDate": 1678886400,
#       "lastUpdatedDate": 1678886400,
#       "webAclArn": null,
#       "tags": {}
#     }
#   ]
# }
From the output, identify your desired API ID and stage name. AWS WAF identifies a REST API stage by an ARN of the form `arn:aws:apigateway:region::/restapis/api-id/stages/stage-name`. Note that this is an `apigateway` ARN with an empty account ID field, not an `execute-api` ARN. For instance, if your API ID is `abcdefg123` and the stage name is `prod`:
API_GATEWAY_STAGE_ARN="arn:aws:apigateway:us-east-1::/restapis/${API_ID}/stages/prod"
echo "API Gateway Stage ARN: ${API_GATEWAY_STAGE_ARN}"
Replace `us-east-1` with your AWS region. The account ID is not part of this ARN; if you need it for later steps, you can get it using `aws sts get-caller-identity --query Account --output text`.
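A malformed stage ARN only surfaces as an error at association time, so a quick format check up front can save a round trip. The sketch below assumes the stage ARN shape `arn:aws:apigateway:region::/restapis/api-id/stages/stage-name` that AWS WAF expects for REST API stages; `is_stage_arn` is a hypothetical helper name.

```shell
# Succeeds only when the argument looks like an API Gateway REST API stage ARN.
# The "arn:aws*" prefix also accepts other partitions such as aws-cn.
is_stage_arn() {
  case "$1" in
    arn:aws*:apigateway:*::/restapis/*/stages/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
# is_stage_arn "${API_GATEWAY_STAGE_ARN}" || echo "Unexpected stage ARN format" >&2
```

This is a shape check only; it does not confirm that the API or stage exists.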
2. Create a Web ACL
A Web ACL (Access Control List) is the primary resource in AWS WAF. It contains a set of rules that AWS WAF evaluates against web requests. Let's create a new Web ACL for our API protection.
WEB_ACL_NAME="ApiProtectionWebACL"
REGION="us-east-1" # Replace with your desired region
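# --default-action takes a structure (Allow={}), and --visibility-config is required
CREATE_WEB_ACL_OUTPUT=$(aws wafv2 create-web-acl \
  --name "${WEB_ACL_NAME}" \
  --scope REGIONAL \
  --default-action Allow={} \
  --visibility-config SampledRequestsEnabled=true,CloudWatchMetricsEnabled=true,MetricName="${WEB_ACL_NAME}" \
  --description "Web ACL for API Protection with Rate Limiting and Bot Control" \
  --tags Key=Project,Value=TechNewsVenture Key=Environment,Value=Production \
  --region "${REGION}")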
WEB_ACL_ARN=$(echo "${CREATE_WEB_ACL_OUTPUT}" | jq -r '.Summary.ARN')
WEB_ACL_ID=$(echo "${CREATE_WEB_ACL_OUTPUT}" | jq -r '.Summary.Id')
WEB_ACL_LOCKTOKEN=$(echo "${CREATE_WEB_ACL_OUTPUT}" | jq -r '.Summary.LockToken')
echo "Web ACL ARN: ${WEB_ACL_ARN}"
echo "Web ACL ID: ${WEB_ACL_ID}"
echo "Web ACL LockToken: ${WEB_ACL_LOCKTOKEN}"
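We set the `default-action` to `Allow` initially. This means that if a request doesn't match any explicit `Block` rule, it will be allowed. This is a safe starting point. The `LockToken` implements optimistic locking: every update must present the token from the most recent read of the Web ACL, and AWS WAF rejects the update if the Web ACL has changed in the meantime. This is why each step below re-reads the token before calling `update-web-acl`.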
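When several people or pipelines edit the same Web ACL, an update can fail because the lock token went stale between the read and the write. The sketch below shows one way to retry in that case. `retry_on_stale_lock` is a hypothetical helper, and it assumes the failing command prints the `WAFOptimisticLockException` error name, as the AWS CLI does in its error message.

```shell
# Run a command, retrying when it fails with a stale-lock-token error.
# $1 = maximum attempts; remaining arguments = the command to run.
# The command should re-read the LockToken itself on every invocation.
retry_on_stale_lock() {
  max_attempts="$1"
  shift
  attempt=1
  while :; do
    if output=$("$@" 2>&1); then
      printf '%s\n' "$output"
      return 0
    fi
    case "$output" in
      *WAFOptimisticLockException*)
        if [ "$attempt" -ge "$max_attempts" ]; then
          printf '%s\n' "$output" >&2
          return 1
        fi
        attempt=$((attempt + 1))
        sleep 1
        ;;
      *)
        # Any other failure is not retried.
        printf '%s\n' "$output" >&2
        return 1
        ;;
    esac
  done
}

# Usage (update_rules is a hypothetical function that runs get-web-acl
# followed by update-web-acl):
# retry_on_stale_lock 3 update_rules
```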
3. Implement Rate Limiting Rules
Rate limiting is crucial for API protection, preventing a single IP address (or other aggregate key) from making an excessive number of requests within a short period. This protects against brute-force attacks, DDoS attempts, and resource exhaustion.
Rule 1: Generic IP-based Rate Limiting
This rule blocks any single IP address that makes more than a specified number of requests (e.g., 200) within a 5-minute sliding window.
# Fetch the current Web ACL configuration to get the latest LockToken
CURRENT_WEB_ACL_CONFIG=$(aws wafv2 get-web-acl \
--name "${WEB_ACL_NAME}" \
--scope REGIONAL \
--id "${WEB_ACL_ID}" \
--region "${REGION}")
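# LockToken is a top-level field of the get-web-acl response, not part of .WebACL
WEB_ACL_LOCKTOKEN=$(echo "${CURRENT_WEB_ACL_CONFIG}" | jq -r '.LockToken')
# update-web-acl replaces the whole Web ACL definition, so --default-action
# and --visibility-config must be supplied again on every update
aws wafv2 update-web-acl \
  --name "${WEB_ACL_NAME}" \
  --scope REGIONAL \
  --id "${WEB_ACL_ID}" \
  --lock-token "${WEB_ACL_LOCKTOKEN}" \
  --default-action Allow={} \
  --visibility-config SampledRequestsEnabled=true,CloudWatchMetricsEnabled=true,MetricName="${WEB_ACL_NAME}" \
  --description "Web ACL for API Protection with Rate Limiting and Bot Control" \
  --rules '[
    {
      "Name": "GenericIPRateLimit",
      "Priority": 10,
      "Action": { "Block": {} },
      "Statement": {
        "RateBasedStatement": {
          "Limit": 200,
          "AggregateKeyType": "IP"
        }
      },
      "VisibilityConfig": {
        "SampledRequestsEnabled": true,
        "CloudWatchMetricsEnabled": true,
        "MetricName": "GenericIPRateLimit"
      }
    }
  ]' \
  --region "${REGION}"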
In this rule:
- `Name`: A unique name for the rule.
- `Priority`: Determines the order in which rules are evaluated (lower number means higher priority).
- `Action`: We set it to `Block`, meaning requests exceeding the limit will be blocked. You could use `Count` for testing.
- `RateBasedStatement`: The core of the rate limiting.
  - `Limit`: The maximum number of requests allowed (200 in this case).
  - `AggregateKeyType`: Specifies what to aggregate requests by. `IP` aggregates by the source IP address. Other options like `FORWARDED_IP` or `CUSTOM_KEYS` are available for more advanced scenarios (e.g., behind a CDN).
- `VisibilityConfig`: Enables CloudWatch metrics and sampled requests for monitoring.
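To see the rate limit take effect, you can send a burst of requests from one machine and count the status codes that come back. The sketch below is one way to do that once the Web ACL is attached to your stage. The function names and the example URL are hypothetical, and it assumes blocked requests come back as HTTP 403, which is the usual response for a WAF `Block` action.

```shell
# Send COUNT sequential GET requests to URL and print one HTTP status code per line.
probe_endpoint() {
  url="$1"
  count="$2"
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' "$url"
    i=$((i + 1))
  done
}

# Summarise status codes read from stdin as "<code> <occurrences>" lines.
tally_status_codes() {
  sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage (hypothetical URL; 300 requests exceeds the limit of 200):
# probe_endpoint "https://abcdefg123.execute-api.us-east-1.amazonaws.com/prod/items" 300 | tally_status_codes
```

Rate-based rules do not react instantly, so expect some requests beyond the limit to succeed before blocking starts.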
Rule 2: Path-Specific Rate Limiting (e.g., for /login endpoint)
Some API endpoints, like login or signup, are more sensitive to abuse. You might want to apply a stricter rate limit to these specific paths.
# Fetch the current Web ACL configuration to get the latest LockToken
CURRENT_WEB_ACL_CONFIG=$(aws wafv2 get-web-acl \
--name "${WEB_ACL_NAME}" \
--scope REGIONAL \
--id "${WEB_ACL_ID}" \
--region "${REGION}")
WEB_ACL_LOCKTOKEN=$(echo "${CURRENT_WEB_ACL_CONFIG}" | jq -r '.LockToken') # LockToken is top-level, not under .WebACL
CURRENT_RULES=$(echo "${CURRENT_WEB_ACL_CONFIG}" | jq -c '.WebACL.Rules')
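# Append the new rule to the existing rules.
# AWS CLI v2 expects blob fields such as SearchString to be base64-encoded by default,
# which is also how get-web-acl returns them, so the new value is encoded with @base64.
UPDATED_RULES=$(echo "${CURRENT_RULES}" | jq -c '. + [
  {
    "Name": "LoginPathRateLimit",
    "Priority": 20,
    "Action": { "Block": {} },
    "Statement": {
      "RateBasedStatement": {
        "Limit": 20,
        "AggregateKeyType": "IP",
        "ScopeDownStatement": {
          "ByteMatchStatement": {
            "SearchString": ("/login" | @base64),
            "FieldToMatch": { "UriPath": {} },
            "TextTransformations": [
              { "Priority": 0, "Type": "LOWERCASE" }
            ],
            "PositionalConstraint": "STARTS_WITH"
          }
        }
      }
    },
    "VisibilityConfig": {
      "SampledRequestsEnabled": true,
      "CloudWatchMetricsEnabled": true,
      "MetricName": "LoginPathRateLimit"
    }
  }
]')
aws wafv2 update-web-acl \
  --name "${WEB_ACL_NAME}" \
  --scope REGIONAL \
  --id "${WEB_ACL_ID}" \
  --lock-token "${WEB_ACL_LOCKTOKEN}" \
  --default-action Allow={} \
  --visibility-config SampledRequestsEnabled=true,CloudWatchMetricsEnabled=true,MetricName="${WEB_ACL_NAME}" \
  --description "Web ACL for API Protection with Rate Limiting and Bot Control" \
  --rules "${UPDATED_RULES}" \
  --region "${REGION}"
Here, the path condition is nested inside the rate-based statement as its ScopeDownStatement, so only requests whose URI path starts with `/login` count toward the limit. A RateBasedStatement cannot be placed inside an AndStatement or any other logical statement, which is why the match goes in the scope-down rather than alongside the rate limit. The TextTransformations entry lowercases the path before comparison, making the match case-insensitive. This rule will block an IP if it makes more than 20 requests to `/login` within 5 minutes. If clients call the default `execute-api` URL, the path they send likely includes the stage name (for example `/prod/login`), so adjust the search string or positional constraint to match the paths your clients actually use.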
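When you read a Web ACL back with `get-web-acl`, the AWS CLI prints byte-match search strings in base64, which makes the rules hard to review at a glance. The two helpers below are a small sketch for converting in both directions. The function names are hypothetical, and the sketch assumes a `base64` utility that accepts `--decode`, as the GNU and macOS versions do.

```shell
# Encode a plain-text search string the way the CLI represents blob fields.
encode_search_string() {
  printf '%s' "$1" | base64
}

# Decode a SearchString value copied from get-web-acl output.
decode_search_string() {
  printf '%s' "$1" | base64 --decode
}

# Usage:
# encode_search_string "/login"
# decode_search_string "L2xvZ2lu"
```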
4. Implement Bot Control Rules
Automated bots can range from benign web crawlers to malicious scanners, scrapers, and credential stuffers. AWS WAF Bot Control provides managed rule groups to detect and mitigate sophisticated bot traffic without impacting legitimate users.
Rule 1: AWS Managed Rule Group for Bot Control
This managed rule group provides a comprehensive set of rules curated by AWS to identify and mitigate common bot categories. It includes rules for various bot types, like scrapers, scanners, and spammers, and can differentiate between "Good Bots" (e.g., search engine crawlers) and "Bad Bots."
# Fetch the current Web ACL configuration to get the latest LockToken
CURRENT_WEB_ACL_CONFIG=$(aws wafv2 get-web-acl \
--name "${WEB_ACL_NAME}" \
--scope REGIONAL \
--id "${WEB_ACL_ID}" \
--region "${REGION}")
WEB_ACL_LOCKTOKEN=$(echo "${CURRENT_WEB_ACL_CONFIG}" | jq -r '.LockToken') # LockToken is top-level, not under .WebACL
CURRENT_RULES=$(echo "${CURRENT_WEB_ACL_CONFIG}" | jq -c '.WebACL.Rules')
# Append the new rule to the existing rules
UPDATED_RULES=$(echo "${CURRENT_RULES}" | jq -c '. + [
  {
    "Name": "AWSManagedBotControl",
    "Priority": 30,
    "Statement": {
      "ManagedRuleGroupStatement": {
        "VendorName": "AWS",
        "Name": "AWSManagedRulesBotControlRuleSet",
        "ManagedRuleGroupConfigs": [
          { "AWSManagedRulesBotControlRuleSet": { "InspectionLevel": "COMMON" } }
        ]
        # To override a single rule in the group (here: count instead of block), append:
        # , "RuleActionOverrides": [ { "Name": "CategoryHttpLibrary", "ActionToUse": { "Count": {} } } ]
      }
    },
    "OverrideAction": { "None": {} }, # Keep the actions defined inside the rule group
    "VisibilityConfig": {
      "SampledRequestsEnabled": true,
      "CloudWatchMetricsEnabled": true,
      "MetricName": "AWSManagedBotControl"
    }
  }
]')
aws wafv2 update-web-acl \
  --name "${WEB_ACL_NAME}" \
  --scope REGIONAL \
  --id "${WEB_ACL_ID}" \
  --lock-token "${WEB_ACL_LOCKTOKEN}" \
  --default-action Allow={} \
  --visibility-config SampledRequestsEnabled=true,CloudWatchMetricsEnabled=true,MetricName="${WEB_ACL_NAME}" \
  --description "Web ACL for API Protection with Rate Limiting and Bot Control" \
  --rules "${UPDATED_RULES}" \
  --region "${REGION}"
Key aspects of this rule:
- `ManagedRuleGroupStatement`: References a rule group managed by AWS.
- `VendorName` and `Name`: Identify the specific managed rule group.
- `Version`: Omitted here, so AWS WAF uses the group's default version. Pinning a version is good practice for consistency, but the value must be a version name returned by `aws wafv2 list-available-managed-rule-group-versions`, not a bare number such as "1.0".
- `ScopeDownStatement`: Optional. It restricts the managed rule group to requests matching certain criteria. Leave it out to inspect all requests; there is no "All" statement type.
- `ManagedRuleGroupConfigs`: This is crucial for Bot Control. The `AWSManagedRulesBotControlRuleSet` entry sets the `InspectionLevel` to `COMMON` or `TARGETED`. Login path settings belong to the separate account takeover prevention rule group, not to Bot Control.
- `RuleActionOverrides`: Changes the action of individual rules inside the group. This is often relevant for APIs, because Bot Control can flag HTTP client libraries that legitimate API consumers use.
- `OverrideAction`: `"None": {}` means the rule group's own actions will be used. You can override this to `Count` for testing.
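A common rollout pattern is to run the managed rule group in `Count` mode first, review the metrics, and switch to enforcement later. The sketch below generates the rule JSON for either mode. `bot_control_rule_json` is a hypothetical helper, and the rule body assumes the Bot Control configuration shape with `AWSManagedRulesBotControlRuleSet` and `InspectionLevel`.

```shell
# Print the Bot Control rule as JSON.
# $1 = override mode: "None" (enforce the group's own actions) or "Count" (observe only).
bot_control_rule_json() {
  mode="${1:-Count}"
  case "$mode" in
    None|Count) ;;
    *) echo "mode must be None or Count" >&2; return 1 ;;
  esac
  cat <<EOF
{
  "Name": "AWSManagedBotControl",
  "Priority": 30,
  "Statement": {
    "ManagedRuleGroupStatement": {
      "VendorName": "AWS",
      "Name": "AWSManagedRulesBotControlRuleSet",
      "ManagedRuleGroupConfigs": [
        { "AWSManagedRulesBotControlRuleSet": { "InspectionLevel": "COMMON" } }
      ]
    }
  },
  "OverrideAction": { "$mode": {} },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "AWSManagedBotControl"
  }
}
EOF
}

# Usage: append the generated rule to the current rule list with jq.
# UPDATED_RULES=$(echo "${CURRENT_RULES}" | jq -c --argjson rule "$(bot_control_rule_json Count)" '. + [$rule]')
```

Starting in `Count` mode lets you see which requests the group would have blocked before any client is affected.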
Rule 2: Custom Bot Signature (e.g., specific User-Agent)
While managed rule groups are powerful, you might encounter specific bot patterns unique to your application that require custom rules. Here's an example of blocking requests with a known malicious User-Agent string.
# Fetch the current Web ACL configuration to get the latest LockToken
CURRENT_WEB_ACL_CONFIG=$(aws wafv2 get-web-acl \
--name "${WEB_ACL_NAME}" \
--scope REGIONAL \
--id "${WEB_ACL_ID}" \
--region "${REGION}")
WEB_ACL_LOCKTOKEN=$(echo "${CURRENT_WEB_ACL_CONFIG}" | jq -r '.LockToken') # LockToken is top-level, not under .WebACL
CURRENT_RULES=$(echo "${CURRENT_WEB_ACL_CONFIG}" | jq -c '.WebACL.Rules')
# Append the new rule to the existing rules.
# The search string is lowercase because the LOWERCASE transformation is applied to the
# header before matching, and base64-encoded because AWS CLI v2 expects blob fields that way.
UPDATED_RULES=$(echo "${CURRENT_RULES}" | jq -c '. + [
  {
    "Name": "BlockBadUserAgent",
    "Priority": 40,
    "Action": { "Block": {} },
    "Statement": {
      "ByteMatchStatement": {
        "SearchString": ("badbotcrawler" | @base64),
        "FieldToMatch": {
          "SingleHeader": { "Name": "User-Agent" }
        },
        "TextTransformations": [
          { "Priority": 0, "Type": "LOWERCASE" }
        ],
        "PositionalConstraint": "CONTAINS"
      }
    },
    "VisibilityConfig": {
      "SampledRequestsEnabled": true,
      "CloudWatchMetricsEnabled": true,
      "MetricName": "BlockBadUserAgent"
    }
  }
]')
aws wafv2 update-web-acl \
  --name "${WEB_ACL_NAME}" \
  --scope REGIONAL \
  --id "${WEB_ACL_ID}" \
  --lock-token "${WEB_ACL_LOCKTOKEN}" \
  --default-action Allow={} \
  --visibility-config SampledRequestsEnabled=true,CloudWatchMetricsEnabled=true,MetricName="${WEB_ACL_NAME}" \
  --description "Web ACL for API Protection with Rate Limiting and Bot Control" \
  --rules "${UPDATED_RULES}" \
  --region "${REGION}"
This rule uses a ByteMatchStatement to inspect the User-Agent header. The `LOWERCASE` transformation lowercases the header value before comparison, so the search string itself must be lowercase (`badbotcrawler`); a mixed-case search string such as "BadBotCrawler" would never match the transformed value. With that in place, any casing of "BadBotCrawler" in the header causes the request to be blocked. You would replace the search string with lowercase versions of the User-Agent strings observed in malicious traffic targeting your API.
5. Associate Web ACL with API Gateway Stage
Once your Web ACL is configured with rules, the final step is to associate it with your API Gateway stage. This tells API Gateway to send incoming requests through the WAF for inspection.
# Ensure API_GATEWAY_STAGE_ARN and WEB_ACL_ARN are set from previous steps
# Example:
# API_GATEWAY_STAGE_ARN="arn:aws:apigateway:us-east-1::/restapis/abcdefg123/stages/prod"
# WEB_ACL_ARN="arn:aws:wafv2:us-east-1:123456789012:regional/webacl/ApiProtectionWebACL/a1b2c3d4-e5f6-7890-1234-567890abcdef"
aws wafv2 associate-web-acl \
--web-acl-arn "${WEB_ACL_ARN}" \
--resource-arn "${API_GATEWAY_STAGE_ARN}" \
--region "${REGION}"
echo "Web ACL ${WEB_ACL_NAME} associated with API Gateway Stage: ${API_GATEWAY_STAGE_ARN}"
To verify the association, you can list the resources associated with your Web ACL:
aws wafv2 list-resources-for-web-acl \
--web-acl-arn "${WEB_ACL_ARN}" \
--resource-type API_GATEWAY \
--region "${REGION}"
The output should include your API Gateway Stage ARN.
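In a script, you may want this check to fail loudly instead of relying on reading the output by eye. The following is a minimal sketch that looks for the quoted ARN in the JSON on standard input; `arn_is_listed` is a hypothetical helper name.

```shell
# Succeed when the given ARN appears as a quoted string in the JSON read from stdin.
arn_is_listed() {
  grep -qF "\"$1\""
}

# Usage:
# aws wafv2 list-resources-for-web-acl --web-acl-arn "${WEB_ACL_ARN}" \
#   --resource-type API_GATEWAY --region "${REGION}" \
#   | arn_is_listed "${API_GATEWAY_STAGE_ARN}" || echo "Web ACL is not associated" >&2
```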
6. Monitoring and Logging
Effective WAF deployment requires continuous monitoring. AWS WAF publishes metrics to Amazon CloudWatch and can send detailed request logs to a CloudWatch Logs log group, an Amazon S3 bucket, or an Amazon Kinesis Data Firehose delivery stream (which can in turn deliver to destinations such as S3 or Splunk).
Enable WAF Logging to CloudWatch Logs
First, create a CloudWatch Log Group for your WAF logs. AWS WAF only accepts logging destinations whose names begin with `aws-waf-logs-`:
LOG_GROUP_NAME="aws-waf-logs-ApiProtectionWebACL"
aws logs create-log-group \
--log-group-name "${LOG_GROUP_NAME}" \
--region "${REGION}"
LOG_GROUP_ARN=$(aws logs describe-log-groups \
--log-group-name-prefix "${LOG_GROUP_NAME}" \
--region "${REGION}" \
| jq -r '.logGroups[0].arn')
echo "CloudWatch Log Group ARN: ${LOG_GROUP_ARN}"
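AWS WAF can write request logs straight to a CloudWatch Logs log group through `put-logging-configuration`, provided the log group name starts with `aws-waf-logs-`. The sketch below shows that path under a few assumptions: the helper names are hypothetical, and the trailing `:*` that `describe-log-groups` includes in its `arn` field is stripped because, to my understanding, the logging configuration expects the bare log group ARN.

```shell
# Ensure a log group name carries the aws-waf-logs- prefix that AWS WAF requires.
waf_log_group_name() {
  case "$1" in
    aws-waf-logs-*) printf '%s\n' "$1" ;;
    *) printf 'aws-waf-logs-%s\n' "$1" ;;
  esac
}

# Drop a trailing ":*" from a log group ARN; other ARNs pass through unchanged.
strip_arn_wildcard() {
  case "$1" in
    *':*') printf '%s\n' "${1%??}" ;;
    *) printf '%s\n' "$1" ;;
  esac
}

# Point Web ACL logging at a CloudWatch Logs log group.
# $1 = Web ACL ARN, $2 = log group ARN, $3 = region
enable_waf_logging() {
  aws wafv2 put-logging-configuration \
    --logging-configuration "ResourceArn=$1,LogDestinationConfigs=$(strip_arn_wildcard "$2")" \
    --region "$3"
}

# Usage:
# enable_waf_logging "${WEB_ACL_ARN}" "${LOG_GROUP_ARN}" "${REGION}"
```

The caller also needs permission to manage the log group's resource policy, since log delivery to CloudWatch Logs depends on one.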
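Now, enable WAF logging. AWS WAF can deliver logs directly to this CloudWatch Log Group with `aws wafv2 put-logging-configuration`, so a Kinesis Data Firehose delivery stream is not required for that destination. A Firehose delivery stream is only needed when you want to route logs to destinations such as S3 or Splunk, and its name must also begin with `aws-waf-logs-`. Note that Firehose does not offer CloudWatch Logs as a delivery destination; its CloudWatch logging options only record the stream's own delivery errors. The remaining steps set up the Firehose path.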
First, create an IAM role for Kinesis Firehose to publish logs:
FIREHOSE_ROLE_NAME="WafFirehoseRole"
TRUST_POLICY='{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "firehose.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}'
FIREHOSE_ROLE_ARN=$(aws iam create-role \
--role-name "${FIREHOSE_ROLE_NAME}" \
--assume-role-policy-document "${TRUST_POLICY}" \
--query 'Role.Arn' --output text)
echo "Firehose Role ARN: ${FIREHOSE_ROLE_ARN}"
# Attach a policy that allows Firehose to put logs to CloudWatch Logs
FIREHOSE_POLICY_ARN=$(aws iam create-policy \
--policy-name "WafFirehoseCloudWatchLogsPolicy" \
--policy-document '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:PutLogEvents"
],
"Resource": [
"arn:aws:logs:'"${REGION}"':'"$(aws sts get-caller-identity --query Account --output text)"':log-group:'"${LOG_GROUP_NAME}"':*"
]
}
]
}' \
--query 'Policy.Arn' --output text)
aws iam attach-role-policy \
--role-name "${FIREHOSE_ROLE_NAME}" \
--policy-arn "${FIREHOSE_POLICY_ARN}"
# Give some time for IAM changes to propagate
sleep 10
Now, create the Kinesis Firehose delivery stream:
FIREHOSE_STREAM_NAME="aws-waf-logs-to-cloudwatch"
CREATE_FIREHOSE_OUTPUT=$(aws firehose create-delivery-stream \
--delivery-stream-name "${FIREHOSE_STREAM_NAME}" \
--delivery-stream-type DirectPut \
--cloud-watch-logging-options "Enabled=true,LogGroupName=${LOG_GROUP_NAME},LogStreamName=FirehoseDelivery" \
--destination-configuration '{
"CloudWatchLogsDestinationConfiguration": {
"LogGroupName": "'"${LOG_GROUP_NAME}"'",
"RoleARN": "'"${FIREHOSE_ROLE_ARN}"'",
"BufferingHints": {
"IntervalInSeconds": 60,
"SizeInMBs":