This article covers how to protect a website distributed through CloudFront, whether backed by S3 or EC2, using Amazon's Web Application Firewall (WAF) and Lambda. We'll detect unwanted traffic automatically based on request rate, then update our WAF configuration to block new requests from those clients for a period of time.
You'll need a CloudFront distribution to follow the steps discussed here. I have several articles on CloudFront available if you need help getting started.
As an observation, I think it's kind of sad that the tutorials posted in the documentation don't take you through the step-by-step process and instead just provide a CloudFormation template.
You can see the official AWS example here.
Changes to your CloudFront Distribution
We're going to need to enable CloudFront logging to log access requests.
- In the CloudFront console, select a distribution.
- On the general tab, click edit.
- Edit bucket logging, and select a bucket.
- Optionally, enable a log prefix (recommended).
- Optionally, enable cookie logging.
- Click "Yes, Edit".
After waiting a bit you should see log files start to appear in your S3 bucket.
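If you'd rather script the console steps above, here is a minimal sketch using boto3's CloudFront client. It assumes boto3 is installed and AWS credentials are configured; the helper names `with_logging` and `enable_logging` are made up for this example and are not part of any AWS library.

```python
import copy


def with_logging(distribution_config, bucket_name, prefix="", include_cookies=False):
    """Return a copy of a CloudFront DistributionConfig with access logging enabled."""
    updated = copy.deepcopy(distribution_config)
    updated["Logging"] = {
        "Enabled": True,
        "IncludeCookies": include_cookies,
        # CloudFront expects the bucket's domain name, not the bare bucket name.
        "Bucket": bucket_name + ".s3.amazonaws.com",
        "Prefix": prefix,
    }
    return updated


def enable_logging(distribution_id, bucket_name, prefix=""):
    """Read the current config, switch logging on, and write it back."""
    import boto3  # imported here so with_logging() works without boto3 installed

    cloudfront = boto3.client("cloudfront")
    current = cloudfront.get_distribution_config(Id=distribution_id)
    new_config = with_logging(current["DistributionConfig"], bucket_name, prefix)
    cloudfront.update_distribution(
        Id=distribution_id,
        IfMatch=current["ETag"],  # only overwrite the version we just read
        DistributionConfig=new_config,
    )
```

Splitting the dictionary change from the API call keeps the part you can check locally separate from the part that touches your account. The distribution ID and bucket name you pass in are your own.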
Create a WAF ACL
</p>
<ol>
<li> Open the WAF console. </li>
<li> If you've never used WAF before, click get started. Otherwise, click create. </li>
<li> If you want, review the WAF concepts blurb. I didn't find it helpful personally, but maybe you will! </li>
<li> Name your web ACL and the corresponding CloudWatch metric as desired. Note that you need to delete and remake the ACL if you want to change the name. </li>
<li> Click next. </li>
<li> Select the type of condition you want. In this case we're working with an "IP match conditions" condition. Click "create IP match condition". </li>
<li> Give your condition a name, and optionally add a selection of IP ranges. Click create when done. </li>
<li> Click next. </li>
<li> Click create rule. </li>
<li> Give the rule a name and CloudWatch metric name. </li>
<li> Add your previously created condition by setting the condition to "does", "originate from an IP address in", and your created condition. </li>
<li> Click create. </li>
<li> Set the default action to allow. </li>
<li> Click next. </li>
<li> Select your CloudFront distribution. </li>
<li> Click review and create. </li>
<li> Click confirm and create. </li>
</ol>
<p> Quick tangent: if you've been following my articles on hosting websites with S3 and CloudFront, this probably isn't for you. WAF probably costs more for one month than hosting this blog does for a year. </p>
<h3>Use Lambda to update the WAF ACL</h3>
<blockquote>
<p> Wait, what? I thought we were going to talk about rate limiting with WAF? I don't want to do the Lambda! </p>
<footer>Someone, somewhere.</footer>
</blockquote>
<p> I hear you, I really do. When I first tried this, the whole WAF console confused me. Then I realized that all the real functionality comes from chaining different AWS services together. WAF handles rules and conditions; it doesn't just "do the thing" on its own. Here's what this actually looks like: </p>
<ol>
<li> Requests hit CloudFront, which is protected by our WAF rules.
</li>
<li> CloudFront relays the request to our origin, and logs the requests to S3. </li>
<li> When S3 receives a log file, it triggers a Lambda function. </li>
<li> Lambda runs some logic on the log file, based on your specific wants and needs. </li>
<li> Based on your logic, Lambda updates the WAF rule by adding or removing IP blocks. </li>
</ol>
<p> We're going to walk through a custom example, step by step. However, you can (and probably should) check out the official AWS example <a href="https://github.com/awslabs/aws-waf-sample" target="_blank">here</a>. </p>
<p> The steps to configure our Lambda function are fairly simple: </p>
<ol>
<li> Open the Lambda console. </li>
<li> Click "Create a Lambda function". </li>
<li> Scroll past the examples, and click skip. </li>
<li> Provide a name and (optionally) a description for the function. </li>
<li> Select a runtime. In this example we're using Python (see the code below). </li>
<li> Ensure "edit code inline" is selected. </li>
<li> Add our completed Lambda code to the text box. </li>
<li> Ensure the handler is set to our handler function. </li>
<li> Set the IAM role as appropriate. </li>
<li> Leave the memory at 128 MB. </li>
<li> Leave the timeout at 0 min 3 secs. </li>
<li> We don't need to use a VPC for our example, but you could optionally set one. </li>
<li> Click next. </li>
<li> Review your function and click "Create function". </li>
<li> On the Lambda functions screen, find the function we just created. </li>
<li> Click the function name to open it for editing. </li>
<li> Click on the event sources tab. </li>
<li> Click "Add event source". Note, you could also do the next steps by configuring the S3 bucket from earlier. </li>
<li> Select the event source type, and set it to S3. </li>
<li> Select the bucket to monitor for events. In this case it should be the CloudFront logs bucket we used earlier. </li>
<li> For the event type, select "Object Created (All)".
</li>
<li> If desired, set an S3 path prefix for your bucket objects. This is useful to limit the number of objects that can trigger an event. E.g. "/websitename/". </li>
<li> If desired, set an S3 path suffix for your bucket objects. This is useful to limit the number of objects that can trigger an event. E.g. ".gz". </li>
<li> Click "Submit". </li>
</ol>
<p> And with that, our function will process the logs and automatically update our firewall rules. </p>
<p> Here's an example snippet for a Lambda-based parser: </p>
<p> <code>waf-parser.py</code>: </p>
<pre>import json

# get_logfile, get_waf, create_ip_list, update_waf and the WAF_* constants
# are not shown here; see the full function linked below.

def lambda_handler(event, context):
    # Print debugging info.
    print("[lambda_handler] :: Initializing lambda function.")
    print("[lambda_handler] :: Received event: " + json.dumps(event))

    # S3 delivers notifications as a list of records; use the first one.
    record = event['Records'][0]

    # Get the S3 info needed for the log file.
    s3_bucket_name = record['s3']['bucket']['name']
    s3_object_name = record['s3']['object']['key']

    # Get and decode the log file
    logs = get_logfile(s3_bucket_name, s3_object_name)

    # Get the WAF details
    waf_acl_details, waf_rule_details, waf_condition_details, waf_ips = get_waf(WAF_ACL_NAME, WAF_RULE_NAME, WAF_CONDITION_NAME)

    # Create the ip lists needed to update the WAF
    block_list, remove_list = create_ip_list(logs, waf_ips)

    # Update WAF rules
    waf_status = update_waf(waf_condition_details, block_list, remove_list)
    print("[lambda_handler] :: Exiting lambda function now.")</pre>
<p> You can view the full Lambda function <a href="https://github.com/666jfox777/aws-waf-rate-limiting-basic" target="_blank">here</a>. </p>
<p> The Lambda function is where most of the benefit comes from: it can run whatever custom logic you like against the log files CloudFront delivers. </p>
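<p> To make the "custom logic" part a little more concrete, here's a rough sketch of what the rate-counting step could look like. This is not the code from the repository above: the name <code>create_ip_list</code> is borrowed from the snippet, while <code>parse_client_ips</code>, <code>to_waf_updates</code> and the <code>REQUEST_LIMIT</code> threshold are made up for illustration. It assumes gzipped, tab-separated CloudFront access logs with the client IP (<code>c-ip</code>) in the fifth column, IPv4 clients only, and the <code>Updates</code> format used by the WAF <code>update_ip_set</code> API. </p>

```python
import gzip
from collections import Counter

REQUEST_LIMIT = 100  # hypothetical threshold: requests per log file before blocking


def parse_client_ips(raw_bytes):
    """Yield the c-ip field from each request line of a gzipped CloudFront log."""
    text = gzip.decompress(raw_bytes).decode("utf-8")
    for line in text.splitlines():
        # Skip blank lines and the "#Version" / "#Fields" header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) > 4:
            yield fields[4]  # c-ip is the fifth tab-separated field


def create_ip_list(client_ips, blocked_ips, limit=REQUEST_LIMIT):
    """Return (block_list, remove_list) given this log's IPs and the current blocks."""
    counts = Counter(client_ips)
    offenders = {ip for ip, hits in counts.items() if hits > limit}
    blocked = set(blocked_ips)
    block_list = sorted(offenders - blocked)   # newly noisy, not yet blocked
    remove_list = sorted(blocked - offenders)  # blocked, but quiet in this log
    return block_list, remove_list


def to_waf_updates(block_list, remove_list):
    """Build the Updates list for an update_ip_set call (IPv4 /32 entries)."""
    updates = []
    for action, ips in (("INSERT", block_list), ("DELETE", remove_list)):
        for ip in ips:
            updates.append({
                "Action": action,
                "IPSetDescriptor": {"Type": "IPV4", "Value": ip + "/32"},
            })
    return updates
```

<p> Unblocking any address that stays under the limit in the latest log file is a deliberate simplification. If you want blocks to last for a fixed duration, you would need to store a timestamp for each blocked address somewhere and compare against it instead. </p>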