Protecting your website from automated attacks with WAF

This article explains how to protect a CloudFront-distributed website, whether backed by S3 or EC2, with Amazon's Web Application Firewall (WAF) and Lambda. We'll automatically detect unwanted traffic based on request rate, then update our WAF configuration to block new requests from those clients for a period of time.

You'll need a CloudFront distribution to follow the steps discussed here. I have several articles on CloudFront available if you need help getting started.

As an observation, I think it's kind of sad that the tutorials posted in the documentation don't take you through the process step by step and instead just provide a CloudFormation template.

You can see the official AWS example here.

Changes to your CloudFront Distribution

We're going to need to enable CloudFront logging to log access requests.

  1. In the CloudFront console, select a distribution.
  2. On the general tab, click edit.
  3. Edit bucket logging, and select a bucket.
  4. Optionally, enable a log prefix (recommended).
  5. Optionally, enable cookie logging.
  6. Click "Yes, Edit".

After waiting a bit you should see log files start to appear in your S3 bucket.
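
To get a feel for what lands in the bucket, here is a minimal sketch of reading one of those files. It assumes the CloudFront standard log format (gzipped, tab-separated, with a "#Fields:" header line naming the columns). The function name parse_cloudfront_log and the sample log are made up for illustration, and real logs carry many more columns than the handful shown.

```python
import gzip


def parse_cloudfront_log(gz_bytes):
    """Yield one dict per request from a gzipped CloudFront standard log."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    fields = []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            # The header names the tab-separated columns that follow.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            yield dict(zip(fields, line.split("\t")))


# Synthetic two-request log, trimmed to a few columns (an assumption for brevity).
sample = (
    "#Version: 1.0\n"
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status\n"
    "2016-01-01\t00:00:01\t192.0.2.10\tGET\t/index.html\t200\n"
    "2016-01-01\t00:00:02\t192.0.2.10\tGET\t/about.html\t200\n"
)
rows = list(parse_cloudfront_log(gzip.compress(sample.encode("utf-8"))))
print(rows[0]["c-ip"], len(rows))  # prints: 192.0.2.10 2
```

Reading the column names from the header, rather than hard-coding positions, keeps the sketch working if the set of logged fields differs from what you expect.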

Create a WAF ACL

</p>
<ol>
    <li>
        Open the WAF console.
    </li>
    <li>
        If you've never used WAF before, click get started. Otherwise, click
        create.
    </li>
    <li>
        If you want, review the WAF concepts blurb. I didn't find it helpful
        personally, but maybe you will!
    </li>
    <li>
        Name your web ACL and the corresponding CloudWatch metric as
        desired. Note that you need to delete and remake the ACL if you want
        to change the name.
    </li>
    <li>
        Click next.
    </li>
    <li>
        Select the type of condition you want. In this case we're working
        with an "IP match" condition, so click "create IP match
        condition".
    </li>
    <li>
        Give your condition a name, and optionally add a selection of IP
        ranges. Click create when done.
    </li>
    <li>
        Click next.
    </li>
    <li>
        Click create rule.
    </li>
    <li>
        Give the rule a name and CloudWatch metric name.
    </li>
    <li>
        Add your previously created condition by setting the rule to
        "does", "originate from an IP address in", and then selecting your
        condition.
    </li>
    <li>
        Click create.
    </li>
    <li>
        Set the default action to allow.
    </li>
    <li>
        Click next.
    </li>
    <li>
        Select your CloudFront distribution.
    </li>
    <li>
        Click review and create.
    </li>
    <li>
        Click confirm and create.
    </li>
</ol>
<p>
    Quick side tangent here: if you've been following my articles on hosting
    websites with S3 and CloudFront, this probably isn't for you. WAF probably
    costs more for one month than hosting this blog does for a year.
</p>
<h3>Use Lambda to update the WAF ACL</h3>
<blockquote>
    <p>
        Wait, what? I thought we were going to talk about rate limiting with
        WAF? I don't want to do the Lambda!
    </p>
    <footer>Someone, somewhere.</footer>
</blockquote>
<p>
    I hear you, I really do. When I first tried this, the whole WAF console
    confused me. Then I realized all the real functionality comes from
    chaining different AWS services together. WAF handles rules and
    conditions; it doesn't just "do the thing" on its own. So here's what
    this actually looks like:
</p>
<ol>
    <li>
        Requests hit CloudFront, which is protected by our WAF rules.
    </li>
    <li>
        CloudFront relays the request to our origin, and logs the requests
        to S3.
    </li>
    <li>
        When S3 receives a log file, it triggers a Lambda function.
    </li>
    <li>
        Lambda runs some logic on the log file, based on your specific wants
        and needs.
    </li>
    <li>
        Based on your logic, Lambda updates the WAF rule by adding or
        removing IP blocks.
    </li>
</ol>
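<p>
    The "logic" in steps 4 and 5 can be as simple as counting requests per
    client IP. The following is a rough sketch of that idea, not the official
    AWS sample: the function name <code>plan_waf_changes</code>, the
    <code>max_requests</code> threshold, and the rule "unblock anyone who is
    no longer over the limit in the latest log file" are all assumptions
    chosen for illustration.
</p>

```python
from collections import Counter


def plan_waf_changes(request_ips, blocked_ips, max_requests=100):
    """Return (to_block, to_unblock) sets for one log file's worth of requests.

    request_ips: iterable of client IPs, one entry per logged request.
    blocked_ips: IPs currently listed in the WAF IP match condition.
    """
    counts = Counter(request_ips)
    # Anyone over the threshold in this log file counts as an offender.
    offenders = {ip for ip, n in counts.items() if n > max_requests}
    to_block = offenders - set(blocked_ips)
    # Simplifying assumption: unblock IPs that are no longer offenders.
    to_unblock = set(blocked_ips) - offenders
    return to_block, to_unblock


ips = ["198.51.100.7"] * 5 + ["192.0.2.10"] * 2
to_block, to_unblock = plan_waf_changes(ips, {"203.0.113.9"}, max_requests=3)
print(sorted(to_block), sorted(to_unblock))
# prints: ['198.51.100.7'] ['203.0.113.9']
```

<p>
    A real deployment would likely track when each IP was blocked so the
    block lasts for a fixed duration, rather than relying on a single log
    file as this sketch does.
</p>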
<p>
    We're going to walk through a custom example, step by step. However, you
    can (and probably should) check out the official AWS example
    <a href="https://github.com/awslabs/aws-waf-sample" target="_blank">here</a>.
</p>
<p>
    The steps to configure our Lambda function are fairly simple:
</p>
<ol>
    <li>
        Open the Lambda console.
    </li>
    <li>
        Click "Create a Lambda function".
    </li>
    <li>
        Scroll past the examples, and click skip.
    </li>
    <li>
        Provide a name and (optionally) a description for the function.
    </li>
    <li>
        Select a runtime. In this example we're using Python (see the snippet below).
    </li>
    <li>
        Ensure "edit code inline" is selected.
    </li>
    <li>
        Add our completed Lambda code to the text box.
    </li>
    <li>
        Ensure the handler is set to our handler function.
    </li>
    <li>
        Set the IAM role as appropriate.
    </li>
    <li>
        Leave the memory at 128 MB.
    </li>
    <li>
        Leave the timeout at 0 min 3 secs.
    </li>
    <li>
        We don't need to use a VPC for our example, but you could optionally
        set one.
    </li>
    <li>
        Click next.
    </li>
    <li>
        Review your function and click "Create function".
    </li>
    <li>
        On the Lambda functions screen, find the function we just created.
    </li>
    <li>
        Click the function name to open it for editing.
    </li>
    <li>
        Click on the event sources tab.
    </li>
    <li>
        Click "Add event source". Note, you could also do the next steps by
        configuring the S3 bucket from earlier.
    </li>
    <li>
        Select the event source type, and set it to S3.
    </li>
    <li>
        Select the bucket to monitor for events. In this case it should be
        the CloudFront logs bucket we used earlier.
    </li>
    <li>
        For the event type, select "Object Created (All)".
    </li>
    <li>
        If desired, set an S3 path prefix for your bucket objects. This is
        useful to limit the number of objects that can trigger an event.
        E.g. "websitename/" (S3 keys normally have no leading slash).
    </li>
    <li>
        If desired, set an S3 path suffix for your bucket objects. This is
        useful to limit the number of objects that can trigger an event.
        E.g. ".gz".
    </li>
    <li>
        Click "Submit".
    </li>
</ol>
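<p>
    Once the event source is in place, each new log file invokes the function
    with an S3 notification event. Here is a small sketch of pulling the
    bucket and key out of that event. The helper name
    <code>s3_objects_from_event</code>, the bucket name, and the object key
    are hypothetical. The sample event is trimmed to the fields used here.
</p>

```python
from urllib.parse import unquote_plus


def s3_objects_from_event(event):
    """Return (bucket, key) pairs for every record in an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


# Trimmed, made-up event for illustration; real events carry more fields.
sample_event = {
    "Records": [{
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "example-cloudfront-logs"},
            "object": {"key": "websitename/E2EXAMPLE.2016-01-01-00.abcd1234.gz"},
        },
    }]
}
print(s3_objects_from_event(sample_event))
```

<p>
    Looping over every record, rather than reading only the first, is a
    defensive choice in case a single invocation carries more than one
    object.
</p>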
<p>
    And with that our function will process the logs and automatically
    update our firewall rules.
</p>
<p>
    Here's an example snippet for a Lambda-based parser:
</p>
<p>
    <code>waf-parser.py</code>:
</p>
<pre>import json

# WAF_ACL_NAME, WAF_RULE_NAME, WAF_CONDITION_NAME and the helpers
# (get_logfile, get_waf, create_ip_list, update_waf) are assumed to be
# defined elsewhere in the full function linked below.


def lambda_handler(event, context):
    # Print debugging info.
    print("[lambda_handler] :: Initializing lambda function.")
    print("[lambda_handler] :: Received event: " + json.dumps(event))

    # Get the S3 info needed for the log file.
    record = event['Records'][0]
    s3_bucket_name = record['s3']['bucket']['name']
    s3_object_name = record['s3']['object']['key']

    # Get and decode the log file.
    logs = get_logfile(s3_bucket_name, s3_object_name)

    # Get the WAF details.
    waf_acl_details, waf_rule_details, waf_condition_details, waf_ips = get_waf(
        WAF_ACL_NAME, WAF_RULE_NAME, WAF_CONDITION_NAME)

    # Create the IP lists needed to update the WAF.
    block_list, remove_list = create_ip_list(logs, waf_ips)

    # Update the WAF rules.
    waf_status = update_waf(waf_condition_details, block_list, remove_list)
    print("[lambda_handler] :: Exiting lambda function now.")</pre>
<p>
    You can view the full Lambda function
    <a href="https://github.com/666jfox777/aws-waf-rate-limiting-basic" target="_blank">here</a>.
</p>
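<p>
    For a sense of what a step like <code>update_waf</code> has to do, here
    is a hedged sketch that is not taken from the linked repository. It
    assumes the classic WAF API as exposed by <code>boto3.client('waf')</code>,
    where <code>get_change_token</code> and <code>update_ip_set</code> are the
    relevant operations. The function names, the <code>ip_set_id</code>
    parameter, and the choice to block single IPv4 addresses as
    <code>/32</code> ranges are illustrative assumptions.
</p>

```python
def build_ip_set_updates(block_list, remove_list):
    """Translate IPv4 addresses into the Updates list that update_ip_set takes."""
    updates = []
    for action, ips in (("INSERT", block_list), ("DELETE", remove_list)):
        for ip in sorted(ips):
            updates.append({
                "Action": action,
                # Assumption: each address is blocked as a single-host /32 range.
                "IPSetDescriptor": {"Type": "IPV4", "Value": ip + "/32"},
            })
    return updates


def apply_ip_set_updates(waf_client, ip_set_id, block_list, remove_list):
    """Send the changes to WAF; waf_client is expected to be boto3.client('waf')."""
    updates = build_ip_set_updates(block_list, remove_list)
    if not updates:
        return None  # Nothing to change, so skip the API call.
    # Every classic WAF change needs a fresh change token.
    token = waf_client.get_change_token()["ChangeToken"]
    return waf_client.update_ip_set(
        IPSetId=ip_set_id, ChangeToken=token, Updates=updates)


print(build_ip_set_updates(["198.51.100.7"], ["203.0.113.9"]))
```

<p>
    Passing the client in as an argument keeps the update logic easy to
    exercise with a stand-in object before pointing it at a real web ACL.
</p>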
<p>
    The Lambda function is where most of the benefit comes from: it can run
    whatever custom logic you like against the log files CloudFront provides.
</p>