Securing Your Development Pipeline with AWS Managed Services

Continuous Integration versus Continuous Delivery

So to start off this discussion around securing your development pipeline, let's go over the common terms and concepts. And despite the title, we're actually going to talk about four terms instead of two.

  • Continuous Integration:
    Let's summarize this one simply - the term basically means that each developer's code is synchronized with a shared mainline on commit, several times a day. As part of this integration, various types of tests are usually run. A failure to pass all the tests results in the developer's commit(s) being rejected from the mainline until the developer corrects the issues.
  • Continuous Delivery:
    This is rather opinionated, but to me continuous delivery is ensuring that any successful integration is able to satisfy the "customer" through early and automated verification. In some schools of thought, this is a major component of DevOps.
  • Continuous Deployment:
    Which CD are folks referring to when talking about CI/CD? From my view, continuous deployment is simply an extension of continuous delivery - the ability to continuously move the result of a development process to a production-like environment where functional testing can be executed at full scale. Note that a production-like environment does not necessarily include production.
  • Continuous Release:
    I'm going to harp on this one a bit - generally, continuous release is not something organizations or individuals actually want. A release or code drop is often strategic, and there is no reason to assume that continuously releasing every candidate of your software is a good plan. Continuous deployment of release candidates to production-like environments is usually preferred. There is a reason no one really talks about CI/CD/CR; they talk about CI/CD. That said, when continuous release does make sense it's really cool: you get feedback on new features sooner and defects get resolved more quickly.

When developing, size matters. If a developer is working on small tasks that are committed rapidly and delivered into a state where they're ready to deploy, then you're practicing continuous integration. If the tasks are larger and there are several days between commits, then you're really practicing frequent integration. The exact distinction doesn't matter much - whether you're integrating continuously, frequently, or even sporadically, what matters is that your development pipeline supports the work you're doing and that you can deploy when needed.

The general idea that I'm getting at is that integrating a small change is a small amount of work, but integrating a large change is an astronomically larger amount of work. So if integration is done in constant small steps, developers can spend more time working on business-visible features instead of development process overhead.

I know this is probably a bit on the dry side for most of my readers - don't worry, technical content is coming! I just needed to ensure everyone is clear on what a CI/CD pipeline is, so that when we're talking about the technical content it's easy to understand where the focus is and why.

For reference, here's a snapshot of how AWS defines CI/CD:

  • Continuous delivery is a software development methodology where the release process is automated. Every software change is automatically built, tested, and deployed to production. Before the final push to production, a person, an automated test, or a business rule decides when the final push should occur. Although every successful software change can be immediately released to production with continuous delivery, not all changes need to be released right away.
  • Continuous integration is a software development practice where members of a team use a version control system and frequently integrate their work to the same location, such as a master branch. Each change is built and verified to detect integration errors as quickly as possible. Continuous integration is focused on automatically building and testing code, as compared to continuous delivery, which automates the entire software release process up to production.

See it in its original location, here. The AWS definition of continuous delivery folds continuous deployment and continuous release into one concept for simplicity.

Jenkins, let's automate all the things!

With the boring conceptual components out of the way, let's focus on everyone's favourite automation tool - Jenkins. From my viewpoint, Jenkins is an essential tool for almost any environment. I personally use it to automate all sorts of tasks, ranging from simple (report generation, crons, syncing) to complex (full pipelines, image building, infrastructure management). Jenkins was my CI/CD tool.

All of this meant that Jenkins had a fair amount of access to my environments. And with great power comes great responsibility, blah blah blah. But seriously, what does Jenkins care about the security of the environments I gave it access to? Not much really. Sure, Jenkins has some protections, but ultimately it could do anything to my accounts because I gave it that access.

And what did I do that was really dumb to drive this home? Well, I automated my Jenkins with CloudFormation and Salt. In the automation scripts I had created, I made a default user for the initial set-up, which I promptly forgot existed.

I'm sure you can see where this is going.

So a bit about my Jenkins set-up:

  • Jenkins' auto scaling group, launch configuration, etc, are controlled via CloudFormation.
  • Jenkins' userdata caused it to automatically bootstrap using Salt formulas, in a master-less configuration.
  • Jenkins has an IAM role that grants it access to do pretty much anything on the account it's hosted in. It also has access, via cross-account roles, to other accounts that I run or have access to.
  • Jenkins runs behind an internal-scheme Elastic Load Balancer.
  • Access to Jenkins is controlled via a VPN service.
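
For anyone unfamiliar with masterless Salt, the userdata bootstrap boils down to something like the following sketch - the repository URL and paths here are placeholders, not my actual formulas:

#!/bin/bash
# Hypothetical userdata: install Salt, fetch the formulas, apply them locally.
yum install -y git salt-minion

# Pull the formulas onto the instance (placeholder repository URL).
git clone https://example.com/salt-formulas.git /srv/salt

# Apply the highstate without ever talking to a Salt master.
salt-call --local state.highstate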

Eventually, in an attempt to cut costs on my account by about $50 per month, I decided to change from an internal-scheme Elastic Load Balancer to an internet-facing Application Load Balancer. I then shut down the VPN service and relied on a Web Application Firewall to handle attacks. You can see a version of my WAF tool here.

This worked well - I could see IP addresses getting blocked, and I could now easily use Jenkins from my phone without using my VPN (which required three factors of authentication). This made things a lot more convenient for me in several ways. It ran great for about six or seven weeks, until...

I got paged by my CloudWatch Alarms for bizarre API activity on one of my AWS accounts.

Honestly, this was a pretty lucky break. Without that page, my account would probably have been really badly abused.

Remember that default user I mentioned? Its username was "dev" with the password "test". Someone had presumably found the box, been blocked by my WAF rules, then just moved to a new IP address and logged in. By the time I was paged they had already added several tools to the instance, tried to launch instances, tried to delete all my S3 data, etc.

***slow clap***

All in all, the attacker had about 30 minutes of access according to the logs I have of the event. As soon as I logged into my Jenkins server and saw that it had several more jobs than when I last looked at it, I firewalled the instance, took an image of the EBS volume, and then deleted the CloudFormation stack until I had time to review what had actually occurred.

Luckily for me, Jenkins didn't actually have access to do most of what the attacker wanted to do. Sure, they did try to delete all the data in my S3 buckets, but it was versioned. And in order to delete versions they would have needed the root token for my account.

If you haven't seen it yet, I recommend you take a look at my article, Advanced Auditing with AWS Config. Because my account complies with the CIS benchmarks, I ended up getting a fairly quick warning about the attacker.

Because of the attack, Jenkins now lives safely behind my VPN service again. But this led me to give a lot more thought to what I was doing with Jenkins, and how I could minimize the risks I was exposing myself to. That led me to investigate the AWS developer tools, which I needed to do anyway as part of the AWS Certified DevOps Engineer - Professional certification, so two birds, um, like a lot of stones? AWS has a lot of developer tools, as it turns out.

Using AWS Technologies to support your overworked butler

Can you use AWS tools to make your life even easier? More secure? From my point of view, the answer is a resounding yes! But as I've recently highlighted, AWS provides a number of tools.

In the rest of this post, rather than just talk about how I screwed up with my Jenkins usage, we're going to talk about the following five AWS technologies and how they can help make our development pipelines more secure:

  • AWS Lambda
  • AWS CodeCommit
  • AWS CodeBuild
  • AWS CodeDeploy
  • AWS CodePipeline

In addition to the services above, there are also CloudFormation, Elastic Beanstalk, and OpsWorks. But now that I've listed a bunch of services, you probably want to know what to do with them. Well, here's a breakdown of each service and a few of the benefits it brings to developers, operations, and Jenkins:

Lambda - Run Node.js or Python code without having to manage a server or container.

Developers:

  • Run code / commands without servers.

Operations:

  • Run code in a restricted environment.
  • Isolate command privileges.

Jenkins:

  • Offload executions to Lambda.
  • Smaller permissions footprint.

CodeCommit - An alternative to hosted repositories that you can access easily from an IAM role.

Developers:

  • Git interface is generally familiar.

Operations:

  • No need to hard-code credentials into services.
  • No need to host repositories or use external services.

Jenkins:

  • Can access repositories over HTTPS using its IAM role.

CodeBuild - Take your source code and build it on your Docker images from ECR.

Developers:

  • Trigger code builds by submitting code.
  • Get alerts if building or testing fails.

Operations:

  • No servers to manage.
  • Can provide custom Docker images.

Jenkins:

  • Nothing to do here, unless as a trigger!

CodeDeploy - Deploy software artifacts from S3 to your EC2 instances. In-place or Blue/Green supported!

Developers:

  • Can monitor deploy success / failure rates.

Operations:

  • Does not need to manage the deploy.
  • Concerned with deploy results, and gating releases.

Jenkins:

  • Can use CodeDeploy to trigger updates to instances and poll for results.

CodePipeline - The glue of the development pipeline: end-to-end management of your continuous deployment solution.

Developers:

  • Can monitor the progress of their builds.

Operations:

  • Can gate builds between environments / stages.
  • Can approve releases.

Jenkins:

  • Can kick off the pipeline, and monitor / poll it for success.
  • Can be called by the pipeline at various stages.
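
To make the Lambda offload concrete: instead of a Jenkins job running a task on its own executor with broad credentials, the job can simply invoke a function and read back the result. A minimal sketch, assuming a function named example-task already exists:

# Invoke a (hypothetical) function from a Jenkins job instead of running
# the task on the Jenkins host itself.
aws lambda invoke \
  --function-name example-task \
  --payload '{"job": "nightly-report"}' \
  response.json

# Inspect the function's return value.
# (With version 2 of the AWS CLI, also pass --cli-binary-format raw-in-base64-out
# so the JSON payload above isn't treated as base64.)
cat response.json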

This is cool and all, but how does this help my project or software? Well, I think that's best explained with a small example project that ties the different components together.

An AWS Example Project

So a small example project. I'm going to work with the following requirements:

  • The project will consist of a RPM package that contains a small website.
  • We need to store the website code somewhere that developers can access.
  • We need to build the RPM package and store the artifact somewhere.
  • We need to deploy the website to a small EC2 fleet that we're running.
  • We need to be able to do A/B or Blue/Green deploys.

In case you're wondering why an RPM: packages are versioned, easy to distribute, and install the same way on every host, so an RPM has a lot of utility here.

Setting up CodeCommit

So, first part, let's create our CodeCommit repository and project structure:

  1. In the AWS Console, access the CodeCommit service.
  2. If this is your first time, you might be presented with a "Get Started" button.
  3. Click "Create repository".
  4. Give your repository a descriptive name. In my case, I'm using justinfox-article-cicd-example.
  5. Optionally, give your repository a description. My description for this repo is just a link to this article, but a memorable description can help you in the future.
  6. Click "Create repository".
  7. Clone the repository to your local machine. Note: I'm not going to review how to get started with CodeCommit authentication here. E.g.: git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/justinfox-article-cicd-example
    Cloning into 'justinfox-article-cicd-example'...
    warning: You appear to have cloned an empty repository.
  8. Next, we need to create our RPM spec. Since our files and code are fairly simple, our spec will reflect that:

    Name:           example-website
    Summary:        Example Website
    Group:          Example/Website
    Version:        1.0.0
    Release:        1
    License:        GPLv3+

    %description
    The example-website package installs an example website.

    %build
    mkdir -p %{buildroot}/etc/nginx/conf.d/
    mkdir -p %{buildroot}/var/www/example/
    cp -rf %{source_repo}/webroot/* %{buildroot}/var/www/example/
    cp -rf %{source_repo}/configs/etc/nginx/conf.d/* %{buildroot}/etc/nginx/conf.d/

    %clean
    rm -rf %{buildroot}

    %files
    %config(noreplace) /etc/nginx/conf.d/example.conf
    /var/www/example/

  9. Next we just need to add the appropriate folder structure and files. The exact content doesn’t really matter at this point.

I’ve made the code available via GitHub in case anyone wants to review!
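
If you prefer the command line to console clicking, the same set-up can be scripted. A rough equivalent of the steps above - assuming your IAM user or role already has CodeCommit access, and with a placeholder description string:

# Create the repository used throughout this walkthrough.
aws codecommit create-repository \
  --repository-name justinfox-article-cicd-example \
  --repository-description "Example repository for this article"

# Let git authenticate over HTTPS with your IAM credentials (or instance role).
git config --global credential.helper '!aws codecommit credential-helper $@'
git config --global credential.UseHttpPath true

# Clone it, then add the spec file, webroot/ and configs/ directories, and push.
git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/justinfox-article-cicd-example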

Setting up CodeBuild

With our source code repository set up and ready to go, let’s start building our RPM.

  1. In the AWS Console, access the CodeBuild service.
  2. If this is your first time, you might be presented with a "Get Started" button.
  3. Click “Create project”.
  4. Give your project a name. In my case I’m going to reuse justinfox-article-cicd-example.
  5. Optionally, give your project a description. My description for this project is just a link to this article, but a memorable description can help you in the future.
  6. Choose your source provider. AWS S3, AWS CodeCommit, and GitHub are supported at the time of this writing. I’m only going to cover AWS CodeCommit here.
  7. Choose the repository you created in the previous step.
  8. For ease of use for the purposes of this article, I’m just going to use the image provided by CodeBuild.
  9. Choose Ubuntu as it’s the only supported option at this time.
  10. Choose the Base runtime.
  11. Choose the version provided (14.04 right now).
  12. For the build specification, we’ll provide one with our repository.
  13. For the artifact type, choose Amazon S3 (since we’ll be tossing it there).
  14. For the artifact name, I used justinfox-article-cicd-example.
  15. For the (S3) bucket name, I re-used a CodeBuild bucket I already had.
  16. Allow CodeBuild to create and manage the IAM role for us.
  17. Expand advanced settings.
  18. Set the timeout to whatever is reasonable. It defaults to 1 hour.
  19. If desired, set the KMS key.
  20. For packaging, leave as None.
  21. Leave the compute type as the default for now.
  22. Add environment variables if you need them.
  23. Click create.

If you want you can queue up a build now, but it’s not going to do much yet. First we’ll need a buildspec.yml. Here’s the one from our repository:

version: 0.2

phases:
  install:
    commands:
      - apt-get update -y
      - apt-get install rpm -y
  build:
    commands:
      - ./rpm.sh
      - cat build.log
  post_build:
    commands:
      - mkdir -p RPMS/x86_64/
      - cp /root/rpmbuild/RPMS/x86_64/example-website*.x86_64.rpm ./RPMS/x86_64/
artifacts:
  files:
    - '**/*'

This spec file is rather straightforward. We have three phases where the builder runs various commands, and for our artifact we're just grabbing all the generated files.
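
The buildspec leans on an rpm.sh script from the repository to do the actual packaging. The real script is in the GitHub repo; a minimal sketch of what it needs to do might look like this, assuming the spec file sits at the repository root and uses the %{source_repo} macro shown earlier:

#!/bin/bash
# Hypothetical rpm.sh - build the package and capture output for the
# 'cat build.log' step in buildspec.yml.
set -e

# rpmbuild defaults to ~/rpmbuild as its working tree, which is why the
# post_build phase copies from /root/rpmbuild/RPMS/x86_64/.
rpmbuild -bb \
  --define "source_repo $(pwd)" \
  example-website.spec > build.log 2>&1

With the buildspec committed, you can kick off a build manually with aws codebuild start-build --project-name justinfox-article-cicd-example, or just let CodePipeline trigger it later.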

Okay, let's move on to the deployment!

Setting up CodeDeploy

We now have our RPM, so now we need a mechanism to deploy it to EC2. Enter CodeDeploy! We’re going to use the sample template, but it can also be customized or you can install the agent onto your hosts. The steps:

  1. In the AWS Console, access the CodeDeploy service.
  2. If this is your first time, you might be presented with a "Get Started" button.
  3. Click “Sample deployment wizard”. This will do all the heavy lifting for us through use of CloudFormation.
  4. Choose “Sample deployment”.
  5. You can choose between in-place and blue/green deployment strategies next. For our purposes here, choose blue/green deployment.
  6. Provide an application name, I used Example.
  7. Provide a deployment group name, I used ExampleFleet.
  8. Provide an auto scaling group name, I used ExampleAutoScalingGroup.
  9. Provide a load balancer name, I used ExampleLoadBalancer.
  10. Provide a service role name, I used ExampleCodeDeployServiceRole.
  11. Choose a key pair name, I used an existing key pair on my account.
  12. Click “launch environment” when you’re ready. Note, it’s launching with some sample code. In a real exercise you’d want to run your own application. We’re just following this to provide a demonstration of the feature set.
  13. Once the environment is created, click start blue/green deploy.
  14. If you don't wait long enough you'll likely get an error - it's okay, it's just the GUI being impatient.
  15. Once everything is done, you’ll probably be on the deployments page.

I’m not going to outline all the steps to fine tune the deployment, for the moment it will meet our needs for performing a blue/green deployment. Elastic Beanstalk could also be used in a similar manner - maybe I’ll walkthrough that in another post!

In order to actually get your code deployed, you’ll need another file: appspec.yml

version: 0.0
os: linux
hooks:
  AfterInstall:
    - location: deploy.sh

It's fairly simple, and just tells CodeDeploy to run our deploy script once the revision has been installed onto the instance. (Hooks have to hang off one of CodeDeploy's lifecycle events, such as AfterInstall - the Install event itself is reserved for the agent.)
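
The deploy.sh script itself is nothing CodeDeploy dictates - it's whatever your application needs. A hypothetical version for our RPM-based website, assuming the RPMS/x86_64/ directory from the build is part of the revision bundle and nginx is already installed on the hosts:

#!/bin/bash
# Hypothetical deploy.sh for the AfterInstall hook above.
set -e

# Hook scripts aren't guaranteed a working directory, so resolve paths
# relative to this script inside the unpacked revision.
cd "$(dirname "$0")"

# Install (or upgrade to) the freshly built package.
yum install -y RPMS/x86_64/example-website-*.x86_64.rpm

# Pick up the nginx config and webroot the package just dropped in place.
service nginx restart

Once a revision bundle is sitting in S3, you can also start a deployment from the CLI with aws deploy create-deployment --application-name Example --deployment-group-name ExampleFleet --s3-location bucket=<bucket>,key=<key>,bundleType=zip, where the bucket and key point at your artifact.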

Glue it all together with CodePipeline

We've committed code to CodeCommit, built it with CodeBuild, and performed a deployment with CodeDeploy. So where does CodePipeline fit in? CodePipeline is the service that ties these together - it tracks the progress of the pipeline and triggers the various stages.

So let's get this set up as our CI/CD alternative to Jenkins:

  1. In the AWS Console, access the CodePipeline service.
  2. If this is your first time, you might be presented with a "Get Started" button.
  3. Click “Create pipeline”.
  4. Choose a name for your pipeline. I used ExampleWebsite.
  5. For source provider, choose AWS CodeCommit.
  6. Select your repository from earlier.
  7. Choose the triggering branch (master).
  8. For build provider, choose AWS CodeBuild.
  9. Since we already built our CodeBuild project earlier, select it from the drop down menu.
  10. For the deployment provider, choose CodeDeploy. Note your other options though!
  11. Choose the application name that we set up earlier.
  12. Choose the deployment group that we set up earlier.
  13. Create a role, or use an existing IAM role.
  14. Review your settings. If you’re happy, click create pipeline.

If you want, you can edit your new pipeline and add in additional steps, stages, or required approvals.

Now let's trigger a build by editing a file in our source code, which will kick off our CI/CD pipeline. You can literally make any change, as long as it's on the triggering repository and branch.
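
You can also start the pipeline and poke at its status from the CLI - handy for testing without pushing a throwaway commit:

# Manually kick off the pipeline (normally the CodeCommit change does this for us).
aws codepipeline start-pipeline-execution --name ExampleWebsite

# Poll the state of each stage and action while it runs.
aws codepipeline get-pipeline-state --name ExampleWebsite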

Here’s a screenshot of what our pipeline looks like:

If we wanted to, we could add in a manual approval step. It’s as easy as clicking edit, adding a stage, adding an action, selecting approval, and saving.

An alternative with Elastic Beanstalk?

With the CodeDeploy approach, you maintain several EC2 instances with the CodeDeploy agent on them. Then the agent can trigger an in-place upgrade or the service can trigger a blue / green deploy by copying the auto scaling group. Depending on what you are deploying, it might make more sense to make use of Elastic Beanstalk.

Elastic Beanstalk takes away some of the server management tasks and has AWS handle them for you. Sort of like having a micro operations team in a box, or maybe a support team? With Elastic Beanstalk you get a logging dashboard, service health page, monitoring, and alerts. Elastic Beanstalk also has managed updates, so it's easy to roll back if your initial testing was not sufficient.

Yet another alternative with CloudFormation?!

AWS Lambda has really exploded in popularity for its ability to run code without the need to manage servers. However, it's kind of pointless if you still need to manage a build server or other servers as the glue for your Lambda function.

It turns out that you can keep your Lambda code in CodeCommit, use CodeBuild to run any required commands and create a zip file, and CloudFormation to deploy the Lambda code and any updates. All glued together with CodePipeline.
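
One way to see the moving parts is with the CloudFormation CLI, which can package local Lambda code and deploy the stack - the template, bucket, and stack names below are placeholders:

# Upload the local Lambda code referenced by the template to S3, and write out
# a copy of the template that points at the uploaded artifact.
aws cloudformation package \
  --template-file template.yml \
  --s3-bucket my-artifact-bucket \
  --output-template-file packaged.yml

# Create or update the stack that owns the Lambda function.
aws cloudformation deploy \
  --template-file packaged.yml \
  --stack-name example-lambda \
  --capabilities CAPABILITY_IAM

In an actual pipeline, CodeBuild produces the packaged artifact and a CloudFormation action in CodePipeline applies it.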

Note: you could also use CloudFormation to spin up an Elastic Beanstalk stack or any other AWS resources, and even customize the launch of auto scaling groups.

Security: Before and after!

Now that I've covered several AWS tools, and shown how you can offload building code artifacts to AWS (using a simplified example), let's circle back to my Jenkins server. What did we remove?

  • EC2 related IAM permissions
  • Permissions needed to run Packer
  • Most of the S3 related permissions
  • CloudFormation related permissions

And of course we removed nearly all the IAM permissions it had previously.

Also, while I only briefly touched on this in the table above, I've mitigated abuse of AWS API calls from Jenkins by restricting it to the lambda:InvokeFunction and s3:GetObject API calls.
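
For reference, here's roughly what that boiled-down policy could look like when attached to the Jenkins instance role - the role, function, and bucket names are placeholders for whatever your Jenkins actually needs to touch:

# Attach a minimal inline policy to the (hypothetical) Jenkins instance role.
aws iam put-role-policy \
  --role-name JenkinsInstanceRole \
  --policy-name jenkins-minimal-access \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": "lambda:InvokeFunction",
        "Resource": "arn:aws:lambda:us-east-1:111111111111:function:example-*"
      },
      {
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-artifact-bucket/*"
      }
    ]
  }'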

Keep an eye out! If you enjoyed this post, in a future post I'll be using AWS CodeStar and X-Ray in combination with several AWS services to build a small Twitter microservice. In that post I'll be covering both developing the service and monitoring it for performance issues.