
Problems with Terraform

In my last post, I discussed using Terraform to build out the base components of an AWS environment. While running this code to build out the base environment has worked the way I intended, I have run into some pretty major issues with building out the next layer, which consists of a group of private subnets.

I ran into two key problems that I haven’t been able to solve. The first is around passing the counts from one environment to the next. In my base environment I set them as outputs and then import the state file as a data source, but when I try to use it, I get the error “value of count cannot be computed.”
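For context, the pattern looks roughly like this (a sketch with placeholder names; the bucket, output names, and the private_subnet_cidrs list variable are stand-ins, not my actual code):

data "terraform_remote_state" "base" {
  backend = "s3"
  config {
    bucket = "my-state-bucket"        # placeholder bucket
    key    = "base/terraform.tfstate" # placeholder key
    region = "us-east-1"
  }
}

# Referencing the exported count from the base layer is what triggers
# the "value of count cannot be computed" error
resource "aws_subnet" "private" {
  count      = "${data.terraform_remote_state.base.az_count}"
  vpc_id     = "${data.terraform_remote_state.base.vpc_id}"
  cidr_block = "${element(var.private_subnet_cidrs, count.index)}"
}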

The second issue is a little more complicated, but it comes down to setting variables in the module section of the main.tf file when the data doesn't exist in the base state file. Essentially, if I don't create a second NAT gateway in the base setup, then no output for it shows up in the state file. When I run the second set of Terraform scripts, I would like it to ignore the missing value or fall back to a default, rather than error out.
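The kind of reference that breaks looks something like this (again a sketch; the module path and output name are placeholders):

# main.tf in the private-subnet layer; this fails when the base layer
# never created the second NAT gateway, because the nat_gw_1b_id output
# simply isn't in the base state file
module "private-subnets" {
  source       = "./modules/private-subnets"
  vpc_id       = "${data.terraform_remote_state.base.vpc_id}"
  nat_gw_1b_id = "${data.terraform_remote_state.base.nat_gw_1b_id}"
}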

At this point, I am pretty frustrated with it. I have decided that I am going to circle back and take another look at CloudFormation now that it supports YAML and cross-stack references, and see if I can do everything that I want to do. I'll post details later this week.

Terraform to Buildout AWS

I started playing with Terraform a few months ago when I needed to spin up a prototype environment to do an evaluation of the open source version of Cloud Foundry. One of the results was some Terraform code that could bring up the essentials of an AWS VPC, which included the VPC itself, three public subnets, three NATs, three Elastic IPs (EIPs), and a Route53 hosted zone. While it might seem like overkill to use this many Availability Zones (AZs) for a prototype environment, one of the things we needed to test was how Cloud Foundry's new multi-AZ support worked.

This was good for what I was working on at the time, since I needed to test across multiple AZs, but it was problematic for most of the day-to-day testing that I need to do, as it would spin up (and charge me for) components that I didn't really need. Most things I have to test don't require three of everything. The challenge was that I did not want to maintain different code repositories for different use cases.

Luckily, I came across an article by Yevgeniy Brikman that had some interesting tips and tricks on how to do loops and conditionals. The most interesting bit for the problem I was trying to solve was learning that in Terraform, a boolean true is converted to a 1 and a boolean false is converted to a 0. Yevgeniy then used an example that I proceeded to incorporate into my code. Essentially, what I did was create three new variables in my environments file to define whether or not I wanted to create each of the three public subnets:

# public-subnets module
public-1a_create = true
public-1b_create = false
public-1c_create = false

Then for each resource, I added the count variable:

resource "aws_subnet" "public-1a" {
    count                   = "${var.public-1a_create}"
    vpc_id                  = "${var.vpc_id}"
    cidr_block              = "${var.public-1a_subnet_cidr}"
    map_public_ip_on_launch = true
    availability_zone       = "us-east-1a"
    tags {
        Name                  = "public-1a"
        Description           = "Public Subnet us-east-1a"
        Terraform             = "true"
    }
}

Now I am able to spin up an AWS environment that is only in one availability zone to do some testing, or bring it up in three for production. There are still a few other things that I am hoping to figure out, such as how to avoid setting variables for the second and third subnets when they aren't needed, and how to let Terraform deployments that build on the base pick the right NAT subnets when they use three AZs but there are only two NATs. You can find the code in this repo.

Using Vault to Manage AWS Accounts

I've been putting off setting up the AWS backend in our Vault server for the last few months. I knew that I was going to need it eventually, but other priorities kept taking precedence, so I have been pushing it off. This past week, one of the application teams came to me with a requirement to write a file to an S3 bucket.

Under normal circumstances, I would probably just go ahead and create an instance profile that could be applied to the system and the problem would be solved. The problem with this approach was that they did not want other applications to be able to access it. Since we run containers, using instance profiles to control access would allow every container on that host access to the bucket.

Preparing the Environment

Setting up the AWS backend is pretty straightforward. To begin, you need to configure your environment to be able to interact with Vault.

export VAULT_ADDR=https://vault.example.com:8200
export VAULT_TOKEN=a38dc275-86d3-48bd-57ae-237a45d6663b

Once set, you can test your configuration using the curl command to go to the health endpoint.

% curl -k -X GET ${VAULT_ADDR}/v1/sys/health
{"initialized":true,"sealed":false,"standby":false,"server_time_utc":1477441389,"version":"0.6.2","cluster_name":"vault-cluster-2fbd0333","cluster_id":"d8056c7f-acbb-ae59-4ed4-3673f2d27d48"}

Initialize the AWS Backend

Once you have verified that the endpoint is working, you can create and configure the AWS Backend. Since we use multiple AWS accounts for each environment, I will mount different backends for each account.

curl -k -X POST -H "x-Vault-Token: ${VAULT_TOKEN}" -d '{"type": "aws",   "description": "AWS Backend", "config": {"default_lease_ttl": "360", "max_lease_ttl": "720"}}'  ${VAULT_ADDR}/v1/sys/mounts/aws-prototype

This command sets up the aws-prototype backend with a default lease TTL of 360 seconds and a max lease TTL of 720 seconds. Since the POST doesn't return anything, you can verify it with the mounts endpoint. If you don't have jq, I highly recommend you download it, as it makes viewing JSON output much easier.

curl -k -X GET -H "x-Vault-Token: ${VAULT_TOKEN}" ${VAULT_ADDR}/v1/sys/mounts|jq .

Configure AWS Backend

Once the mount is created, you will need to add AWS credentials to the backend. You will need to create an AWS user that has full IAM access so that it can create other users. We have automation that controls our IAM, but you can use a couple of IAM commands to set up the user you will need.

aws iam create-user --user-name HashiVault
aws iam attach-user-policy --user-name HashiVault --policy-arn arn:aws:iam::aws:policy/IAMFullAccess
aws iam create-access-key --user-name HashiVault

Once you create the access & secret keys, you can use them to configure the AWS backend.

curl -k -X POST -H "x-Vault-Token: ${VAULT_TOKEN}" -d '{"access_key": "XXXXXXXXXXXXXXXXXXXX", "secret_key": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", "region": "us-east-1"} ${VAULT_ADDR}/v1/aws-prototype/config/root

After the backend is configured, you can start adding roles to Vault. For the S3 access that we want, we need to create a role that generates users with the right policy attached.

curl -k -X POST -H "x-Vault-Token: ${VAULT_TOKEN}" -d '{"arn": "arn:aws:iam::aws:policy/AmazonS3FullAccess"}' ${VAULT_ADDR}/v1/aws-prototype/roles/S3-Access

You can verify that it was created properly by curling the endpoint and getting the credentials back.

curl -k -X GET -H "x-Vault-Token: ${VAULT_TOKEN}" ${VAULT_ADDR}/v1/aws-prototype/creds/S3-Access

Now that you are getting credentials, you can repeat the process for every account and/or role that you need to set up.
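If you have several accounts, a quick shell loop can create the mounts (the account names here are placeholders, and each mount still needs its own config/root credentials and roles):

for account in prototype staging production; do
  curl -k -X POST -H "x-Vault-Token: ${VAULT_TOKEN}" \
    -d '{"type": "aws", "description": "AWS Backend", "config": {"default_lease_ttl": "360", "max_lease_ttl": "720"}}' \
    ${VAULT_ADDR}/v1/sys/mounts/aws-${account}
done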

Testing the S3 Access

I had a little problem testing my S3 access once I had everything configured. I wrote a quick little one-liner to get my creds and set them to the proper environmental variables.

CREDS=$(curl -k -X GET -H "x-Vault-Token: ${VAULT_TOKEN}" ${VAULT_ADDR}/v1/aws-prototype/creds/S3-Access);export AWS_ACCESS_KEY_ID=$(echo $CREDS |jq -r .data.access_key);export AWS_SECRET_ACCESS_KEY=$(echo $CREDS |jq -r .data.secret_key)

When I tried to download a file, I received an error.

download failed: s3://mybucket/testfile.txt to ./testfile.txt An error occurred (InvalidArgument) when calling the GetObject operation: Requests specifying Server Side Encryption with AWS KMS managed keys require AWS Signature Version 4. You can enable AWS Signature Version 4 by running the command:

aws configure set s3.signature_version s3v4

I ran the command, but each time I tried to run the aws s3 command, I received the same error. What I learned was that running the command updated my .aws/config file and added the following lines to it:

[default]
s3 =
    signature_version = s3v4

Since I already have a '[profile default]' section with nothing in it, I moved the S3 bits up underneath that and removed the '[default]' block, and everything started working as expected.
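For reference, the relevant part of my ~/.aws/config ended up looking roughly like this:

[profile default]
s3 =
    signature_version = s3v4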

Getting my First Certification

I've been in the tech industry for more than twenty years, and during that time, I have never really thought that getting a certification was necessary to move forward in your career. Experience generally shows through, regardless of whether or not you have a piece of paper that says you are certified in a particular technology. As a result, I have never really bothered getting certified in the many technologies that I have worked on over the years.

That changed this past Wednesday, when I sat for and passed the AWS Certified SysOps Administrator – Associate certification test. Over the last two years I have been immersed in Amazon Web Services, learning the ins and outs of the various service offerings. I decided at the beginning of this year that I was going to go ahead and get ALL of the AWS certifications. Currently there are five, but three more specialized certifications are on the way (currently in beta).

My plan is to have all of them before I head to re:Invent this year. I'm looking forward to the opportunities that diving deeper into the technology provides me.

Using Vagrant to Test Galaxy Roles

Last summer, I wrote a post about how we were using Vagrant to test Ansible roles across AWS and the datacenter. This has worked well with a single AWS account, but it has proven to be a little trickier in our account layout, which uses a centralized account and STS roles. Initially, I had an assume_role script that we had written that would get and set the right bits for the Vagrantfile to work, but it wasn't very elegant.
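Conceptually, all that script did was trade the central-account keys for temporary role credentials and export them; here is a rough Ruby sketch of the idea (not the actual script, and the role ARN and session name are placeholders):

require 'aws-sdk'

# Uses the central-account keys already set in the environment
sts = Aws::STS::Client.new(region: 'us-east-1')
creds = sts.assume_role(
  role_arn: 'arn:aws:iam::123456789012:role/EngineerRole',
  role_session_name: 'vagrant-testing'
).credentials

# Hand the temporary credentials to anything launched from this process
ENV['AWS_ACCESS_KEY_ID']     = creds.access_key_id
ENV['AWS_SECRET_ACCESS_KEY'] = creds.secret_access_key
ENV['AWS_SESSION_TOKEN']     = creds.session_token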

While working on some Ansible roles recently, I decided to take an afternoon and see what I could come up with to make everything a little easier. I’m pretty happy with the results. It’s much more streamlined and easier to run and maintain.

If you would like to try it out yourself, you can start by cloning (or forking first) the repo to your local system:

git clone git@github.com:MarsDominion/vagrant-ansible-testing.git

Once you have it cloned, you will want to change directories and then checkout the sts branch:

cd vagrant-ansible-testing
git checkout sts

From here, you will need to create an env.rb file in the top level of the directory and add the environmental variables you will want to use:

ENV['AWS_ACCESS_KEY_ID'] = 'XXXXXXXXXXXXXXXXXXXX'
ENV['AWS_SECRET_ACCESS_KEY'] = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
ENV['AWS_KEYPAIR_NAME'] = 'my-keypair'
ENV['MY_PRIVATE_AWS_SSH_KEY_PATH'] = '/Users/me/.ssh/my-keypair.pem'
ENV['AWS_SUBNET'] = 'subnet-xxxxxxxx'
ENV['AWS_SG'] = 'sg-xxxxxxxx' 

You can also optionally define the following variables (defaults are listed):

ENV['AWS_DEFAULT_REGION'] = 'us-east-1'
ENV['AWS_INSTANCE_TYPE'] = 't2.micro'
ENV['AWS_AMI'] = 'ami-9be6f38c' #(aws-linux)
ENV['AWS_EC2_USER'] = 'ec2-user'

After you have saved the env.rb file, you can update the requirements.yml and playbook.yml with your Ansible code; a minimal example is shown below. Vagrant will run the ansible-galaxy command with the -f (force) option on the "up" and "provision" vagrant subcommands.
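For example, with a single role pulled from Galaxy (the role name here is just an example):

# requirements.yml
- src: blackbaud.linux-hardening

# playbook.yml
- hosts: all
  become: true
  roles:
    - blackbaud.linux-hardening

Once you have your Ansible files the way you want, all that is left is to run the vagrant command: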

% vagrant up

You can iterate on your Ansible by running the vagrant command:

% vagrant provision

Once you have completed your Ansible testing, you can destroy the environment:

% vagrant destroy

That’s it. A quick and easy way to test your Ansible roles against an AWS server.

Authenticating Vault Using GitHub

I have never been a big fan of creating and managing users on individual systems. I much prefer some sort of centralization of credentials, preferably that somebody else manages when people come and go. That is one of the key reasons I wanted to get the GitHub auth backend up and working in Vault.

Preparing the Environment

Setting up the GitHub authentication backend is pretty straightforward. The most difficult part was digging into how the policies work so that the teams I add from GitHub get the right permissions. To begin, you need to set up your environment.

export VAULT_ADDR=https://vault.example.com:8200
export VAULT_TOKEN=a38dc275-86d3-48bd-57ae-237a45d6663b

Once set, you can test your configuration by using the curl command to go to the health endpoint.

% curl -k -X GET ${VAULT_ADDR}/v1/sys/health
{"initialized":true,"sealed":false,"standby":false,"server_time_utc":1477441389,"version":"0.6.2","cluster_name":"vault-cluster-2fbd0333","cluster_id":"d8056c7f-acbb-ae59-4ed4-3673f2d27d48"}

Initialize the GitHub Auth backend

Once you have verified that the endpoint is working, you can create and configure the auth backend.

curl -k -X POST -H "X-Vault-Token: $VAULT_TOKEN" -d '{ "type": "github", "description": "Github OAuth Backend" }' $VAULT_ADDR/v1/sys/auth/github

You can verify that the backend was created successfully by doing a GET against sys/auth. If you don’t have jq, I highly recommend you download it, as it makes viewing JSON output much easier.

curl -k -X GET -H "X-Vault-Token: $VAULT_TOKEN" $VAULT_ADDR/v1/sys/auth|jq .

You should see the github backend in the output. Once you have verified that it has been created, the next step is to configure the backend by adding the GitHub organization that you will be authenticating against.

curl -k -X POST -H "X-Vault-Token: $VAULT_TOKEN" -d '{ "organization": "yourorghere" }' $VAULT_ADDR/v1/auth/github/config

Configure GitHub Team

Next you will need to create a policy that will allow you to actually do something (deny is the default). This is my initial policy, and I'm sure it is not a great policy, but it is only a POC. Create a file called admin.hcl.
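Something as broad as this sketch is enough for a POC (it grants everything, so you will want to tighten it for real use):

path "*" {
  capabilities = ["create", "read", "update", "delete", "list", "sudo"]
}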

Once the file has been created, it needs to be uploaded to the server. That can be done through the sys/policy endpoint.

curl -k -X PUT -H "X-Vault-Token: $VAULT_TOKEN" -d @<(jq -n --arg a "$(<./admin.hcl)" '{ "rules": $a }')  $VAULT_ADDR/v1/sys/policy/admin

You can validate it by doing a GET against the same endpoint.

curl -k -X GET -H "X-Vault-Token: $VAULT_TOKEN" $VAULT_ADDR/v1/sys/policy/admin

Once the policy is uploaded, you can map it to a team in GitHub.

curl -X POST -H "X-Vault-Token: $VAULT_TOKEN" -d '{ "value": "admin" }' $VAULT_ADDR/v1/auth/github/map/teams/myteam

Verify Everything Works

Now you can test to ensure that everything works properly. Head over to GitHub, generate a personal access token, and then try to authenticate against Vault.

curl $VAULT_ADDR/v1/auth/github/login -d '{ "token": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" }'|jq .

This will return JSON containing a client_token that you can use to access Vault.

To make it easy, you could set your VAULT_TOKEN with the curl command.

export VAULT_TOKEN=$(curl ${VAULT_ADDR}/v1/auth/github/login -d '{ "token": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" }'|jq  -r .auth.client_token)

And then test that you are connecting properly to the system.

curl -k -X GET -H "x-Vault-Token: ${VAULT_TOKEN}" ${VAULT_ADDR}/v1/sys/mounts|jq .

Now you can set up other teams with more restricted access.

Looking at Open Source Cloud Foundry

We have been using Pivotal's version of Cloud Foundry for the past year, and while it has a lot of nice features, there are a couple of things about it that I have found rather frustrating. Unfortunately, it is the very things that Pivotal adds on top of Cloud Foundry that frustrate me the most.

The first and biggest frustration is that we haven't been able to figure out how to effectively automate the deployment of the Cloud Foundry environment. While they do provide a CloudFormation template that will build out the correct AWS bits, we found it wasn't very good overall and ended up rewriting most of it to add some pretty important capabilities, such as allowing us to pick our own IP addresses, encrypting the databases, and building out the subnets in multiple availability zones.

Once the CloudFormation template has run and Ops Manager is up, they provide the ability to deploy the Ops Manager Director (BOSH) tile, which then allows you to deploy the Elastic Runtime tile (Cloud Foundry). To deploy these tiles, you must click through a number of web forms and fill in the values that you want to use. While this may work well for the novice, I want to be able to deploy Cloud Foundry from scratch using automation, not by clicking through a bunch of web forms.

The web forms also point to my second big complaint (which admittedly may be a feature for some): how much Pivotal obfuscates the inner workings of Cloud Foundry. Initially we took advantage of this when deploying the app, but as we ran it and the need for troubleshooting came up, it became much more annoying. Not knowing where to go to look at logs, how to log in (properly), and how to check on system health became more of a problem as we started to run more production workloads.

So we have decided to look at standing up an open source Cloud Foundry environment to see if having more direct control over the pieces will allow us to better automate and support our infrastructure. The tools that I have chosen to build out our proof of concept (POC) environment are: Terraform, Ansible, and Jenkins. I’ll be using a lot of the hacks and tricks that we have learned over the last year in our Pivotal environment.

First Impressions of OneNote

I am always on the lookout for ways to improve my system for keeping on top of all the information that I have and staying focused on getting things done. With the announcement at the end of June by Evernote regarding their pricing and plan changes, I have been kicking around whether or not to give OneNote a shot as my exclusive knowledge capture device. So far, the results have been mixed. It has solved a few problems that I had with Evernote, but introduced some new ones that can be a little frustrating.

Getting it set up to work on all my devices was a little bit of a pain in the ass. I have a MacBook Pro, an iPhone 6 Plus, and an iPad Pro 9.7 that it needs to work on. For the first two days, it seemed like every time I touched either my iPhone or iPad I had to enter my password or it would have some type of sync problem. It cleared up after the first two days, but it was almost a showstopper before I even got started.

From an interface perspective, I like it a lot better than Evernote. It has more functionality and looks a lot cleaner. One of the big upsides is its support for the Apple Pencil. Unlike with Evernote, I can write anywhere on my notes and still actually see what else is on them. I also like that it is “free” since it uses my 1TB OneDrive storage that I already get as part of my organizational account.

The biggest problem that I had was with the email-to-OneNote functionality. Unlike Evernote, which gives you a personalized email address you can send to, with OneNote you send everything to a single address, which then routes it to your account based on the message's from address. They say you can add more addresses if you are using a Microsoft account, but since I am using an organizational account, I cannot seem to add other addresses to my acceptable list. This means that whenever I forward an email to OneNote, I have to remember to switch my email's from address to the one it allows.

The only other main problem that I had with OneNote is that it won't allow me to have more than one note open at a time. This is a feature that I find extremely useful in Evernote, especially when I want to merge incident notes with general knowledge notes or I need to reference something I wrote for one thing while I am writing something else. You can get around it by opening one note in the OneNote app and opening the same folder in the online app via your browser. It works, but it's not very efficient.

I’m going to continue to use OneNote exclusively through the end of September before I make a decision whether to stay with it or switch back to Evernote. I’m interested to know how well it tracks receipts, bills, and budgets.

DevOps and Security

For the last few days, I have been participating in a series of internal meetings about how the company is approaching the cloud and DevOps. A good number of the sessions were either about security or contained some reference to security as part of the discussion. With these conversations still fresh in my head, I came across an interesting article at devops.com by Joe Franscella titled The DevOps Force Multiplier: Competitive Advantage + Security.

In the article, Franscella talks with OJ Reeves, a Bugcrowd security researcher, who points out that companies with a DevOps mindset are often more security focused. He cites a number of factors that could explain why, including that they do a better job of checking the security boxes, make fewer mistakes, and communicate better. I certainly agree that communication is a key component and one that helps improve security. However, as a change leader helping to implement DevOps, I'm not sure that I would necessarily agree with the first two – at least not as they are described.

DevOps Checks Boxes

Saying that DevOps does a better job of checking the security boxes may seem true on the surface, but it is extremely vague, and if you don't understand why that is the case, you are likely to miss the benefits of it. From my standpoint, one of the key reasons that we tend to do a better job of checking the boxes than the traditional Ops side is that we have to think about things much more broadly.

When I was a system administrator building production servers, access was restricted to a handful of like-minded teammates. I didn't have to worry about people needing different levels of access and permissions to do different things. On the DevOps side, I do have to think about these things, and more. One of the biggest side benefits of figuring out how to keep the servers safe from developers is that it also protects them from a lot of external threats as well.

Making Fewer Mistakes

I would never claim that companies that practice DevOps make fewer mistakes, but I could see how it could look that way to an outsider. I think instead the key point is that when mistakes are made, they are much easier to fix than they are in traditional organizations. Why? Automation. When a mistake in configuration is found, or a change or patch needs to be implemented, all that is generally required is a modification to a configuration management tool or script and within a few minutes any mistakes or problems are solved.

Automation is probably one of the biggest factors in Reeves' findings regarding DevOps organizations. With automation, it is much easier to weave security into the DNA of what a company is doing, rather than having it as an afterthought.

Testing Ansible Galaxy Roles

With the push to move our roles to Ansible Galaxy as much as possible, we needed to come up with a good way to test the roles as we write them. Up until now, we would build and test them completely within Ansible against the specific system type that we planned to run on. While this works ok against the focused roles that we were writing, it doesn’t work very well for generalized roles that are expected to run on the many different Linux distributions that we run at Blackbaud.

To solve this, we have come up with a Vagrant configuration that allows us to test against multiple OSs both locally (via VirtualBox or VMware) and in the cloud (AWS). You can check out the code here. To get started, simply clone the project to your local machine.

git clone git@github.com:MarsDominion/vagrant-ansible-testing.git

The Vagrantfile in the master branch provides three test environments: aws-linux, centos7, and ubuntu. The aws-linux environment builds an Amazon Linux host in AWS, while the centos7 and ubuntu environments are vmware_desktop-based boxes pulled from Atlas. This gives me a way to test our roles against both cloud and local instances. If you don't have VMware Fusion or Workstation, you can change the provider from vmware_desktop to virtualbox and they should work as well.

Before launching the instances, you need to download the Ansible roles you want to run. This is done with the ansible-galaxy command.

% ansible-galaxy install blackbaud.linux-hardening

And then update your playbook to include the roles:

- hosts: all
  become: true
  roles:
    - blackbaud.linux-hardening

Finally, export some environment variables so Vagrant can connect to your Amazon environment:

export AWS_ACCESS_KEY_ID=KIAI3XQCPIPKSDJHSVQ
export AWS_SECRET_ACCESS_KEY=onX5HfdsIpasdH6+E+JJCgNxIfzJWY1btZgU4LfQ
export AWS_KEYPAIR_NAME=test_key
export MY_PRIVATE_AWS_SSH_KEY_PATH=$HOME/.ssh/test_key.pem

Now we are ready to test the roles:

vagrant up
# Brings up all three instances and tests

vagrant up <aws-linux|centos7|ubuntu>
# Brings up the specified instance and tests

It will launch each instance, run the Ansible roles on each node, and show you the results. It will jump right into the next node when it completes the previous one, so keep an eye on the output to see the results. When you are done, you can simply destroy the nodes.

vagrant destroy -f