Authenticating Vault Using GitHub

I have never been a big fan of creating and managing users on individual systems. I much prefer some sort of centralized credential store, preferably one that somebody else manages as people come and go. That is one of the key reasons I wanted to get the GitHub auth backend up and working in Vault.

Preparing the Environment

Setting up the GitHub authentication backend is pretty straightforward. The most difficult part was digging into how the policies work so that the teams I add from GitHub have the right permissions. To begin, you need to set up your environment.

export VAULT_ADDR=vault.example.com:8200
export VAULT_TOKEN=a38dc275-86d3-48bd-57ae-237a45d6663b

Once set, you can test your configuration by using curl to query the health endpoint.

% curl -k -X GET ${VAULT_ADDR}/v1/sys/health
{"initialized":true,"sealed":false,"standby":false,"server_time_utc":1477441389,"version":"0.6.2","cluster_name":"vault-cluster-2fbd0333","cluster_id":"d8056c7f-acbb-ae59-4ed4-3673f2d27d48"}

Initialize the GitHub Auth Backend

Once you have verified that the endpoint is working, you can create and configure the auth backend.

curl -k -X POST -H "X-Vault-Token: $VAULT_TOKEN" -d '{ "type": "github", "description": "Github OAuth Backend" }' $VAULT_ADDR/v1/sys/auth/github

You can verify that the backend was created successfully by doing a GET against sys/auth. If you don’t have jq, I highly recommend you download it, as it makes viewing JSON output much easier.

curl -k -X GET -H "X-Vault-Token: $VAULT_TOKEN" $VAULT_ADDR/v1/sys/auth|jq .

You should see the github backend in the output. Once you have verified that it has been created, the next step is to configure the backend by adding the GitHub organization that you will be authenticating against.

curl -k -X POST -H "X-Vault-Token: $VAULT_TOKEN" -d '{ "organization": "yourorghere" }' $VAULT_ADDR/v1/auth/github/config

Configure GitHub Team

Next you will need to create a policy that will allow you to actually do something (deny is the default). This is my initial policy, and I'm sure it is not a great one, but this is only a POC. Create a file called admin.hcl along these lines.
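This sketch is deliberately permissive; the path and capabilities here are placeholders rather than a recommendation, so tighten them for anything beyond a POC.

path "*" {
  capabilities = ["create", "read", "update", "delete", "list", "sudo"]
}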

Once the file has been created, it needs to be uploaded to the server. That can be done through the sys/policy endpoint.

curl -k -X PUT -H "X-Vault-Token: $VAULT_TOKEN" -d @<(jq -n --arg a "$(<./admin.hcl)" '{ "rules": $a }')  $VAULT_ADDR/v1/sys/policy/admin

You can validate it by doing a GET against the same endpoint.

curl -k -X GET -H "X-Vault-Token: $VAULT_TOKEN" $VAULT_ADDR/v1/sys/policy/admin

Once the policy is uploaded, you can map it to a team in GitHub.

curl -X POST -H "X-Vault-Token: $VAULT_TOKEN" -d '{ "value": "admin" }' $VAULT_ADDR/v1/auth/github/map/teams/myteam

Verify Everything Works

Now you can test to ensure that everything works properly. Head over to GitHub, generate a personal access token, and then try to authenticate against Vault.

curl $VAULT_ADDR/v1/auth/github/login -d '{ "token": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" }'|jq .

This will return JSON containing a client_token that you can use to access Vault.
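The response looks roughly like this (abridged, with placeholder values):

{
  "auth": {
    "client_token": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
    "policies": ["default", "admin"],
    "metadata": {
      "org": "yourorghere",
      "username": "yourgithubuser"
    },
    "lease_duration": 2764800,
    "renewable": true
  }
}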

To make it easy, you can set VAULT_TOKEN directly from the curl command.

export VAULT_TOKEN=$(curl ${VAULT_ADDR}/v1/auth/github/login -d '{ "token": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" }'|jq  -r .auth.client_token)

And then test that you are connecting properly to the system.

curl -k -X GET -H "x-Vault-Token: ${VAULT_TOKEN}" ${VAULT_ADDR}/v1/sys/mounts|jq .

Now you can set up other teams with more restricted access.
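For example, to give a hypothetical developers team read-only access to the generic secret backend, you could upload a second policy and map it to that team (the policy contents, policy name, and team name below are just placeholders):

cat > readonly.hcl <<'EOF'
path "secret/*" {
  capabilities = ["read", "list"]
}
EOF

curl -k -X PUT -H "X-Vault-Token: $VAULT_TOKEN" -d @<(jq -n --arg a "$(<./readonly.hcl)" '{ "rules": $a }') $VAULT_ADDR/v1/sys/policy/readonly

curl -k -X POST -H "X-Vault-Token: $VAULT_TOKEN" -d '{ "value": "readonly" }' $VAULT_ADDR/v1/auth/github/map/teams/developers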

Using Vault as a Certificate Authority

For the next few weeks we are doing a POC on Hashicorp’s Vault. While I am still learning about all of the functionality that Vault provides, there are a few key pieces I have already identified to check out in addition to just storing credentials. One of the big ones is the PKI backend. This would make it a lot easier for not just my team, but developers as well to generate SSL certificates. While I found some basic instructions on how to set it up from various sources (mentioned later), I decided to do my own write-up that would consolidate everything I learned.

Create the Root and Intermediate Certificates

Rather than writing up instructions on how to create the root or intermediate CAs, I will just point to the instructions I followed, written up by Jamie Nguyen and entitled OpenSSL Certificate Authority. For the purposes of this document, I followed the sections called Create the root pair and Create the intermediate pair.

It’s also possible to create the root certificate authority (and/or the intermediate certificate) inside Vault, but I prefer to do this outside of Vault so that my root certificate remains secure.

Initialize the PKI Backend

Once you have your root and intermediate certificates generated, the first thing you want to do is prepare them for upload to Vault. You can do that by combining the intermediate certificate chain with your intermediate key.

cat intermediate/certs/ca-chain.cert.pem > /tmp/ca_bundle.pem
openssl rsa -in intermediate/private/intermediateCA.key.pem >> /tmp/ca_bundle.pem

These two commands will concatenate everything into one file called ca_bundle.pem. The next step is to initialize and configure the PKI backend. I was able to find some pretty good instructions on configuring it between Cuddletech’s website and a post by Joel Bastos, but since most of what I want to use Vault for will be driven from automation, I decided to focus on using only the API (which made things just a little tougher).

The first step is to set some environment variables that will make our commands easier to run. You’ll want to set VAULT_ADDR to the URL of your Vault server and VAULT_TOKEN to your login token.

export VAULT_ADDR=vault.example.com:8200
export VAULT_TOKEN=a38dc275-86d3-48bd-57ae-237a45d6663b

Once set, you can test your configuration by using curl to query the health endpoint.

% curl -k -X GET ${VAULT_ADDR}/v1/sys/health
{"initialized":true,"sealed":false,"standby":false,"server_time_utc":1477441389,"version":"0.6.2","cluster_name":"vault-cluster-2fbd0333","cluster_id":"d8056c7f-acbb-ae59-4ed4-3673f2d27d48"}

Configure the PKI Backend

After you have verified that the endpoint works, you can create and configure your PKI backend.

curl -k -X POST -H "X-Vault-Token: ${VAULT_TOKEN}" -d '{"type": "pki", "description": "Test Root CA", "config": {"max_lease_ttl": "87600h"}}' ${VAULT_ADDR}/v1/sys/mounts/pki-test

This command creates a new PKI backend mount called “pki-test” and sets the max_lease_ttl to 10 years. You may want to adjust these settings to whatever is suitable for your environment.

Since the POST doesn’t return anything, you can verify it with the mounts endpoint. If you don’t have jq, I highly recommend you download it, as it makes viewing JSON output much easier.

curl -k -X GET -H "x-Vault-Token: ${VAULT_TOKEN}" ${VAULT_ADDR}/v1/sys/mounts|jq .

Once you have initialized the backend, you can upload the certificate bundle that you created by following the instructions noted above.

curl -k -X POST -H "x-Vault-Token: ${VAULT_TOKEN}" -d @<(jq -n --arg a "$(</tmp/ca_bundle.pem)" '{ pem_bundle: $a }') ${VAULT_ADDR}/v1/pki-test/config/ca

This command doesn’t return anything either. You can verify that it uploaded properly by trying to download the intermediate certificate.

curl -k -X GET -H "x-Vault-Token: ${VAULT_TOKEN}" ${VAULT_ADDR}/v1/pki-test/ca/pem

Create a Role

The final step is to configure a role to issue the certificates.

curl -k -X POST -H "x-Vault-Token: ${VAULT_TOKEN}" -d '{"allow_any_name": "true", "allow_ip_sans": "true", "max_ttl": "17520h"}' ${VAULT_ADDR}/v1/pki-test/roles/example-dot-com

You can verify that the role exists with a GET to the roles endpoint.

curl -k -X GET -H "x-Vault-Token: ${VAULT_TOKEN}" ${VAULT_ADDR}/v1/pki-test/roles/example-dot-com|jq .

Issue Certificates

Now we are all set to issue certificates from our Vault server. This can be done in one of two ways. The first is to request a certificate and key from Vault directly:

curl -H "X-Vault-Token: ${VAULT_TOKEN}"   -d '{ "common_name": "testhost.example.com" }' https://${VAULT_ADDR}/v1/pki-test/issue/example-dot-com | tee >(jq -r .data.certificate > test.example.com.cert) >(jq -r .data.private_key > test.example.com.pem) >(jq -r .data.ca_chain[] > test.example.com-chained.pem)

This will create three files in your directory, one that contains the key, one that contains the certificate, and one that contains the certificate chain. You can also send a CSR that you created to have a certificate generated.

curl -k -X POST -H "X-Vault-Token: ${VAULT_TOKEN}" -d @<(jq -n --arg a "test.example.com" --arg b "$(<../server.csr)" '{ common_name: $a, csr: $b }') ${VAULT_ADDR}/v1/pki-test/sign/example-dot-com| tee >(jq -r .data.certificate > test.example.com.cert) >(jq -r .data.ca_chain[] > test.example.com-chained.pem)

Since the key was generated separately, it won’t create a new key file, but it does generate the certificate file and the certificate chain.
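If you want to sanity-check what came back, openssl can show the basics of the issued certificate (an optional check, not part of the Vault workflow itself):

openssl x509 -in test.example.com.cert -noout -subject -issuer -dates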

That’s all it takes to get a functioning CA in Vault. I’m sure that I still have a whole lot to learn about configuring and securing the PKI backend, but for our POC I think this will work nicely.

Looping Through Ansible Hosts

For the past week I have been working on deploying Hashicorp’s Vault using Terraform and Ansible. As I was installing and configuring the Consul server, I came across an interesting problem with building the server configuration. I’ve been following instructions from DigitalOcean, and while most of the configuration has been pretty straightforward, the config.json file proved to be a bit of a challenge.

According to the instructions, for a three-node cluster you only want to put the IP addresses of the other two servers into each server’s config.json file, but it took me a little while to figure out how to get Ansible to do that. While it may seem straightforward to others, I had a hard time even finding information on the internet about how to do this, so I figured I would share it here.

My first iteration was to just get it set up to put all of the IP addresses in.
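That first pass was a template along these lines, where the consul_servers group name and the surrounding config.json keys are placeholders for whatever your inventory and Consul configuration actually use:

{
  "server": true,
  "bootstrap_expect": 3,
  "start_join": [
    {% for server in groups['consul_servers'] %}
    "{{ hostvars[server]['ansible_default_ipv4']['address'] }}"{% if not loop.last %},{% endif %}
    {% endfor %}
  ]
}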

However, this is not what I wanted. I wanted it to exclude the host’s own IP address. After searching high and low, I finally found a Jinja2 tidbit that would get me what I wanted. I didn’t realize that you could put an if right in with the for loop, so I just needed to add “if server != inventory_hostname” to the for loop so that it would exclude the host it was running on.
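Applied to the sketch above, the loop becomes:

    {% for server in groups['consul_servers'] if server != inventory_hostname %}
    "{{ hostvars[server]['ansible_default_ipv4']['address'] }}"{% if not loop.last %},{% endif %}
    {% endfor %}

Conveniently, Jinja2 evaluates loop.last against the filtered sequence, so the trailing-comma handling keeps working even with one host excluded.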

I can see this little tidbit coming in handy for all sorts of things.

Storing Terraform State Files in S3 When Using assume_role

We use a lot of different AWS accounts, so rather than managing credentials across all of them we have built a model where one account is strictly used for managing user accounts (both locally and via ADFS). From there, all of our interactive and automated logins use STS to assume roles in other accounts. As we started to dive more into Terraform, I was excited to find that it supports this through the AWS provider’s assume_role configuration.
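For reference, the provider configuration looks something like this sketch (the region and role ARN are placeholders):

provider "aws" {
  region = "us-east-1"

  assume_role {
    role_arn = "arn:aws:iam::222222222222:role/terraform-build"
  }
}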

However, as we started to implement this, we quickly ran into a problem: the terraform command itself doesn’t support assuming a role in another account when storing state, so we would either need to store the state files in one account (in this case, our auth account) or figure out how to allow the auth account to put the files in an S3 bucket of the account we are working with. Since I don’t want to store ANY data in the auth account, I had to figure out how to give users from my auth account access to the account I am working on. In the end, it was relatively straightforward. I just needed to add a bucket policy in the target account and a policy in the auth account that I then attached to my team’s user group.

The first step is to create a bucket policy that allows my user to list the contents of the bucket and to get and put the state files. I could probably lock the policy down more and restrict it to just the terraform-state folder that I have in my bucket, but since I have full access outside of Terraform anyway, I didn’t think it was as important.
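A bucket policy along these lines does the trick; the auth account ID and bucket name are placeholders, and granting the auth account root as the principal leaves the fine-grained control to the IAM policies in that account:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111111111111:root" },
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::example-terraform-state"
    },
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111111111111:root" },
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::example-terraform-state/*"
    }
  ]
}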

Once the bucket policy was in place, I added a policy to my auth account and attached it to my team’s user group. I figure as I put more accounts under Terraform control, I’ll just add additional resources.
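A sketch of the auth-account side, using the same placeholder bucket name as above:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::example-terraform-state"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::example-terraform-state/*"
    }
  ]
}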

Once the two policies were in place, Terraform was able to use the S3 bucket in the account we were building out.

Looking at Open Source Cloud Foundry

We have been using Pivotal’s version of Cloud Foundry for the past year and while it has a lot of nice features around it, there are a couple of things about it that I have found rather frustrating. Unfortunately, it is the very things that Pivotal adds on top of Cloud Foundry that I find the most frustrating.

The first and biggest frustration is that we haven’t been able to figure out how to effectively automate the deployment of the Cloud Foundry environment. While they do provide a CloudFormation template that will build out the correct AWS bits, we found it wasn’t very good overall and ended up rewriting most of it to add some pretty important bits such as allowing us to pick our own IP addresses, encrypting the databases, and building out the subnets in multiple availability zones.

Once the CloudFormation template is run and the Ops Manager is running, they provide the ability to deploy the Ops Manager Director (BOSH) tile, which then allows you to deploy the Elastic Runtime tile (Cloud Foundry). To deploy these tiles, you must click through a number of web forms and fill in the values that you want to use. While this may work well for the novice, I want to be able to deploy Cloud Foundry from scratch using automation, not by clicking through a bunch of web forms.

The web forms also point to my second big complaint (which admittedly may be a feature for some): how much Pivotal obfuscates the inner workings of Cloud Foundry. Initially we took advantage of this when deploying the platform, but as we ran it and the need for troubleshooting came up, it became much more annoying. Not knowing where to go to look at logs, how to log in (properly), and how to check on system health became more of a problem as we started to run more production workloads.

So we have decided to look at standing up an open source Cloud Foundry environment to see if having more direct control over the pieces will allow us to better automate and support our infrastructure. The tools that I have chosen to build out our proof of concept (POC) environment are: Terraform, Ansible, and Jenkins. I’ll be using a lot of the hacks and tricks that we have learned over the last year in our Pivotal environment.

First Impressions of OneNote

I am always on the lookout for ways to improve my system for keeping on top of all the information that I have and staying focused on getting things done. With the announcement at the end of June by Evernote regarding their pricing and plan changes I have been kicking around whether or not to give OneNote a shot as my exclusive knowledge capture device. So far, the results have been mixed. It has solved a few problems that I had with Evernote, but introduced some new ones that can be a little frustrating.

Getting it set up to work on all my devices was a little bit of a pain in the ass. I have a MacBook Pro, an iPhone 6 Plus, and an iPad Pro 9.7 that it needs to work on. For the first two days it seemed like every time I touched either my iPhone or iPad I had to enter my password or it would have some type of sync problem. It cleared up after the first two days, but it was almost a show stopper even before I got started.

From an interface perspective, I like it a lot better than Evernote. It has more functionality and looks a lot cleaner. One of the big upsides is its support for the Apple Pencil. Unlike with Evernote, I can write anywhere on my notes and still actually see what else is on them. I also like that it is “free” since it uses my 1TB OneDrive storage that I already get as part of my organizational account.

The biggest problem that I had was with the email-to-OneNote functionality. Unlike Evernote, which gives you a personalized email address you can send to, OneNote has you send to a single address, which it then routes to your account based on the message’s from address. They say that you can add more addresses if you are using a Microsoft account, but since I am using an organizational account I cannot seem to add other addresses to my accepted list. That means whenever I forward an email to OneNote, I have to remember to switch my email’s from address to the one account that it allows.

The only other main problem that I had with OneNote is that it won’t allow me to have more than one note open at a time. This is a feature that I find extremely useful in Evernote, especially when I want to merge incident notes with general knowledge notes, or when I need to reference something I wrote for one thing while I am writing something else. You can get around it by opening a note in the OneNote app and opening the same folder in the online app via your browser. It works, but it’s not very efficient.

I’m going to continue to use OneNote exclusively through the end of September before I make a decision whether to stay with it or switch back to Evernote. I’m interested to know how well it tracks receipts, bills, and budgets.

DevOps and Security

For the last few days, I have been participating in a series of internal meetings about how the company is approaching the cloud and DevOps. A good number of the sessions were either about security or contained some reference to security as part of the discussion. With these conversations still fresh in my head, I came across an interesting article at devops.com by Joe Franscella titled The DevOps Force Multiplier: Competitive Advantage + Security.

In the article, Franscella talks with OJ Reeves, a Bugcrowd security researcher, who points out that he has seen that companies who have a DevOps mindset are often more security focused. He cites a number of factors that could explain why, including that they do a better job of checking the security boxes, make fewer mistakes, and that they communicate better. I certainly agree that communication is a key component and one that helps improve security. However, as a change leader helping to implement DevOps, I’m not sure that I would necessarily agree with the first two – at least not as they are described.

DevOps Checks Boxes

Saying that DevOps does a better job checking the security boxes may seem true on the surface, but it is extremely vague, and if you don’t understand why this seems to be the case, you are likely to miss the benefits of it. From my standpoint, one of the key reasons we tend to do a better job checking the boxes than the traditional Ops side is that we have to think about things much more broadly.

When I was a system administrator building production servers, access was restricted to a handful of like-minded teammates. I didn’t have to worry about people needing different levels of access and permissions to do different things. On the DevOps side, I do have to think about these things, and more. One of the biggest side benefits of figuring out how to keep the servers safe from developers is that it also protects them from a lot of external threats.

Making Fewer Mistakes

I would never claim that companies that practice DevOps make fewer mistakes, but I could see how it could look that way to an outsider. I think instead the key point is that when mistakes are made, they are much easier to fix than they are in traditional organizations. Why? Automation. When a mistake in configuration is found, or a change or patch needs to be implemented, all that is generally required is a modification to a configuration management tool or script and within a few minutes any mistakes or problems are solved.

Automation is probably one of the biggest factors in Reeves’ findings regarding DevOps organizations. With automation, it is much easier to weave security into the DNA of what a company is doing, not just have it as an afterthought.

Testing Ansible Galaxy Roles

With the push to move our roles to Ansible Galaxy as much as possible, we needed to come up with a good way to test the roles as we write them. Up until now, we would build and test them completely within Ansible against the specific system type that we planned to run on. While this works ok against the focused roles that we were writing, it doesn’t work very well for generalized roles that are expected to run on the many different Linux distributions that we run at Blackbaud.

To solve this, we have come up with a Vagrant configuration that allows us to test against multiple OSs both locally (via VirtualBox or VMware) and in the cloud (AWS). You can check out the code here. To get started, simply clone the project to your local machine.

git clone git@github.com:MarsDominion/vagrant-ansible-testing.git

The Vagrantfile in the master branch provides three test environments: aws-linux, centos7, and ubuntu. The aws-linux environment will build an Amazon Linux host in AWS, while the CentOS and Ubuntu environments are vmware_desktop-based boxes pulled from Atlas. This gives me a way to test our roles against both cloud and local instances. If you don’t have VMware Fusion or Workstation, you can change the provider from vmware_desktop to virtualbox and they should work as well.

Before launching the instances, you need to download the Ansible roles you want to run. This is done with the ansible-galaxy command.

% ansible-galaxy install blackbaud.linux-hardening

And then update your playbook to include the roles:

- hosts: all
  become: true
  roles:
    - blackbaud.linux-hardening

Finally, set some variables to be able to connect to your Amazon Environment:

export AWS_ACCESS_KEY_ID=KIAI3XQCPIPKSDJHSVQ
export AWS_SECRET_ACCESS_KEY=onX5HfdsIpasdH6+E+JJCgNxIfzJWY1btZgU4LfQ
export AWS_KEYPAIR_NAME=test_key
export MY_PRIVATE_AWS_SSH_KEY_PATH=$HOME/.ssh/test_key.pem

Now we are ready to test the role.

vagrant up
# Brings up all three instances and tests

vagrant up <aws-linux|centos7|ubuntu>
# Brings up the specified instance and tests

It will launch each instance, run the Ansible playbook on each node, and show you the results. It will jump right into the next node when it completes the previous one, so keep an eye on the output to see the results. When you are done, you can simply destroy the nodes.

vagrant destroy -f 

Ansible Role Style Guide

In my last post, I discussed how to get started with creating an Ansible Galaxy role. This post will go into more detail on what comprises a role and how we structure them at Blackbaud so that our Ansible roles can be reused and shared throughout the company.

By default, running the ansible-galaxy init command will create the following directory structure:

ansible-role-linux-hardening/
|- defaults/
    |- main.yml
|- handlers/
    |- main.yml
|- meta/
    |- main.yml
|- tasks/
    |- main.yml
|- tests/
    |- inventory
    |- test.yml
|- vars/
    |- main.yml
.travis.yml
README.md

Depending on the role that is being created, it is possible that some of these directories will not be needed. It is recommended that you remove the directories that are not being used. Also, a templates directory isn’t created with ansible-galaxy init, but you can add it if it is necessary.

defaults

The defaults folder contains the default values of any variables that are used throughout the rest of the role (you can learn more about variable precedence here). This should be reserved for variables that don’t often stray from the defined value, but that may in certain specific instances.
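For example, a hardening role might ship defaults like these (the variable names are only illustrative, not taken from the actual role):

# defaults/main.yml
ssh_permit_root_login: "no"
ssh_password_authentication: "no"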

handlers

The handlers directory contains the process handlers that can be notified when changes occur. For example, you can have a handler that restarts SSH when changes to the configuration file are made:

- name: restart ssh
  service:
    name: "{{ ssh_service }}"
    state: restarted

meta

The meta directory serves two purposes. The first is to define the Ansible Galaxy role information, including the minimum Ansible version needed, what platforms are supported, and so on; the second is to define any dependencies that the role may have.

galaxy_info:
  author: Mark Honomichl
  description: Ansible for Hardening a Linux Host
  company: Blackbaud
  license: MIT
  min_ansible_version: 1.2
  platforms:
    - name: EL
      versions:
        - all
    - name: Amazon
      versions:
        - all
    - name: Ubuntu
      versions:
        - all
  galaxy_tags:
    - security
    - linux
dependencies: []

tasks

The tasks directory is where all the tasks of the role are defined. With larger roles, it is recommended that separate task files are created for similar tasks and then called from the main.yml file via the include directive.

- name: Update Ubuntu
  apt:
    upgrade: dist
  when: ansible_distribution == "Ubuntu"
  tags: update_os

- include: "hostname.yml"

tests

The tests directory is included so that the role can be tested via Travis CI. No changes should need to be made to this directory.

vars

The vars directory is primarily used to define variables that differ across platforms or environments. For example, in Ubuntu the ssh service is called “ssh” while in EL it is called “sshd”. Rather than writing two handlers, we can write the handler as shown in the handlers section above, define ssh_service as a variable in files called Ubuntu.yml and CentOS.yml respectively, and then include the correct file in the tasks section with an include_vars call.

- name: Include OS Specific Variables
  include_vars: "{{ ansible_distribution }}.yml"
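The matching variable files are tiny. Following the ssh_service example above, they would look something like this:

# vars/Ubuntu.yml
ssh_service: ssh

# vars/CentOS.yml
ssh_service: sshd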

.travis.yml

The .travis.yml file is generated by default to allow the role to run in Travis CI. By default, it spins up a container, installs Ansible, and runs a syntax check against the code. If possible, it is recommended that you actually run the playbook by adding a second ansible-playbook command without the --syntax-check option.
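A sketch of what the script section ends up looking like, using the tests/test.yml and tests/inventory files from the skeleton; the second command is the suggested addition:

script:
  # Syntax check only, as generated by ansible-galaxy init
  - ansible-playbook tests/test.yml -i tests/inventory --syntax-check
  # Actually run the playbook against the container
  - ansible-playbook tests/test.yml -i tests/inventory --connection=local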

README.md

Once the role is completed, update the README.md file with the information requested. Make sure this document clearly defines how to use the role. Update the Author information with the following:

Blackbaud
Created in 2016 by [Blackbaud](http://blackbaud.com/)

Getting Started with Ansible Galaxy

I started using Ansible in December when I joined Blackbaud, and while I do feel like the team is doing some really innovative stuff with our Ansible roles, most of the stuff we do is pretty straightforward. After spending the last six months building up a monolithic repository, we are starting to examine how we can break the roles out into Ansible Galaxy roles that can better be shared across the company. I figured since I was documenting how to create a Galaxy compatible role for the company, I would go ahead and share those instructions here as well.

Create a GitHub Repository

The first step is to create an empty git repository on GitHub. This can be done by logging into your account and clicking on the ‘New repository’ button. We prefix all of our roles at Blackbaud with ‘ansible-role-’ so that we can easily distinguish them when looking at the hundreds of repos that we have. For example, the linux-hardening role that we have is called ‘ansible-role-linux-hardening’.
We create the repo without any files (.gitignore, README.md, or LICENSE) so that we can manage all of that properly later. We generally start all of our roles as private so that we have a chance to build them out and test them before we make them public.

Create the Local Repository

Once our empty repo has been created at GitHub, the next step is to create the Ansible Galaxy skeleton directory structure on a local machine. This is done with the ‘ansible-galaxy’ command.

ansible-galaxy init ansible-role-<ROLE-NAME>

Where <ROLE-NAME> is the name of the role (e.g., the assume-role role used in the example below). Once the files are created, you need to initialize a git repository, add the files, and push them up to GitHub.

cd ansible-role-assume-role
git init
git add .
git commit -m "Commit Skeleton Role"
git remote add origin https://github.com/blackbaud/ansible-role-assume-role.git
git push -u origin master

Now you can begin editing your role.