It’s time to fix that whole shared infrastructure issue.

I need to farm my load tests out to AWS, and I need to do it in a way that won’t accidentally murder our current production servers. In order to do that, I need to forge the replacement for our manually created and configured proxy box. A nice, codified, auto-scaling proxy environment.

Back into CloudFormation I go.

I contemplated simply using a NAT box instead of a proxy, but decided against it because:

  • We already use a proxy, so assuming my template works as expected it should be easy enough to slot in,
  • I don’t have any experience with NAT boxes (I’m pretty weak on networking in general actually),
  • Proxies scale better in the long run, so I might as well sort that out now.

Our current lone proxy machine is a Linux instance with Squid manually installed on it. It was set up some time before I started, by someone who no longer works at the company. An excellent combination: I’m already a bit crap at Linux, and now I can’t even ask anyone how it was put together or what sort of tweaks were made to it over time as failures were encountered. Time to start from scratch. The proxy itself is sound enough, and I have some experience with Squid, so I’ll stick with it. As for the OS, while I know that Linux would likely be faster with less overhead, I’m far more comfortable with Windows, so to hell with Linux for now.

Here’s the plan: create a CloudFormation template for the actual environment (Load Balancer, Auto Scaling Group, Instance Configuration, DNS Record), and also create a NuGet package, deployed via Octopus, that installs and configures the proxy.

I’ve always liked the idea of never installing software manually, but it’s only been recently that I’ve had access to the tools to accomplish that. Octopus, NuGet and Powershell form a very powerful combination for managing deployments on Windows. I have no idea what the equivalent is for Linux, but I’m sure there is something. At some point in the future Octopus is going to offer the ability to do SSH deploys, which will allow me to include more Linux infrastructure (or manage existing Linux infrastructure even better, I’m looking at you ELK stack).

Save the Environment

The environment is pretty simple. A Load Balancer hooked up to an Auto Scaling Group, whose instances are configured to do some simple setup (including using Octopus to deploy some software), and a DNS record so that I can refer to the load balancer in a nice way.
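
To give a sense of the shape of it, here is a minimal sketch of how a stack like this gets created using the AWS Tools for Powershell. The stack name, template path and parameter key are placeholders for illustration, not the real values from my template.

# A minimal sketch of environment creation, assuming the AWS Tools for Powershell are installed.
# The stack name, template path and parameter key are illustrative only.
Import-Module AWSPowerShell

$templateBody = Get-Content -Path ".\proxy-environment.cloudformation.template" -Raw

$stackId = New-CFNStack -StackName "squid-proxy-load-test" `
    -TemplateBody $templateBody `
    -Parameter @( @{ ParameterKey = "OctopusEnvironment"; ParameterValue = "Load-Test" } )

# Poll until CloudFormation reaches a terminal state. This is the 20+ minute wait mentioned later.
do
{
    Start-Sleep -Seconds 30
    $stack = Get-CFNStack -StackName $stackId
} while ($stack.StackStatus -like "*IN_PROGRESS*")

Write-Host "Stack creation finished with status [$($stack.StackStatus)]"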

I’ve done enough of these simple sorts of environments now that I didn’t really run into any interesting issues. Don’t get me wrong, they aren’t trivial, but I wasn’t stuck smashing my head against a desk for a few days while I sorted out some arcane problem that ended up being related to case sensitivity or something ridiculous like that.

One thing that I have learned is to set up the Octopus project that will be deployed during environment setup ahead of time. Give it some trivial content, like running a Powershell script, and then make sure it deploys correctly during the startup of the instances in the Auto Scaling Group. If you try to sort out the package and its deployment at the same time as the environment, you’ll probably run into situations where the environment setup technically succeeded, but the whole thing is marked as failed because the deployment of the package failed, and you have to wait another 20 minutes to try again. It really saves a lot of time to create the environment in such a way that you can extend it with deployments later.
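
For context, here is a rough sketch of the part of the instance startup script that handles the Octopus side of things. The server URL, API key, instance name, role, environment and project name are all placeholders; the point is just the register-then-deploy pattern.

# Rough sketch of the Octopus portion of the instance startup (UserData) script.
# Server URL, API key, instance name, role, environment and project name are placeholders.
$octopusServer = "http://octopus.internal.example.com"
$octopusApiKey = "API-XXXXXXXXXXXXXXXX"

# Register this instance with the Octopus server so it can receive deployments.
& "C:\Program Files\Octopus Deploy\Tentacle\Tentacle.exe" register-with `
    --instance "Tentacle" `
    --server $octopusServer `
    --apiKey $octopusApiKey `
    --role "proxy" `
    --environment "Load-Test" `
    --comms-style TentaclePassive `
    --console

# Deploy the (initially trivial) proxy project to this environment and wait for the result,
# so a failed deployment fails the environment setup instead of being silently ignored.
& ".\tools\Octo.exe" deploy-release `
    --project "Squid Proxy" `
    --deployto "Load-Test" `
    --version latest `
    --server $octopusServer `
    --apikey $octopusApiKey `
    --waitfordeployment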

Technically you could also make it so failing deployments don’t fail an environment setup, but I like my environments to work when they are “finished”, so I’m not really comfortable with that in the long run.

The only tricky things about the proxy environment are making sure that you set up your security groups appropriately so that the proxy port can be accessed, and making sure that you use the correct health check for the load balancer (for Squid at least, a TCP check on port 3128, the default Squid port, works well).
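
In my case that health check lives in the CloudFormation template, but expressed via the AWS Powershell cmdlets the equivalent setting looks something like the sketch below. The load balancer name, intervals and thresholds are made up for illustration.

# Sketch of the equivalent health check configuration via the AWS Tools for Powershell.
# The load balancer name, intervals and thresholds are illustrative only.
Set-ELBHealthCheck -LoadBalancerName "squid-proxy-elb" `
    -HealthCheck_Target "TCP:3128" `
    -HealthCheck_Interval 30 `
    -HealthCheck_Timeout 5 `
    -HealthCheck_HealthyThreshold 2 `
    -HealthCheck_UnhealthyThreshold 5

# The proxy security group also needs an ingress rule allowing TCP on 3128 from whatever
# is going to use the proxy (the load balancer and/or the subnets your clients live in).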

That's a Nice Package

With the environment out of the way, it’s time to set up the package that will be used to deploy Squid.

Squid is available on Windows via Diladele. Since 99% of our systems are 64 bit, I just downloaded the 64 bit MSI. Using the same structure that I used for the Nxlog package, I packaged up the MSI and some supporting scripts, making sure to version the package appropriately. Consistent versioning is important, so I use the same versioning strategy that I use for our software components: include a SharedAssemblyInfo file and then mutate that file via some common versioning Powershell functions.
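
The packaging step itself boils down to something like the following sketch. The nuspec name and output directory are placeholders, and Get-Version is a stand-in for whatever function reads the version out of the SharedAssemblyInfo file.

# Sketch of the packaging step. Get-Version is a stand-in for the common versioning
# functions mentioned above; the nuspec name and output directory are placeholders.
$version = Get-Version -SharedAssemblyInfoPath ".\SharedAssemblyInfo.cs"

& ".\tools\nuget.exe" pack ".\SquidProxy.nuspec" `
    -Version $version `
    -OutputDirectory ".\output" `
    -NoPackageAnalysis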

Apart from the installation of Squid itself, I also included the ability to deploy a custom configuration file. The main reason I did this was so that I could replicate our current Squid proxy config exactly, because I’m sure it does things that have been built up over the last few years that I don’t understand. I did this in a similar way to how I did config deployment for Nxlog and Logstash. Essentially, a set of configuration files is included in the NuGet package, and the correct one is chosen at deployment time based on some configuration within Octopus.
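
As a sketch, the selection logic inside the package’s deployment script looks something like this. The config directory layout and the Squid install path are assumptions for illustration; the environment name comes from the standard Octopus variables.

# Sketch of the config selection inside the package's Deploy.ps1.
# The config directory layout and the Squid install path are assumptions for illustration.
$environment = $OctopusParameters["Octopus.Environment.Name"]
$configSource = Join-Path $PSScriptRoot "configs\squid.$environment.conf"
$configDestination = "C:\Squid\etc\squid\squid.conf"

if (-not (Test-Path $configSource))
{
    throw "No Squid configuration file found for environment [$environment] at [$configSource]."
}

Copy-Item -Path $configSource -Destination $configDestination -Force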

I honestly don’t remember if I had any issues with creating the Squid proxy package, but I’m sure that if I had, they would still be fresh in my mind. MSIs are easy to install silently with MSIEXEC once you know the arguments, and the Squid installer for Windows is pretty reliable. I really do think it was straightforward, especially considering that I was following the same pattern that I’d used to install an MSI via Octopus previously.
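
For completeness, a minimal sketch of that silent install step (the MSI file name and log path are placeholders):

# Minimal sketch of the silent MSI install. The MSI file name and log path are placeholders.
$msi = Join-Path $PSScriptRoot "squid.msi"
$installLog = Join-Path $env:TEMP "squid-install.log"

$arguments = "/i `"$msi`" /qn /norestart /l*v `"$installLog`""
$process = Start-Process -FilePath "msiexec.exe" -ArgumentList $arguments -Wait -PassThru

if ($process.ExitCode -ne 0)
{
    throw "Squid MSI install failed with exit code [$($process.ExitCode)]. See [$installLog] for details."
}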

Delivering the Package

This is standard Octopus territory. Create a project to represent the deployable component, target it appropriately at machines in roles and then deploy to environments. Part of the build script that is responsible for putting together the package above can also automatically deploy it via Octopus to an environment of your choice.
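
The tail end of that build script looks roughly like the sketch below. The feed URL, project name and API key are placeholders, and $version comes from the packaging step earlier in the script.

# Sketch of the deployment portion of the build script. The feed URL, project name
# and API key are placeholders; $version comes from the packaging step above.
$octopusServer = "http://octopus.internal.example.com"
$octopusApiKey = "API-XXXXXXXXXXXXXXXX"

& ".\tools\nuget.exe" push ".\output\SquidProxy.$version.nupkg" `
    -Source "$octopusServer/nuget/packages" `
    -ApiKey $octopusApiKey

& ".\tools\Octo.exe" create-release `
    --project "Squid Proxy" `
    --version $version `
    --packageversion $version `
    --deployto "CI" `
    --server $octopusServer `
    --apikey $octopusApiKey `
    --waitfordeployment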

In TeamCity, we typically do an automatic deploy to CI on every checkin (gated by passing tests), but for this project I had to hold off. We’re actually running low on Build Configurations right now (I’ve already put in a request for more, but the wheels of bureaucracy move pretty slowly), so I skipped setting one up for the Squid proxy. Once we get some more available configurations I’ll rectify that situation.

Who Could Forget Logs?

The final step in deploying and maintaining anything is to make sure that the logs from the component are being aggregated correctly, so that you don’t have to go to the machine(s) in question to see what’s going on, and so you have a nice pile of data to do analysis on later. Space is cheap after all, so you might as well store everything, all the time (except media).

Squid features a nice access log with a well-known format, which is perfect for this sort of log processing and aggregation.

Again, using the same sort of approach that I’ve used for other components, I quickly knocked up a Logstash config for parsing the log file and deployed it (and Logstash) to the same machines as the Squid proxy installations. I’ll include that config here, because it lives in a different repository to the rest of the Squid stuff (it would live in the Solavirum.Logging.Logstash repo, if I updated it).

input {
    file {
        path => "@@SQUID_LOGS_DIRECTORY/access.log"
        type => "squid"
        start_position => "beginning"
        sincedb_path => "@@SQUID_LOGS_DIRECTORY/.sincedb"
    }
}

filter {
    if [type] == "squid" {
        grok {
            match => [ "message", "%{NUMBER:timestamp}\s+%{NUMBER:TimeTaken:int} %{IPORHOST:source_ip} %{WORD:squid_code}/%{NUMBER:Status} %{NUMBER:response_bytes:int} %{WORD:Verb} %{GREEDYDATA:url} %{USERNAME:user} %{WORD:squid_peerstatus}/(%{IPORHOST:destination_ip}|-) %{GREEDYDATA:content_type}" ]
        }
        date {
            match => [ "timestamp", "UNIX" ]
            remove_field => [ "timestamp" ]
        }
    }
     
    mutate {
        add_field => { "SourceModuleName" => "%{type}" }
        add_field => { "Environment" => "@@ENVIRONMENT" }
        add_field => { "Application" => "SquidProxy" }
        convert => [ "Status", "string" ]
    }
    
    # This last common mutate deals with the situation where Logstash was creating a custom type (and thus different mappings) in Elasticsearch
    # for every type that came through. The default "type" is logs, so we mutate to that, and the actual type is stored in SourceModuleName.
    # This is a separate step because if you try to do it with the SourceModuleName add_field it will contain the value of "logs" which is wrong.
    mutate {
        update => [ "type", "logs" ]
    }
}

output {
    tcp {
        codec => json_lines
        host => "@@LOG_SERVER_ADDRESS"
        port => 6379
    }
    
    #stdout {
    #    codec => rubydebug
    #}
}
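
The @@ tokens in that config are placeholders that get substituted at deployment time. Here is a rough sketch of what that substitution looks like, assuming the values come from Octopus variables (the variable name other than the standard environment name, and the destination paths, are made up for illustration).

# Sketch of the token substitution performed when the Logstash config is deployed.
# The token names match the config above; the Octopus variable "LogServerAddress" and the
# destination paths are assumptions for illustration.
$config = Get-Content -Path ".\squid-logstash.conf" -Raw

$config = $config -replace "@@SQUID_LOGS_DIRECTORY", "C:/Squid/var/log/squid"
$config = $config -replace "@@ENVIRONMENT", $OctopusParameters["Octopus.Environment.Name"]
$config = $config -replace "@@LOG_SERVER_ADDRESS", $OctopusParameters["LogServerAddress"]

Set-Content -Path "C:\logstash\conf\squid-logstash.conf" -Value $config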

I Can Never Think of a Good Title for the Summary

For reference purposes I’ve included the entire Squid package/environment setup code in this repository. Use as you see fit.

As far as environment setups go, this one was pretty much by the numbers. No major blockers or time wasters. It wasn’t trivial, and it still took me a few days of concentrated effort, but the issues I did have were pretty much just me making mistakes (like setting up security group rules incorrectly, failing to tag instances correctly in Octopus, or failing to test the Squid install script locally before I deployed it). The slowest part is definitely waiting for the environment creation to either succeed or fail, because it can take 20+ minutes for the thing to run from start to finish. I should look into making that faster somehow, as I get distracted during those 20 minutes.

Really the only reason for the lack of issues was that I’d done all of this sort of stuff before, and I tend to make my stuff reusable. It was a simple matter to plug everything together in the configuration that I needed, no need to reinvent the wheel.

Though sometimes you do need to smooth the wheel a bit when you go to use it again.