Its that time of the release cycle where you need to see if your system will actually stand up to people using it. Ideally this is the sort of thing you do as you are developing the solution (so the performance and load tests grow and evolve in parallel with the solution), but I didn’t get that luxury this time.
Now that I’m doing something new, that means another serious of blog posts about things that I didn’t understand before, and now partially understand! Mmmmmm, partial understanding, the goal of true champions.
At a high level, I want to break our new API. I want to find the point at which it just gives up, because there are too many users trying to do things, and I want to find out why it gives up. Then I want to fix that and do the whole process again, hopefully working my way up to how many users we are expecting + some wiggle room.
Over the next couple of weeks I’m going to be talking about JMeter (our load testing application of choice), Nxlog (a log shipping component for Windows), ELK (our log aggregator), AWS (not a huge amount) and some other things that I’ve done while working towards good, reproducible load tests.
Todays topic is something that has bugged me for a few weeks, ever since I implemented shipping logs from our API instances into our log aggregator.
Our log shipper of choice is Nxlog, being that we primarily use Windows servers. The configuration for Nxlog, the part that says what logs to process and how and where to send them, is/was installed at initialization time for the AWS instances inside our auto scaling group. This seemed like a good idea at the time, and it got us a working solution fairly quickly, which is always helpful.
Once I started actually using the data in the log aggregator though, I realised very quickly that I needed to make fairly frequent changes to the configuration, to either add more things being logged (windows event log for example) or to change fields or fix bugs. I needed to be able to deploy changes, and I didn’t want to have to do that by manually copying configuration files directly onto the X number of machines in the group. I certainly didn’t want to recreate the whole environment every time I wanted to change the log configuration either.
In essence, the configuration quickly became very similar to a piece of software. Something that is deployed, rather than just incorporated into the environment once when it is created. I needed to be able to version the configuration, easily deploy new versions and to track what versions were installed where.
Sounds like a job for Octopus Deploy.
Many Gentle Yet Strong Arms
I’ve spoken about Octopus Deploy a few times before on this blog, but I’ll just reiterate one thing.
Its goddamn amazing.
Honestly, I’m not entirely sure how I lived without it. I assume with manual deployment of software and poor configuration management.
We use Octopus in a fairly standard way. When the instances are created in our environment they automatically install an Octopus Tentacle, and are tagged with appropriate roles. We then have a series of projects in Octopus that are are configured to deploy various things to machines in certain roles, segregated by environment. Nothing special, very standard Octopus stuff.
If I wanted to leverage Octopus to manage my log configuration, I would need to package up my configuration files (and any dependencies, like Nxlog itself) into NuGet packages. I would then have to configure at least one project to deploy the package to the appropriate machines.
Tasty Tasty Nougat
I’ve used NuGet a lot before, but I’ve never really had to manually create a package. Mostly I just publish .NET libraries, and NuGet packaging is well integrated into Visual Studio through the usage of .csproj files. You don’t generally have to think about the structure of the resulting package much, as its taken care of by whatever the output of the project is. Even when producing packages for Octopus, intended to be deployed instead of just referenced from other projects, Octopack takes are of all the heavy lifting, producing a self contained package with all of the dependencies inside.
This time, however, there was no project.
I needed to package up the installer for Nxlog, along with any configuration files needed. I would also need to include some instructions for how the files should be deployed.
Luckily I had already written some scripts for an automated deployment of Nxlog, which were executed as part of environment setup, so all I needed to do was package all of the elements together, and give myself a way to easily create and version the resulting package (so a build script).
I created a custom nuspec file to define the contents of the resulting package, and relied on the fact that Octopus runs appropriately named scripts at various times during the deployment lifecycle. For example, the Deploy.ps1 script is run during deployment. The Deploy script acts as an entry point and executes other scripts to silently install Nxlog, modify and copy the configuration file to the appropriate location and start the Nxlog service.
Alas Octopus does not appear to have any concept of uninstallation, so there is no way to include a cleanup component in the package. This would be particularly useful when deploying new versions of the package that now run a new version of Nxlog for example. Instead of having to include the uninstallation script inside the new package, you could deploy the original package with clean up code already encapsulated inside, ready to be run when uninstallation is called for (for example, just before installing the new package).
There was only one issue left to solve.
How to select the appropriate configuration file based on where the package is being deployed? I assumed that there will be multiple configuration files, because different machines require different logging. This is definitely not a one-size fits all kind of thing.
In the end, the way in which I solved the configuration selection problem was a bit hacky. It worked (hacky solutions usually do), but it doesn’t feel nice and clean. I generally shy away from solutions like that, but as this continuous log configuration deployment was done while I was working on load/performance testing, I had to get something in place that worked, and I didn’t have time to sand off all the rough edges.
Essentially each unique configuration file gets its own project in Octopus, even though they all reference the same NuGet package. This project has a single step, which just installs the package onto the machines with the specified roles. Since machines could fill multiple roles (even if they don’t in our environments) I couldn’t use the current role to select the appropriate configuration file.
What I ended up doing was using the name of the project inside the Deploy script. The project name starts with NXLOG_ and the last part of the name is the name of the configuration file to select. Some simple string operations very quickly get the appropriate configuration file, and the project can target any number of roles and environments as necessary.
It definitely isn't the greatest solution, but it gets the job done. There are some other issues that I can foresee as well, like the package being installed multiple times on the same machine through different projects, which definitely won’t give the intended result. That's partially a factor of the way Nxlog works though, so I’m not entirely to blame.
In the end I managed to get something together that accomplished my goal. With the help of the build script (which will probably be incorporated into a TeamCity Build Configuration at some stage), I can now build, version and optionally deploy a package that contains Nxlog along with automated instructions on how and when to install it, as well as configuration describing what it should do. I can now make log changes locally on my machine, commit them, and have them appear on any number of servers in our AWS infrastructure within moments.
I couldn’t really ask for anything more.
I’ve already deployed about 50 tweaks to the log configuration since I implemented this continuous deployment, as a result of needing more information during load and performance testing, so I definitely think it has been worthwhile.
I like that the log configuration isn’t something that is just hacked together on each machine as well. Its controlled in source and we have a history of all configurations that were previously deployed, and can keep an eye on what package is active in what environments.
Treating the log configuration like I would treat an application was definitely the right choice.
I’ve knocked together a repository containing the source for the Nxlog package and all of the supporting elements (i.e. build scripts and tests). As always, I can’t promise that I will maintain it moving forward, as the actual repository I’m using is in our private BitBucket, but this should be enough for anyone interested in what I’ve written about above.
Next time: Load testing and JMeter. I write Java for the first time in like 10 years!