Automating Functional Tests, Part 01

December 9. 2014

In my last blog post, I mentioned the 3 classifications that I think tests fall into: Unit, Integration and Functional.

Of course, regardless of classification, all tests are only valuable if they are actually being executed. It’s wonderful to say you have tests, but if you’re not running them all the time, and actually looking at the results, they are worthless. Worse than worthless if you think about it, because the presence of tests gives a false sense of security about your system.

Typically executing Unit Tests (and Integration Tests if they are using the same framework) is trivial, made vastly easier by having a build server. It’s not that bad even if you don’t have a build server, because those sorts of tests can typically be run on a developer’s machine, without a huge amount of fanfare. The downside of not having a build server is that the developers in question need to remember to run the tests. As creative people, following a checklist that includes “wait for tests to run” is sometimes not our strongest quality.

Note that I’m not saying developers should not be running tests on their own machines, because they definitely should be. I would usually limit this to Unit tests though, or very self-contained Integration tests. You need to be very careful about complicating the process of actually writing and committing code if you want to produce features and improvements in a reasonable amount of time. It’s very helpful to encourage people to run the tests themselves regularly, but to also have a fallback position. Just in case.

Compared to running Unit and Integration tests, Functional tests are a different story. Regardless of your software, you’ll want to run your Functional tests in a controlled environment, and this usually involves spinning up virtual machines, installing software, configuring the software and so on. To get good test results, and to lower the risk that the results have been corrupted by previous test runs, you’ll want to use a clean environment each time you run the tests. Setting up and running the tests then becomes a time consuming and boring task, something that developers hate.

What happens when you give a developer a task to do that is time consuming and boring?

Automation happens.

Procedural Logic

Before you start doing anything, it’s helpful to have a high level overview of what you want to accomplish.

At a high level, the automated execution of functional tests needed to:

Set up a test environment.
- Spin up a fresh virtual machine.
- Install the software under test.
- Configure software under test.
Execute the functional tests.
Report the results.

Fairly straightforward. As with everything related to software though, the devil is in the details.

For anyone who doesn’t want to listen to me blather, here is a link to a GitHub repository containing sanitized versions of the scripts. Note that the scripts were not complete at the time of this post, but will be completed later.

Now, on to the blather!

Automatic Weapons

In order to automate any of the above, I would need to select a scripting language.

It would need to be able to do just about anything (which is true of most scripting languages), but would also have to be able to allow me to remotely execute a script on a machine without having to log onto it or use the UI in any way.

I’ve been doing a lot of work with Powershell recently, mostly using it to automate build, package and publish processes. I’d hesitated to learn Powershell for a long time, because every time I encountered something that I thought would have been made easier by using Powershell, I realised I would have to spend a significant amount of time learning just the basics of Powershell before I could do anything useful. I finally bit the bullet and did just that, and it’s snowballed from there.

Powershell is the hammer and everything is a nail now.

Obviously it is a well established scripting language and is installed on basically every modern version of Windows. Powerful by itself, its integration with the .NET framework allows a C# developer like me to fall back to the familiar .NET BCL for anything I can’t accomplish using just Powershell and its cmdlets. Finally, Powershell Remote Execution allows you to configure a machine so that authenticated users can remotely execute scripts on it.

So, Powershell it was.

A little bit more about Powershell Remote Execution. It leverages Windows Remote Management (WinRM), and once you’ve got all the bits and pieces set up on the target machine, it is very easy to use.

A couple of things to be aware of with remote execution:

By default the Windows Remote Management (WinRM) service is not enabled on some versions of Windows. Obviously this needs to be running.
Powershell Remote Execution communicates over port 5985 (HTTP) and 5986 (HTTPS). Earlier versions used 80 and 443. These ports need to be configured in the Firewall on the machine in question.
The user you are planning on using for the remote execution (and I highly suggest using a brand new user just for this purpose) needs to be a member of the [GROUP HERE] group.

Once you’ve sorted the things above, actually remotely executing a script can be accomplished using the Invoke-Command cmdlet, like so:

# Build a credential for the dedicated remote execution user.
$pw = ConvertTo-SecureString '[REMOTE USER PASSWORD]' -AsPlainText -Force
$cred = New-Object System.Management.Automation.PSCredential('[REMOTE USERNAME]', $pw)

# Open a remoting session to the test machine.
$session = New-PSSession -ComputerName $ipaddress -Credential $cred

Write-Host "Beginning remote execution on [$ipaddress]."

# Run the local script on the remote machine, passing the arguments positionally.
$testResult = Invoke-Command -Session $session -FilePath "$root\remote-download-files-and-run-functional-tests.ps1" -ArgumentList $awsKey, $awsSecret, $awsRegion, $awsBucket, $buildIdentifier

Notice that I don’t have to use a machine name at all. IP Addresses work fine in the ComputerName parameter. How do I know the IP address? That information is retrieved when starting the Amazon EC2 instance.

Environmental Concerns

In order to execute the functional tests, I wanted to be able to create a brand new, clean virtual machine without any human interaction. As I’ve stated previously, we primarily use Amazon EC2 for our virtualisation needs.

The creation of a virtual machine for functional testing would need to be done from another AWS EC2 instance, the one running the TeamCity build agent. The idea being that the build agent instance is responsible for building the software/installer, and would in turn farm out the execution of the functional tests to a completely different machine, to keep a good separation of concerns.

Amazon supplies two methods of interacting with AWS EC2 (Elastic Compute Cloud) via Powershell on a Windows machine.

The first is a set of cmdlets (Get-EC2Instance, New-EC2Instance, etc).

The second is the classes available in the .NET SDK for AWS.

The upside of running on an EC2 instance that was based off an Amazon supplied image is that both of those methods are already installed, so I didn’t have to mess around with any dependencies.

I ended up using a combination of both (cmdlets and .NET SDK objects) to get an instance up and running, mostly because the cmdlets didn’t expose all of the functionality that I needed.

There were 3 distinct parts to using Amazon EC2 for the test environment: Creation, Configuration and Waiting, and Clean Up. All of these needed to be automated.

Creation

Obviously an instance needs to be created. The reason this part is split from the Configuration and Waiting is because I’m still not all that accomplished at error handling and returning values in Powershell. Originally I had creation and configuration/waiting in the same script, but if the call to New-EC2Instance returned successfully and then something else failed, I had a hard time returning the instance information in order to terminate it in the finally block of the wrapping script.

The full content of the creation script is available at create-new-ec2-instance.ps1. It’s called from the main script (functional-tests.ps1).

Configuration and Waiting

Beyond the configuration done as part of creation, instances can be tagged to add additional information. Also, the script needs to wait on a number of important indicators to ensure that the instance is ready to be interacted with. It made sense to do these two things together for reasons.

The tags help to identify the instance (the name) and also mark the instance as being acceptable to be terminated as part of a scheduled cleanup script that runs over all of our EC2 instances in order to ensure we don’t run expensive instances longer than we expected to.
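As a rough illustration of the tagging step, here is a minimal sketch. The helper name New-TestInstanceTagList, the build identifier value and both tag keys are assumptions made for the example (the only thing taken as given is that EC2 shows a tag with the key Name as the instance name), so treat it as a starting point rather than the contents of the real script.

```powershell
# Hypothetical helper: builds the tag list for a functional test instance.
# The 'auto-terminate' key is an assumption; use whatever your cleanup script looks for.
function New-TestInstanceTagList {
    param(
        [Parameter(Mandatory = $true)][string]$BuildIdentifier
    )

    return @(
        @{ Key = 'Name'; Value = "functional-tests-$BuildIdentifier" },  # displayed as the instance name
        @{ Key = 'auto-terminate'; Value = 'true' }                      # assumed marker for scheduled cleanup
    )
}

# '1.0.0.42' is a made-up build identifier.
$tags = New-TestInstanceTagList -BuildIdentifier '1.0.0.42'

# With the AWS cmdlets loaded and $instanceId set, the list could be applied with:
# New-EC2Tag -Resource $instanceId -Tag $tags
```

Building the list separately from applying it keeps the tag names in one place, which is handy when the cleanup script and the test script need to agree on them.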

As for the waiting indicators, the first indicator is whether or not the instance is running. This is an easy one, as the state of the instance is very easy to get at. You can see the function below, but all it does is poll the instance every 5 seconds to check whether or not it has entered the desired state yet.
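A polling loop of this kind can be sketched as follows. This is not the function from the real script, just a minimal version under some assumptions: the name Wait-ForCondition, the timeout parameter and its default are all invented for the example, and the condition is passed in as a script block so the loop itself does not depend on AWS.

```powershell
# Hypothetical helper: polls a condition until it is true or a timeout expires.
function Wait-ForCondition {
    param(
        [Parameter(Mandatory = $true)]
        [scriptblock]$Condition,       # should output $true once the desired state is reached
        [int]$IntervalSeconds = 5,     # the 5 second poll interval described above
        [int]$TimeoutSeconds = 600     # assumed upper bound so a build cannot hang forever
    )

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        if (& $Condition) { return $true }
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $IntervalSeconds
    }
}

# Example usage, assuming the AWS cmdlets are loaded and $instanceId is set:
# $isRunning = Wait-ForCondition -Condition {
#     (Get-EC2Instance -InstanceId $instanceId).Instances[0].State.Name.Value -eq 'running'
# }
# if (-not $isRunning) { throw "Instance [$instanceId] did not start running in time." }
```

Returning $false on timeout (instead of looping forever) lets the calling script fail the build and still reach its clean up step.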

The second indicator is a bit harder to get at, but is actually much more important. EC2 instances can be configured with status checks, and one of those status checks is whether or not the instance is actually reachable. I’m honestly not sure if this is something that someone before me set up, or if it is standard on all EC2 instances, but it’s extremely useful.

Anyway, accessing this status check is a bit of a rabbit hole. You can see the function below, but it uses a similar approach to the running check. It polls some information about the instance every 5 seconds until it meets certain criteria. This is the one spot in the entire script that I had to use the .NET SDK classes, as I couldn’t find a way to get this information out of a cmdlet.
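The decision at the heart of that check can be sketched like this. It is an illustration only, not the function from the real script: the name Test-ReachabilityPassed is made up, and the object shape is an assumption modelled on the InstanceStatus objects that the .NET SDK's DescribeInstanceStatus call returns (a Status with a list of Details, each having a Name and a Status).

```powershell
# Hypothetical helper: returns $true when a status object reports that the
# reachability check has passed. The property names follow the assumed shape
# of Amazon.EC2.Model.InstanceStatus (Status.Details[].Name / .Status).
function Test-ReachabilityPassed {
    param($InstanceStatus)

    if ($null -eq $InstanceStatus) { return $false }   # nothing reported yet

    foreach ($detail in @($InstanceStatus.Status.Details)) {
        # Converting to strings lets this handle plain strings as well as SDK constant objects.
        if ("$($detail.Name)" -eq 'reachability' -and "$($detail.Status)" -eq 'passed') {
            return $true
        }
    }
    return $false
}

# Stand-in object for illustration; a real one would come from the SDK.
$fakeStatus = [pscustomobject]@{
    Status = [pscustomobject]@{
        Details = @([pscustomobject]@{ Name = 'reachability'; Status = 'passed' })
    }
}
$reachable = Test-ReachabilityPassed $fakeStatus
```

In a real script the status object would be fetched from the SDK every 5 seconds, and the polling would stop once this function returned $true.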

The full content of the configuration and wait script is available at tag-and-wait-for-ec2-instance.ps1, and is just called from the main script.

Clean Up

Since you don’t want to leave instances hanging around, burning money, the script needs to clean up after it is done.

Programmatically terminating an instance is quite easy, but I had a lot of issues around the robustness of the script itself, as I couldn’t quite grasp the correct path to ensure that a clean up was always run if an instance was successfully created. The solution to this was to split the creation and tag/wait into different scripts, to ensure that if creation finished it would always return identifying information about the instance for clean up.

Termination happens in the finally block of the main script (functional-tests.ps1).
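The shape of that guarantee can be sketched as below. This is a simplified illustration, not the real functional-tests.ps1: the wrapper name Invoke-WithCleanup is invented, and the three steps are passed in as script blocks so the pattern can be shown without touching AWS.

```powershell
# Hypothetical wrapper: runs Create, then Work, and always runs Cleanup
# if Create returned something, even when Work throws.
function Invoke-WithCleanup {
    param(
        [Parameter(Mandatory = $true)][scriptblock]$Create,   # returns the instance id
        [Parameter(Mandatory = $true)][scriptblock]$Work,     # tag, wait, run the tests
        [Parameter(Mandatory = $true)][scriptblock]$Cleanup   # terminates the instance
    )

    $instanceId = $null
    try {
        $instanceId = & $Create
        & $Work $instanceId
    }
    finally {
        # Only clean up when creation got far enough to hand back an id.
        if ($null -ne $instanceId) {
            & $Cleanup $instanceId
        }
    }
}

# Example usage; the argument passed to the tag/wait script is an assumption:
# Invoke-WithCleanup `
#     -Create  { & "$root\create-new-ec2-instance.ps1" } `
#     -Work    { param($id) & "$root\tag-and-wait-for-ec2-instance.ps1" $id } `
#     -Cleanup { param($id) Remove-EC2Instance -InstanceId $id -Force }
```

Because the id is assigned as soon as creation returns, a failure in any later step still leaves enough information in the finally block to terminate the instance.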

Instant Machine

Of course all of the instance creation above is dependent on actually having an AMI (Amazon Machine Image) available that holds all of the baseline information about the instance to be created, as well as other things like VPC (Virtual Private Cloud, basically how the instance fits into a network) and security groups (for defining port accessibility). I’d already gone through this process last time I was playing with EC2 instances, so it was just a matter of identifying the various bits and pieces that need to be done on the machine in order to make it work, while keeping it as clean as possible in order to get good test results.

I went through the image creation process a lot as I evolved the automation script. One thing I found to be useful was to create a change log for the machine in question (I used a page in Confluence) and to version any images made. This helped me to keep the whole process repeatable, as well as documenting the requirements of a machine able to perform the functional tests.

To Be Continued

I think that’s probably enough for now, so next time I’ll continue and explain about automating the installation of the software under test and then actually running the tests and reporting the results.

Until next time!

Classifying Automated Tests

December 2. 2014

Ahhhh automated tests. I first encountered the concept of automated tests 6-7 years ago via a colleague experimenting with NUnit. I wasn’t overly impressed at first. After all, your code should just work, you shouldn’t need to prove it. It’s safe to say I was a bad developer.

Luckily logic prevailed, and I soon came to accept the necessity of writing tests to improve the quality of a piece of software. It’s like double-entry bookkeeping: the tests provide checks and balances for your code, giving you more than one indicator as to whether or not it is working as expected.

Notice that I didn’t say that they prove your code is doing what it is supposed to. In the end tests are still written by a development team, and the team can still misunderstand what is actually required. They aren’t some magical silver bullet that solves all of your problems, they are just another tool in the tool box, albeit a particularly useful one.

Be careful when writing your tests. It’s very easy to write tests that actually end up making your code less able to respond to change. It can be very disheartening to go to change the signature of a constructor and to hit hundreds of compiler errors because someone helpfully wrote 349 tests that all use the constructor directly. I’ve written about this specific issue before, but in more general terms you need to be very careful about writing tests that hurt your codebase instead of helping it.

I’m going to assume that you are writing tests. If not, you’re probably doing it wrong. Unit tests are a good place to start for most developers, and I recommend The Art of Unit Testing by Roy Osherove.

I like to classify my tests into 3 categories: Unit, Integration and Functional.

Unit

Unit tests are isolationist, kind of like a paranoid survivalist. They don’t rely on anyone or anything, only themselves. They should be able to be run without instantiating any class but themselves, and should be very fast. They tend to exercise specific pieces of functionality, often at a very low level, although they can also encompass verifying business logic. This is less likely though, as business logic typically involves multiple classes working together to accomplish a higher level goal.

Unit tests are the lowest value tests for verifying that your piece of software works from an end-user’s point of view, purely because of their isolationist stance. It’s entirely plausible to have an entire suite of hundreds of unit tests passing and still have a completely broken application (it’s unlikely though).

Their true value comes from their speed and their specificity.

Typically I run my unit tests all the time, as part of a CI (Continuous Integration) environment, which is only possible if they run quickly, to tighten the feedback loop. Additionally, if a unit test fails, the failure should be specific enough that it is obvious why the failure occurred (and where it occurred).

I like to write my unit tests in the Visual Studio testing framework, augmented by FluentAssertions (to make assertions clearer), NSubstitute (for mocking purposes) and Ninject (to avoid creating a hard dependency on constructors, as previously described).

Integration

Integration tests involve multiple components working in tandem.

Typically I write integration tests to run at a level just below the User Interface and make them purely programmatic. They should walk through a typical user interaction, focusing on accomplishing some goal, and then checking that the goal was appropriately accomplished (i.e. changes were made or whatnot).

I prefer integration tests to not have external dependencies (like databases) but sometimes that isn’t possible (you don’t want to mock an entire API for example) so it’s best if they operate in a fashion that isn’t reliant on external state.

This means that if you’re talking to an API for example, you should be creating, modifying and deleting appropriate records for your tests within the tests themselves. The same can be said for a database, create the bits you want, clean up after yourself.

Integration tests are great for indicating whether or not multiple components are working together as expected, and for verifying that at whatever programmable level you have introduced the user can accomplish their desired goals.

Often integration tests like I have described above are incredibly difficult to write on a system that does not already have them. This is because you need to accommodate the necessary programmability layer into the system design for the tests. This layer has to exist because historically programmatically executing most UI layers has proven to be problematic at best (and impossible at worst).

The downside is that they are typically much, much slower than unit tests, especially if they are dependent on external resources. You wouldn’t want to run them as part of your CI, but you definitely want to run them regularly (at least nightly, but I like midday and midnight) and before every release candidate.

I like to write my Integration tests in the same testing framework as my unit tests, still using FluentAssertions and Ninject, with as little usage of NSubstitute as possible.

Functional

Functional tests are very much like integration tests but they have one key difference: they execute on top of whatever layer the user typically interacts with. Whether that is some user interface framework (WinForms, WPF) or a programmatically accessible API (like ASP.NET Web API), the tests focus on automating normal user actions as the user would typically perform them, with the assistance of some automation framework.

I’ll be honest, I’ve had the least luck with implementing these sorts of tests, because the technologies that I’ve personally used the most (CodedUI) have proven to be extremely unreliable. Functional tests written on top of a public facing programmable layer (like an API) I’ve had a lot more luck with, unsurprisingly.

The worst outcome of a set of tests is regular, unpredictable failures that have no bearing on whether or not the application is actually working from the point of view of the user. Changing the names of things or just text displayed on the screen can lead to all sorts of failures in automated functional tests. You have to be very careful to use automation friendly meta information (like automation IDs) and to make sure that those pieces of information don’t change without good reason.

Finally, managing automated functional tests can be a chore, as they are often quite complicated. You need to manage this code (and it is code, so it needs to be treated like a first class citizen) as well as, if not better than, your actual application code. Probably better, because if you let it atrophy, it will very quickly become useless.

Regardless, functional tests can provide some amount of confidence that your application is actually working and can be used. Once implemented (and maintained) they are far more repeatable than someone performing a set of steps manually.

Don’t think that I think manual testers are not useful in a software development team. Quite the contrary. I think that they should be spending their time and applying their experience to more worthwhile problems, like exploratory testing as opposed to simply being robots following a script. That's why we have computers after all.

I have in the past used CodedUI to write functional tests for desktop applications, but I can’t recommend it. I’ve very recently started using TestComplete, and it seems to be quite good. I’ve heard good things about Selenium, but have never used it myself.

Naming

Your tests should be named clearly. The name should communicate the situation and the expected outcome.

For unit tests I like to use the following convention:

[CLASS_NAME]_[CLASS_COMPONENT]_[DESCRIPTION_OF_TEST]

An example of this would be:

DefaultConfigureUsersViewModel_RegisterUserCommand_WhenNewRegisteredUsernameIsEmptyCommandIsDisabled

I like to use the class name and class component so that you can easily see exactly where the test is. This is important when you are viewing test results in an environment that doesn't support grouping or sorting (like in the text output from your tests on a build server or in an email or something).

The description should be easily readable, and should convey to the reader an indication of the situation (When X) and the expected outcome.

For integration tests I tend to use the following convention:

I_[FEATURE]_[DESCRIPTION_OF_TEST]

An example of this would be:

I_UserManagement_EndUserCanEnterTheDetailsOfAUserOfTheSystemAndRegisterThemForUseInTheRestOfTheApplication

As I tend to write my integration tests using the same test framework as the unit tests, the prefix is handy to tell them apart at a glance.

Functional tests are very similar to integration tests, but as they tend to be written in a different framework the prefix isn't necessary. As long as they have a good, clear description.

There are other things you can do to classify tests, including using the [TestCategory] attribute (in MSTest at least), but I find good naming to be more useful than anything else.

Organisation

My experience is mostly limited to C# and the .NET framework (with bits and pieces of other things), so when I speak of organisation, I’m talking primarily about solution/project structures in Visual Studio.

I like to break my tests into at least 3 different projects.

[COMPONENT].Tests
[COMPONENT].Tests.Unit
[COMPONENT].Tests.Integration

The root tests project is to contain any common test utilities or other helpers that are used by the other two projects, which should be self explanatory.

Functional tests tend to be written in a different framework/IDE altogether, but if you’re using the same language/IDE, the naming convention to follow for the functional tests should be obvious.

Within the projects it’s important to name your test classes to match up with your actual classes, at least for unit tests. Each unit test class should be named the same as the actual class being tested, with a suffix of UnitTests. I like to do a similar thing with IntegrationTests, except the name of the class is replaced with the name of the feature (i.e. UserManagementIntegrationTests). I find that a lot of the time integration tests tend to

Tying it All Together

Testing is one of the most powerful tools in your arsenal, having a major impact on the quality of your code. And yet, I find that people don’t tend to give it a lot of thought.

The artefacts created for testing should be treated with the same amount of care and thoughtfulness as the code that is being tested. This includes things like having a clear understanding of the purpose and classification of a test, naming and structure/position.

I know that most of the above seems a little pedantic, but I think that having a clear convention to follow is important so that developers can focus their creative energies on the important things, like solving problems specific to your domain. If you know where to put something and approximately what it looks like, you reduce the cognitive load in writing tests, which in turn makes them easier to write.

I like it when things get easier.

Champion your Code

October 7. 2014

Fellow developers, this is a call to arms! Too often I see developers who immediately cave to external pressure to “just get it done”. “Just” is a dangerous word when it comes to software development anyway. Don’t do it! Stand up for your code, don’t compromise on quality just because someone is breathing down your neck. Take a few moments to think about the impact of what you are being asked to do, both to you (as the current developer) and to others (mostly future developers).

No-one wants to work inside crap code, but people seem to be more than happy to create it under pressure. Sure you might think that you’ll just fix it up next time, but I’ve found that that hardly ever happens.

Do you think that overbearing project manager is ever going to have to suffer through maintaining or refactoring code that “there wasn’t enough time to test” or that is unclear as a result of “we need to just get this done now, we’ll refactor it later”? Nope. The project manager’s job is to get the project to some previously decided definition of done and they may be more than willing to sacrifice fuzzy things like “code quality”, “readability” and sometimes even “test coverage” or “scalability” in exchange for things that they are directly measured against like “deadlines” and “fixed scope”.

The representatives of the business will fight hard for what they believe is important. As a developer you should fight just as hard for the code. If the business representative is fighting for the “just get it done” point of view, you should fight just as hard for quality, readability, maintainability, cleanliness, and all of those other things that good code has. Stand up for what you believe in and make sure that you’re creating code that is pleasant to work with, otherwise you're going to start hating your job, and nobody wants that to happen.

I’m not prescribing that you just do what you want. That’s silly. You can’t just spend 3 months coming up with the most perfect architecture for solving problem X, or you might be surprised when you come up for air just in time to get made redundant because the organisation doesn’t have any money left.

What I am prescribing is that you fight just as hard for the code as other people fight for other things (like deadlines, estimates, contract sign-off, scope negotiation, etc).

As far as I can see there are two approaches to championing your code.

Communicate

Never attribute to malice that which is adequately explained by stupidity.
Robert J Hanlon

You (as a professional developer) know why quality is important, but other people in the organization might not. They might not realise the impact of what they are (either implicitly or explicitly) asking you to do.

You need to communicate the importance of taking the time to create quality to the people who can actually make decisions. Of course, you need to be able to talk to those people in the language that they understand, so you need to be able to speak about things like “business impact”, “cost of ownership” and “employee productivity”. It’s generally not a good idea to go to someone in Senior Management (even if they are technical) and start ranting about how the names of all of your classes and methods look like a foreign language because they are so terrible.

If your immediate superior doesn’t care, then go over their head until you reach someone who does. If you don’t find someone who cares, look for a better job.

Obviously be careful when doing this, as you could land yourself in some very awkward situations. At the very least, always be aware that you should show as much loyalty to your organisation as they would show you.

Do It Anyway

The conventional army loses if it does not win. The guerrilla wins if he does not lose.
Henry A Kissinger

If you feel like you are trying to communicate with a brick wall when it comes to talking about quality, you might need to engage in some “guerrilla craftsmanship”. Don’t explain exactly what you are doing in the code, and keep following good development practices. If you’re asked for estimates, include time for quality. Get buy-in from the rest of your fellow developers, create a shared coding standard. Implement code reviews. Ensure that code is appropriately covered by tests. To anyone outside the team, this is just what the team is doing. There’s no negotiation, you aren’t asking to develop quality solutions, you’re just doing it.

When that external pressure comes around again, and someone tries to get you to “just get it done”, resist. Don’t offer any additional information, this is just how long it’s going to take. If you are consistent with your approach to quality, you will start delivering regularly, and that external pressure will dissipate.

Be mindful, this is a dangerous route, especially if you are facing a micro-manager or someone who does not trust developers. As bad as it sounds, you may need to find someone to “distract” or “manage” that person, kind of like dangling some keys in front of a baby, while you do what you need to do.

Conclusion

Fight for your code.

No one else will.