
As a result of being a tutor at QUT (Agile Project Management, IAB304 (undergraduate), IFN700 (post-graduate)), I get some perks. Nothing as useful as the ability to teleport at will or free candy, but I do occasionally get the opportunity to do things I would not normally do. Most recently, they offered me a place in a course to increase my teaching effectiveness. Honestly, it’s a pretty good idea, because better teachers = happier, smarter students, which I’m sure ends up in QUT getting more money at some point. Anyway, the course is called “Foundations of Learning and Teaching” and it’s been run sporadically over the last month or two (one Monday for 3 hours here, another Monday there, etc).

As you would expect from a University course, there is a piece of assessment.

I’m going to use this blog post to kill two problems with one idea, mostly because killing birds seems unnecessary (and I have bad aim with rocks, so it would be more like “breaking my own windows and then paying for them”). It will function as a mechanism and record for completing the assessment and as my weekly blog post. Efficient.

Anyway, the assessment was to come up with some sort of idea to provide support to students, increase engagement or build a learning community, within my teaching context (so my tutorial).

I’ve cheated at least a little bit here, because technically I was already doing something to increase engagement, but I’ll be damned if I’m not going to use it just because I thought of it before doing the course.

We do retrospectives at the end of each workshop.

Mirror Magic

If you’ve ever had anything to do with Agile (Scrum or any other framework), you will likely be very familiar with the concept of retrospectives. Part of being agile is to make time for continual improvement, so that you’re getting at least a little bit better all the time. One of the standard mechanisms for doing this is to put aside some time at the end of every sprint/iteration to think about how everything went and what could be improved.

I’ve been practising agile concepts for a while now, so the concept is pretty ingrained in most things I do, but I still find it very useful for capping off any major effort and helping to focus on ways to get better at whatever you want.

In the context of the workshops at QUT, I treat each workshop as a “sprint”. They start with a short planning session, sometimes feature the grooming of future workshop content in the middle and always end with a retrospective.

While I think the whole picture (running the workshops as if they were sprints) is useful, I’m just going to zero in on the retrospective part, specifically as a mechanism for both increasing engagement and for building a community that treats self-improvement as a normal part of working.

The real meat of the idea is to encourage critical thinking beyond the course material. Each workshop is always filled with all sorts of intellectual activity, but none of it is focused around the process of learning itself. By adding a piece of dedicated time to the end of every workshop, and facilitating the analysis of the process that we just shared as a group, the context switches from one focused purely on learning new concepts to one focused on how the learning process itself went.

Are Reflections Backwards…or Are We?

But what exactly is a retrospective?

To be honest, there is no one true way to run a retrospective, and if you try to run them the same way all the time, everyone will just get tired of doing them. They become stale, boring and generally lose their effectiveness very quickly. Try to switch it up regularly, to keep it fresh and interesting for everyone involved (including you!).

Anyway, the goal is simply to facilitate reflective discussions, and any mechanism to do that is acceptable. In fact, the more unusual the mechanism (as long as it’s understandable), the better the results are likely to be, because it will take people out of their comfort zone and encourage them to think in new and different ways.

To rein in the effectively infinite space of “anything is a retrospective!”, I’m going to outline two specific approaches that can be used to facilitate the process.

The first is very stock standard, and relies on bucketing points into 3 distinct categories: what went well, what could we do better, and any open questions.

The second is more visual, and involves drawing a chart of milestones and overall happiness during the iteration.

There’s a Hole In The Bucket

The buckets approach is possibly the most common approach to retrospectives, even if the names of the buckets change constantly.

The first bucket (what went well) is focused on celebrating successes. It’s important to begin by engaging everyone involved with the victories that were just achieved, because otherwise retrospectives can become very negative very quickly. This is a result of most people naturally focusing on the bad things that they would like to see fixed. In terms of self-improvement, the results of this question provide reinforcement for anything currently being done (whether it’s a new idea from a previous retrospective or something that has always been done).

The second bucket (what could we do better) is focused on stopping or redirecting behaviours that are not helping. You will often find the most feedback here, for the same reason I mentioned above (focusing on negatives as improvement points), so don’t get discouraged if there is 1 point in the first bucket and 10 in the second. This is where you can get into some extremely useful discussion points, assuming everyone is engaged in the process. Putting aside ego is important here, as it can be very easy for people to accidentally slip into an accusatory frame of mind (“Everything I did was great, but Bob broke everything”), so you have to be careful to steer the discussion in a productive direction.

The final bucket (any open questions) is really just for anything that doesn’t fit into the first two buckets. It allows for the recording of absolutely anything that anyone has any thoughts about, whether it be open questions (“I don’t understand X, please explain”) or anything else that might be relevant.

After facilitating discussion of any points that fit into the buckets above, the final step is to determine at least one action for the next iteration. Actions can be anything, but they should be related to one of the points discussed in the first part of the retrospective. They can be simple (“please write bigger on the whiteboard”) or complex (“we should use a random approach for presenting the results of our activities”); it really doesn’t matter. Actions are a concrete way to accomplish the goal of self-improvement (especially because they should have an owner who is responsible for making sure they occur), but even having a reflective discussion can be enough to increase engagement and encourage improvement.

There’s No Emoticon For What I’m Feeling!

The visual approach is an interesting one, and speaks to people who are more visually or feelings oriented, which can be useful as a mechanism of making sure everyone is engaged. Honestly, you’ll never be able to engage everyone at the same time, but if you keep changing the way you approach the retrospective, you will at least be able to engage different sections of the audience at different times, increasing the total amount of engagement.

It’s simple enough. Draw a chart with two axes, the Y-axis representing happiness (sad, neutral, happy) and the X-axis representing time.

Canvass the audience to identify milestones within the time period (to align everyone), annotate the X-axis with those milestones and then get everyone to draw a line that represents their level of happiness during the time period.

As people are drawing their lines, they will identify things that made them happy or sad (in a relatively organic fashion), which should act as triggers for conversation.

At the end, it’s ideal to think about some actions that could be taken to improve the overall level of happiness, similar to the actions that come from the bucket approach.

Am I Pushing Against The Mirror, Or Is It Pushing Against Me?

Running retrospectives as part of workshops is not all puppies and roses though.

A retrospective is most effective when the group is small (5-9 people). In a classroom of 50+ students, there are just too many people to facilitate. That is not to say that you won’t still get some benefit from the process, it’s just much harder to get a high level of engagement across the entire audience when there are so many people. In particular, the visual approach I outlined above is almost impossible to do if you want everyone to participate.

One mechanism for dealing with this is to break the entire room into groups, such that you have as many groups as you normally would individuals. This can make the process more manageable, but does decrease individual participation, which is a shame.

Another problem that I’ve personally experienced is that the positioning of the retrospective at the end of the workshop can sometimes prove to be its undoing. As time progresses, and freedom draws closer, it can become harder and harder to maintain focus in a classroom. In a normal agile environment where retrospectives bookend iterations (i.e. the next iteration starts shortly after the previous one ends and the retrospective occurs at that boundary), and where there is no appreciable delay between one iteration and the next, this is not as much of a problem (although running a retrospective from 4-5 on a Friday is damn near impossible, even in a work environment). When there is at least a week between iterations, like there is with workshops, it can be very hard to get a good retrospective going.

Last but not least, it can be very hard to get a decent retrospective accomplished in a short amount of time, and I can’t afford to allocate too much time during the workshop.

When running a two-week iteration, it’s very normal to put aside a full hour for the retrospective. Even then, this is a relatively small amount of time, and retrospectives are often at risk of running over (aggressive timeboxing is a must). When running a workshop of 2 hours, I can only realistically dedicate 5-10 minutes to the retrospective. It can be very hard to get everyone into the right mindset for a good discussion in this extremely limited amount of time, especially when combined with the previous point (lack of focus due to impending freedom).

Aziz Light!

You can see some simple retrospective results in the image gallery below.

IAB304 - S1 2016 - Retrospectives

The first image is actually not related to retrospectives at all, and is the social contract that the class came up with during the very first week (to baseline our interactions and provide a reference point for when things are going poorly), but the remainder of the pictures show snapshots of the board after the end of every workshop so far.

What the pictures don’t show is the conversations that happened as a result of the retrospectives, which were far more valuable than anything that was written down. It doesn’t help that I have a natural tendency to not focus on documentation, and to instead focus on the people and interactions, so there were a lot of things happening that just aren’t recorded in those photos.

I think the retrospectives really help to increase the amount of engagement the students have with the teaching process, and drive home the point that they have real power to change the way things happen, in an immediately visible way.

And as we all know, with great power, comes great responsibility.


Back in the day (pre Windows Vista, so we’re talking about Windows XP and Windows Server 2003), it was possible for a user and system services to share the same logical space. Depending on what your login settings were, the first user to login to a system was likely to be automatically assigned to session 0, which is where services and other system processes would run.

This was both a blessing and a curse.

It was a blessing because now the user would be able to see if a service decided to display a dialog or use some other incredibly stupid mechanism to try and communicate.

It was a curse because now the user was running in the same space as system critical services, enabling a particularly dangerous attack vector for viruses and malicious code.

In Windows Vista this was changed, with the advent of Session 0 isolation. In short, services would now run in their own session (session 0) and whenever a user logged in they would automatically be assigned to sessions greater than 0.

The entire post should be considered with a caveat. I am certainly no expert in Windows session management, so while I’ve tried my best to understand the concepts at play, I cannot guarantee their technical correctness. I do know that the solution I outline later does allow for a workaround for the application I was working with, so I hope that will prove useful at least.

Vista was many years ago though, so you might be asking what relevance any of this has now?

Well, applications have a tendency to live long past when anyone expects them to, and old applications in particular have a tendency to accumulate cruft over the years.

I was working with one such application recently.

Denied

Most of the time, my preference is to work with virtual machines, especially when investigating or demonstrating software. It’s just much easier to work in a virtual environment that can easily be reset to a known state.

I mostly use Virtual Box, but that’s just because it’s the virtualisation tool I am most familiar with. Virtual Box is all well and good, but it does make it very hard to collaborate with other people, especially considering the size of the virtual machines themselves (Windows is much worse than Linux). It’s hard to pass the virtual machine around, and it’s beyond most people to expose a virtual machine to the greater internet so someone in a different part of the world can access it.

As a result, I’ve gravitated towards AWS for demonstration machines.

AWS is not perfect (it’s hard to get older OS versions set up, for example, which limits its usage for testing things that require a certain OS), but for centralising demonstration machines it’s a godsend.

How does all of this relate to session 0 and old applications?

Well, I recently set up an EC2 instance in AWS to demonstrate to stakeholders some work that we’d been doing. In order to demonstrate some new functionality in our product, I needed to configure a third-party product in a particular way. I had done this a number of times before on local virtual machines, so imagine my surprise when I was confronted with an error message stating that I was not allowed to configure this particular setting when not logged in as a console user.

Console?

To most users, I would imagine that that error message is incredibly unhelpful.

Well, this is where everything ties back into session 0, because in the past, if you wanted to remote into a machine, and be sure that you were seeing the same thing each time you logged in, you would use the following command:

mstsc /console

This would put you into session 0, which is usually the same session as you would see when physically accessing the server, i.e. it was as if you were viewing the monitor/keyboard/mouse physically connected to the box. More importantly, it also let you interact with services that insisted on trying to communicate with the user through dialogs or SendMessage.

The consistent usage of the console switch could be used to prevent issues like Bob logging in and starting an application server, then Mary also logging in and doing the same. Without the /console switch, both would log into their own sessions, even if they were using the same user, and start duplicate copies of the application.

Being familiar with the concept (sometimes experience has its uses), I recognised the real meaning of the “you are not logged in as the console” message. It meant that the application had detected that I was not in session 0, and it needed to do something that required communicating with a service via outdated mechanisms. Disappointing, but the application has been around for a while, so I can’t be too mad.

Unfortunately, the console switch does not give access to session 0 anymore. At least not since the introduction of session 0 isolation in Vista. There is an /admin switch, but it has slightly different behaviour (it’s really only for getting access to the physical keyboard/screen, so not relevant in this situation).

Good Old Sysinternals

After scouring the internet for a while, I discovered a few things that were new to me.

The first was that when Microsoft introduced session 0 isolation they did not just screw over older applications. Microsoft is (mostly) good like this.

In the case of services that rely on interacting with the user through GUI components (dialogs, SendMessage, etc), you can enable the Interactive Services Detection Service (ui0detect). Once this service is enabled and running, whenever a service attempts to show a dialog or similar, a prompt will show up for the logged in user, allowing them to switch to the application.
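
If you want to try this yourself, the service can usually be enabled from an elevated command prompt. This is a rough sketch from memory (the service short name should be ui0detect) rather than an exact recipe, so double check the service name on your particular version of Windows:

sc config ui0detect start= demand
net start ui0detect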

The second was that you can actually run any old application you like in session 0, assuming you have administrator access to the machine.

This is where Sysinternals comes to the rescue yet again (seriously, these tools have saved me so many times, the authors may very well be literal angels for all I know).

Using psexec, you can start an application inside session 0.

psexec -s -i 0 {path to application}

You’ll need to be running as an Administrator (obviously), but assuming you have the Interactive Services Detection Service running, you should immediately receive a prompt that says something like “an application is trying to communicate with you”, which you can then use to switch to the GUI of the application running in session 0.

With this new power it was a fairly simple matter to start the application within session 0, fooling whatever check it had, which allowed me to change the setting and demonstrate our software as needed.

Conclusion

As I mentioned earlier, software has an unfortunate tendency to live for far longer than you think it should.

I doubt the person who wrote the console/session 0 check inside the application expected someone to be installing and running it inside a virtual machine hosted purely remotely in AWS. In fact, when the check was written, I doubt AWS was even a glimmer in Chris Pinkham’s eye. I’m sure the developer had a very good reason for the check (it prevented a bug or it allowed a solution that cost 1/4 as much to implement), and they couldn’t have possibly anticipated the way technology would change in the future.

Sometimes I worry that for all the thought we put into software, and all the effort we put into making sure that it will do what it needs to do for as long as it needs to, it’s all somewhat pointless. We cannot possibly anticipate shifts in technology or users, so really the only reasonable approach is to try and make sure we can change anything with confidence.

Honestly, I’m surprised most software works at all, let alone works mostly as expected decades later.


In Part 1 of this two-parter, I outlined the premise of Scrum City and some of the planning required (in the context of running the game as part of the tutoring work I do for QUT’s Agile Project Management course). Go read that if you want more information, but at a high level Scrum City is an educational game that demonstrates many of the principles and practices of Scrum.

The last post outlined the beginning of the activity:

  • Communicating the vision, i.e. “I want a city! In lego! Make it awesome!”
  • Eliciting requirements, i.e. “What features do you want in your model?”
  • Estimation, i.e. “Just how much will this cost me?”
  • Prioritization, i.e. “I want this before this, but after this”

With the above preparation out of the way, all that’s left is the delivery, and as everyone already knows, delivery is the easiest part...

Executions Can Be Fun

At this point, each team participating in the game (for me that was the two opposing sides of the classroom), should have a backlog. Each backlog item (some feature of the city) should have appropriate acceptance criteria, an estimate (in story points) and some measure of how important it is in the scheme of things. If the team is bigger than 5 people (mine were, each had about 25 people in it), you should also break them down into sub-teams of around 5 people (it will make it easier for the teams to organise their work).

Scrum City consists of 3 iterations of around 20 minutes each. For instructive purposes, the first iteration usually goes for around 30 minutes, so that you have time to explain various concepts (like planning and retrospectives) and to reinforce certain outcomes.

Each iteration consists of 3 phases: planning, execution and retrospective, running for 5, 10 and 5 minutes respectively. Make it very clear that no resources are to be touched during planning/retrospective and that work not present in the delivery area (i.e. the city) at the end of an iteration will not be accepted.

All told, you can easily fit the second half of Scrum City into a 90 minute block (leaving some time at the end for discussion and questions).

Obviously you will also need some physical supplies like lego, coloured pens/pencils and paper. It’s hard to judge how much lego you’ll need, so just make your best guess, then double it. It doesn’t hurt to have too much lego, but it will definitely hurt to have too little. A single pack of coloured pens/pencils, 10 sheets of A4 paper and a single piece of A2 will round out the remaining resources required.

Don’t Fire Until You See the Whites of Their Eyes

Each iteration should start with a short planning session.

The goal here is to get each team to quickly put their thoughts together regarding an approximate delivery plan, allocating their backlog over the course of the available iterations. Obviously, 5 minutes isn’t a lot of time, so make sure they focus on the impending iteration. This is where you should reinforce which items are the most important to you (as the product owner) so that they get scheduled sooner rather than later.

Of course, if the team insists on scheduling something you don’t think is important yet, then feel free to let them do it and then mercilessly reject it come delivery time. It’s pretty fun, I suggest trying it at least once.

With priorities and estimates in place from the preparation, planning should be a relatively smooth process.

After each team has finished, note down their expected velocity, and move onto the part where they get to play with the lego (I assure you, if people are even the tiniest bit engaged, they will be chomping at the bit for this part).

Everything Is Awesome!

The execution section of each iteration should be fast and furious. Leave the teams to their own devices, and let them focus on delivering their committed work (just like a real iteration).

You will likely have to field questions at this point, as it is unlikely the acceptance criteria will be complete (which is fine, remember stories are invitations to conversations, not detailed specifications).

For iterations past the first, you should complicate the lives of the teams by introducing impediments, like:

  • In iteration two you should introduce an emergency ticket. You’ve decided that it is a deal breaker that the city does not have a prison (or other building), and you require some capacity from the team to prepare the story. You can then require that this ticket be completed during the next iteration.
    • Interestingly enough, the scrum master of one of my teams intercepted and redirected my attempt to interrupt one of his sub-teams with the prison story. It was very well done and a good example of what I would expect from a scrum master.
  • In iteration three, you should announce that some part of a team’s capacity (pick the most productive sub-team) has come down with the bubonic plague and will not be available for the remainder of the iteration.

Once the team runs out of time (remember, 10 minutes, no extensions, tightly timeboxed), it’s time for a showcase.

That means it’s time for you, as product owner, to accept or reject the work done.

This is a good opportunity to be brutal to those people who did not ask enough questions to get to the real meat of what makes a feature acceptable. For example, if no-one asked whether or not a one storey house had to be all one colour, you can brutally dismiss any and all houses that don’t meet that criterion.

The first iteration is an extremely good time to really drive the importance of acceptance criteria home.

Once you’ve accepted some subset of the work delivered, record the velocity and explain the concept so that the teams can use it in their next planning session. Most groups will deliver much less than they committed to in the first iteration, because they overestimate their ability/underestimate the complexity of the tickets, especially if you are particularly brutal when rejecting features that don’t meet your internal, unstated criteria.

Think Real Hard

After the intense and generally exhausting execution part of each iteration, it’s time for each team to sit back and have a think about how they did. Retrospectives are an integral part of Scrum, and a common part of any agile process (reflection leading to self-improvement and all that).

Given that each team will only have around 5 minutes to analyse their behaviour and come up with some ways to improve, it may be helpful to provide some guiding questions, aimed at getting a single improvement out of the process (or a single thing to stop doing because it hurt their performance).

Some suggestions I’ve seen in the past:

  • Instead of waiting until the end to try and deliver the features, deliver them constantly throughout the iteration.
  • Allocate some people to organising lego into coloured piles, to streamline feature construction.
  • Constantly check with the product owner about the acceptability of a feature.

Ideally each team should be able to come up with at least one thing to improve on.

The Epic Conclusion

After a mere 3 iterations (planning, execution, retrospective), each team will have constructed a city, and it’s usually pretty interesting (and honestly impressive) what they manage to get accomplished in the time available.

The two pictures below are the cities that my tutorial constructed during the activity.

S1 2016 Scrum City

Scrum City is a fantastic introduction to Scrum. It’s exceptionally good at teaching the basics to those completely unfamiliar with the process, and it touches on a lot of the more complex and subtle elements present in the philosophy as well. The importance of just enough planning, having good, self-contained stories, the concept of iterative development (which does not mean build a crappy house and then build a better one later, but instead means build a number of small, fully functional houses) and the importance of minimising interruptions are all very easy lessons to take away from the game, and as the organiser you should do your best to reinforce those learnings.

Plus, everyone loves playing with lego when they should be “working”.


The Agile Project Management course that I tutor at QUT (now known as IAB304) is primarily focused around the DSDM Agile Project Framework. That doesn’t mean that it’s the only thing the course talks about though, which would be pretty one-sided and a poor learning experience. It also covers the Agile Manifesto and its history (the primary reason any of this Agile stuff even exists), as well as some other approaches, like Scrum, Kanban and the Cynefin Framework.

Scrum is actually a really good place to start the course, as a result of its relative simplicity (the entire Scrum Guide being a mere 16 pages), before delving into the somewhat heavier handed (and much more structured/prescriptive) DSDM Agile Project Framework.

Of course, Scrum fits in a somewhat different place than the Agile Project Framework. Scrum is less about projects and more about helping to prioritise and execute a constantly shifting body of work. This is more appropriate for product development rather than the planning, execution and measurement of a particular project (which will have its own goals and guidelines). That is not to say that you can’t use Scrum for project management, it’s just that it’s not written with that sort of thing in mind.

Often when tutoring (or teaching in general) it is far more effective to do rather than to tell. People are much more likely to remember something (and hopefully take some lessons from it), if they are directly participating rather than simply listening to someone spout all of the information. A simulation or game demonstrating some lesson is ideal.

Back in September 2014 I made a post about one such simulation, called Scrumdoku. In that post, I also mentioned a few other games/simulations that I’ve run before, one of which was Scrum City.

It is Scrum City that I will be talking about here, mostly because it’s the simulation that I just finished running over the first two tutorials of the semester.

A City of Scrums?

The premise is relatively simple.

The mayor of some fictitious city wants a scale model of their beloved home, so that they can display it to people and businesses looking to move there.

Those of you familiar with the concept of a Story will notice a familiar structure in that sentence.

As a (mayor), I want (a scale model of my city), so that (I can use it to lure people and businesses there).

Like all good stories, it comes with some acceptance criteria:

  • All items in the model must be made out of lego.
  • Roads, lakes and other geographical features may be drawn.
  • The model must fit on a single piece of A3 paper.

Now that I’ve set the stage, I’ll outline the process that I went through to run the game over the first two tutorials.

Breaking Up Is Hard

I have approximately 55 students in my tutorial. For most of the work during the semester, I break them into groups of 5 or less (well, to be more accurate I let them self-select into those groups based on whatever criteria they want to use).

For Scrum City, 11 groups is too many. For one thing, I don’t have that much lego, but the other reason is that acting as the product owner for 11 different groups is far too hard and I don’t have that much energy.

The easy solution would be to try and run one single Scrum City with all 11 teams, but the more enjoyable solution is to pit one half of them against the other. Already having formed teams of 5 or less, I simply allocated 4 of the larger teams to one side and the remaining teams to the other.

Each side gets the same amount of materials, and they both have approximately the same number of people, so they are on even ground.

The Best Laid Plans

The very first thing to do is to get each group to elect/self-select a Scrum Master. For anyone familiar with Scrum, this is a purely facilitative role that ensures the Scrum ceremonies take place as expected and that everyone is participating appropriately. It helps to have a Scrum Master, even for this short activity, because they can deal with all the administrative stuff, like…

The creation of the backlog.

Creating the backlog is a good opportunity to explain how to create a good story, and the concept of acceptance criteria. Each group needs to figure out a way to farm out the production of around 40 stories, each describing a feature that you (as the mayor) want to be included in the model. Participants should be encouraged to ask questions like what it means for a one storey house to be done (i.e. how big, to scale, colour, position, etc) and then note down that information on an appropriate number of index cards (or similar).

Each backlog should consist of things like:

  • 5+ one storey houses (as separate cards)
  • 5+ two storey houses (again, separate cards)
  • 5+ roads (of varying types, straight, intersections, roundabout, etc)
  • hospital
  • stadium
  • statue
  • bridge
  • police station
  • fire station
  • …and so on

It doesn’t actually matter what elements of a city the stories describe, it’s whatever you would prefer to see represented in your city.

Spend no more than 30 minutes on this.

As Big As Two Houses

Once the backlog is complete, it’s time for the part of planning that I personally hate the most. Estimation.

That’s not to say that I don’t understand the desire for estimates and how they fit into business processes. I do, I just think that they are often misinterpreted, and I get tired of being held to estimates that I made in good faith, with a minimum of information, in the face of changing or misunderstood requirements. It gets old pretty quickly. I much prefer a model where the focus is on delivery of features as quickly as possible, without requiring some sort of overarching timeframe set by made-up numbers.

Each group will be responsible for ensuring that every story they have is estimated in story points. A lot of people have trouble with story points (especially if they are used to estimating in hours or days or some other measure of real time), but I found that the students were fairly receptive to the idea. It helps to give them a baseline (a one storey house is 2 points) and then use that baseline to help them establish other, relative measures (a two storey house is probably 4 points).

There are a number of different ways to do estimation on a large number of stories, but I usually start off with Planning Poker and then, when everyone gets tired of that (which is usually after a few stories), move over to Affinity Estimation.

Planning Poker is relatively simple. For each story, someone will read out the content (including acceptance criteria). Everyone will then have a few moments to gather their thoughts (in silence!) and then everyone will show their estimate (fingers are a good way to do this with a lot of people). If you’re lucky, everyone will be basically in the same ballpark (± 1 point doesn’t really matter), but you want to keep an eye out for dissenters (i.e. everyone thinks 4 but someone says 10). Get the dissenters to explain their reasoning, answer any additional questions that arise (likely to clarify acceptance criteria) and then do another round.

Do this 2-3 times and the estimates should converge as everyone gets a common understanding of the forces in play.

Planning Poker can be exhausting, especially during the calibration phase, where there is a lot of dissent. Over time it usually becomes easier, but we’re talking weeks for a normal team.

Once 3-4 stories have been estimated using Planning Poker, it’s time to switch to Affinity Estimating.

Spread out all of the unestimated stories on a table or desk, place the stories that have already been estimated on a wall in relative positions (i.e. the 2 pointers at one end, the 10 pointers at the other, with room in between) and then get everyone in the group to silently (this is important) move stories to the places where they think they belong, relative to the stories with known estimates. Every story should have an estimate within about 5 minutes.

Keep an eye on stories that constantly flip backwards and forwards between low and high estimates, because it usually means those stories need to be talked about in more detail (probably using Planning Poker).

Affinity Estimating is an incredibly efficient way to get through a large number of stories and give them good enough estimates, without having to deal with the overhead of Planning Poker.

Again, spend no more than 30 minutes on this.

What’s Important To Me

The final preparation step is prioritization.

Luckily, this is relatively simple (and gets somewhat repeated during the planning sessions for each Scrum City iteration).

As the mayor (i.e. the product owner), you need to provide guidance to each team as to the relative importance of their stories, and help them to arrange their backlog as appropriate.

Generally I go with basic elements first (houses, roads, etc), followed by utilities (hospital, school, police station, etc), followed by wow factor (statue, stadium, parks, lake, etc). It’s really up to you as product owner to communicate the order of importance.

You can (even though it is not Scrum) introduce the concept of MoSCoW here (Must Have, Should Have, Could Have, Won’t Have) and label each story appropriately.

The most important thing to have at the end is some measure of the priority of each story, so that when the teams start planning for their iterations, they can create a basic delivery plan taking your preferences into account.

Because the prioritization is less of a group activity than the others, you only really need to spend around 10-15 minutes on this.

To Be Continued

This post is getting a little long, and I’m only about half way through, so I’ll continue it next week, in Everyone Loves Lego: Part 2.

There will even be pictures!


A while back (god, almost a full year ago), I posted about the way in which we handle environment migrations, and to be honest, it hasn’t changed all that much. We have made some improvements to the way we handle our environments (for example, we’ve improved our newest environments to be built into tested, versioned packages, rather than running directly from source), which is good, but the general migration process of clone temp, tear down old, clone back to active, tear down temp hasn’t really changed all that much.

Over time, we’ve come to realise that there are a number of weaknesses in that strategy though. It’s slow (double clone!), it’s not overly clean and, in rare cases, it can lead to all of the data for the environment under migration being destroyed.

Yes, destroyed, i.e. lost forever.

This post is about that last weakness (the others will have to continue existing…for now).

Explosions!

In the original cloning scripts, there was an ominous comment, which simply said “# compare environment data here?”, which was a pretty big red flag in retrospect. You can’t always do everything though, and the various pressures applied to the development team meant that that step became somewhat manual.

That was a mistake.

After running a number of migrations across a few different environments (using basically the same concepts), we finally triggered that particular tripwire.

An otherwise uninteresting environment upgrade for one of our production services completely annihilated the underlying database (an EC2 instance running RavenDB), but the script gave no indication that anything went wrong.

Luckily, this particular service was more of a temporary waystation, acting as a holding area facilitating the connection of two applications through a common web interface. This meant that while the loss of the data was bad (very bad), it wasn’t a problem for all of our customers. Only those people who had items sitting in the holding area waiting to be picked up were affected.

Obviously, the affected customers were quite unhappy, and rightfully so.

To this day I actually have no idea what went wrong with the actual migration. I had literally run the exact same scripts on a staging environment earlier that day, and verified that the same data was present before and after. After extensive investigation, we agreed that we would probably not get to the root of the issue in a timely fashion and that it might have just been an AWS thing (for a platform based on computers, sometimes AWS is amazingly non-deterministic). Instead, we agreed to attack the code that made it possible for the data loss to occur at all.

The migration scripts themselves.

Give Me More Statistics…Stat!

Returning to that ominous comment in the migration scripts, we realised that we needed an easy way to compare the data in two environments, at least at a high level. Using a basic comparison like that would enable us to make a decision about whether to proceed with the migration (specifically the part that destroys the old environment).

The solution is to implement a statistics endpoint.

The idea is pretty simple. We provide a set of information from the endpoint that summarises the content of the service (at least as best we can summarise it). Things like how many of a certain type of entity are present are basically all we have to deal with for now (simple services), but the concept could easily be extended to include information about any piece of data in the environment.

Something as simple as the example below fills our needs:

{
    "data": {
        "customers": {
            "count": 57
        },
        "databases": {
            "count": 129
        }
    }
}

A side effect of having an endpoint like this is that we can easily (at least using the http_poller input in Logstash) extract this information on a regular basis and put it into our log aggregation so that we can chart its change over time.
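
For reference, a minimal http_poller configuration looks something like the sketch below. The URL is a placeholder rather than our real endpoint, and the exact option names (particularly around scheduling) have changed between plugin versions, so treat this as an outline rather than something to copy verbatim.

input {
  http_poller {
    # placeholder URL; swap in the statistics endpoint for the environment you care about
    urls => {
      service_statistics => "https://service.example.com/statistics"
    }
    # poll every five minutes; older versions of the plugin use interval (in seconds) instead
    schedule => { every => "5m" }
    codec => "json"
  }
}

output {
  # push the parsed statistics into the log aggregation stack so we can chart them over time
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}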

Making It Work

With the statistics endpoint written and deployed (after all it must be present in the environment being migrated before we can use it), all that’s left to do is incorporate it into the migration script.

I won’t rewrite the entirety of the migration script here, but I’ve included a skeleton below to provide an idea of how we use the comparison to make sure we haven’t lost anything important on the way through.

function Migrate
{
    param
    (
        # bunch of params here, mostly relating to credentials
    )

    try
    {
        # make current environment unavailable to normal traffic

        # clone current to temporary

        if (-not (Compare-Environments $current $temp))
        {
            # delete the temporary environment and exit with an error
        }

        # delete current environment
        # clone temporary environment into the place where the current environment used to be

        if (-not (Compare-Environments $current $temp))
        {
            # delete the new environment
            # keep the temporary environment because it's the only one with the data
        }
    }
    catch
    {
        # if the current environment still exists, delete the temporary environment
        # if the current environment still exists, restore its availability
    }
}

function Compare-Environments
{
    param
    (
        $a,
        $b
    )

    $aEndpoint = "some logic for URL creation based off environment"
    $bEndpoint = "some logic for URL creation based off environment"

    $aStatistics = Invoke-RestMethod $aEndpoint # credentials, accept header, methods etc
    $bStatistics = Invoke-RestMethod $bEndpoint # credentials, accept header, methods etc

    if ((ConvertTo-Json $aStatistics.data) -eq (ConvertTo-Json $bStatistics.data))
    {
        return $true
    }

    return $false
}

Summary

The unfortunate truth of this whole saga is that the person who originally implemented the migration scripts (I’m pretty sure it was me, so I take responsibility) was aware of the fact that the migration could potentially lead to loss of data. At the time, the protection against that was to ensure that we never deleted the old environment until we were absolutely sure that the new environment had been successfully created, making the assumption that the data had come over okay.

In the end, that assumption proved to be our undoing, because while everything appeared peachy, it actually failed spectacularly.

The introduction of a statistics endpoint (almost an environment data hash) is an elegant solution to the problem of potential data loss, which also has some nice side effects for tracking metrics that might not have been easily accessible outside of direct database access.

A double victory is a rare occurrence, so I think I’ll try to savour this one for a little while, even if I was the root cause of the problem.