Friday, July 16, 2010

Cucumbers, and why it suddenly matters that I suck at regular expressions

This week I have been playing with a new testing tool. Well, sadly not all week as my wonderful HP 8710 laptop snuffed it on Monday morning and despite my best efforts could not be coaxed back to life. This blog post (and my playing with the new testing tool) comes to you from my new Lenovo T410. It’s a bit “toy sized” but it does boot in under 10 minutes (a welcome contrast to the HP) and the build quality is better (although it’s hardly got the aesthetics of a MacBook Pro.)

Anyway, the new testing tool I’ve been playing with is Cuke4Nuke, which allows Cucumber to work more easily with .NET projects. That sentence is probably pretty baffling, so let me back up and explain what all this Cucumber-y themed goodness is about.

You have (hopefully…) heard of Test Driven Development (aka TDD) as an approach to developing software. For those not too clear what it is, a quick explanation goes like this: instead of writing your application code and then writing a unit test afterwards (you are writing unit tests, right?) you start out by writing your unit test first. Now, this may at first blush sound somewhere between stupid and impossible. If you’ve only heard that one-line explanation you’re missing a subtle but key point: when you write that unit test first, you need only write enough for it to fail. Then you go and remedy that failure by writing production code. Obviously at the outset that point is reached pretty quickly… the moment you reference the class under test (which doesn’t exist yet!) your test will fail. You can’t even compile it:

public class TestWidget {
    public void Widget_Assemble_Success() {
        Widget w = new Widget(); // can’t compile: Widget doesn’t exist yet
    }
}

So then you switch to your application production code and implement the class under test. But (and this is important) only enough to make the test succeed. At the outset that doesn’t require you do any more than simply create the class:

public class Widget {
    // implementation to be done
}

At this point your test will now compile (and pass, kinda) and you can work more on the test until, once again, it breaks or fails. In the example here that’d be trying to call the Assemble() method on the Widget class. Again we couldn’t compile as this method doesn’t exist in the production code class. We’d remedy that over there and then finally add some kind of Assert into our test, which would fail and then finally we’d get to the meat of implementing the method under test.
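To make the end point of that dance concrete, here is a minimal sketch of where the test and the class might land once Assemble() exists and an assertion is in place. The bool return type of Assemble() is an assumption (the example never pins down a signature), and a plain exception stands in for whatever Assert your unit test framework provides.

```csharp
using System;

// Production code: only enough to let the test below compile and pass.
public class Widget
{
    // Assumed signature: returns true when assembly succeeds.
    public bool Assemble()
    {
        return true;
    }
}

// Test code: a plain method standing in for a real unit test.
public class TestWidget
{
    public void Widget_Assemble_Success()
    {
        Widget w = new Widget();
        bool assembled = w.Assemble();

        // Stand-in for a framework assertion such as Assert.IsTrue.
        if (!assembled)
        {
            throw new Exception("Expected Assemble() to succeed");
        }
    }
}
```

Calling new TestWidget().Widget_Assemble_Success() completes without throwing. Change Assemble() to return false and the test fails, which is the state you would have been in just before writing the real implementation.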

That little “dance” back and forth between the test and the class under test may sound a little convoluted and unnecessary, but it’s what helps tease out the initial design of your class under test. It’s what puts the “test driven” in test driven development. It really makes you think about the API, the method signatures, naming etc. of your class under test and after you have the “aha” moment that comes with this approach it’ll all fall into place. Suffice to say I’ve found it really helps not just with driving the design but also with maintaining focus too. Trying to write sizable amounts of production code before unit tests was always liable to lead to me wandering all over the place. That could just be me of course.

OK, back to Cucumber. Cucumber is not in fact a TDD tool or framework. But it’s related. It’s a BDD (behavior driven development) framework. This is a step forward in the evolution of test first development, and for me brings together a number of interesting threads:

  • user stories

  • specification details, acceptance criteria

  • customer/product owner involvement

There’s a (sometimes unpopular) truism in software development: it doesn’t really matter what the requirements say, the software does what it does. The truth is in the source code. That is the only artifact that is up to date and can unambiguously (hard to read code aside) tell you what the software does.

Now what’s really cool here as far as I’m concerned is that Cucumber (and other BDD related tools) let you bring the specification in alongside your source code. Not just in a “store the requirements doc in version control” sense, but something much more useful. Your specifications in Cucumber are written as a series of features (think stories) along with some scenarios (think concrete examples of expected behavior) which then get hooked up with some glue code to the actual production code to test that they work. In other words, the plain English requirements are now an executable specification. With the click of a button you can test which of your features are correctly implemented.

This makes a boatload more sense than deriving tests from a requirements document. The tests *are* the requirements document.

This elegant brilliance (yes, I gush) comes at a price however.

Firstly, your feature requirements have to be written in a particular format. This doesn’t actually seem to be too onerous, and results in some pretty readable material. Here’s an example feature for a system designed to act as a database of hiking trails:

Feature: Search for trails
As a hiker
I want to find nearby trails
So I can go for a hike

Scenario: Search for trails nearby
Given my location is "Rollinsville"
And I'm willing to travel up to "10" miles
When I search for nearby trails
Then "Crater Lakes" is listed in the results

The key part here is the “given, when, then” section. This is what we will end up tying to an automated test. Above it, as you can see, is our user story in conventional format providing background and context.

You can imagine that we might have additional scenarios here related to finding trails that allow dogs, disallow mountain bikers and ATVs etc.

Despite the constraint of the specific structuring required here (and there is more richness available than this simple example) it seems to me that you could get a long way with this approach, and it’s not beyond the realms of possibility for these features to be directly authored by the Product Owner or business analyst.

The second portion of the “cost” of this approach is implementing the code to tie these scenarios to the system to test it. This is where regular expressions (regex) and my weakness with them come into play.

If you’re not familiar with regular expressions, they basically provide a very succinct means of performing pattern matching operations on strings.

Connecting the scenarios up involves writing regular expressions that match the text in the “given, when, then” stanzas and pull out the key inputs to the tests. In the example above these would be the location (Rollinsville), the distance the hiker is prepared to travel to a trailhead (10 miles) and the success criteria (Crater Lakes).

Those regular expressions, which are wrapped up in C# Cucumber attributes on methods to identify the given/when/then processing, look like this:

[Given("^my location is \"(.*)\"$")]


[When("^I search for.*$")]

[Then("\"(.*)\" is listed in the results")]
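If you want to see what those patterns actually capture, you can try them outside Cucumber altogether. The sketch below is not Cuke4Nuke code: it feeds step text to .NET's own Regex class, and the StepMatcher class and FirstCapture method are names made up for illustration. It assumes the step text arrives without its leading Given/And/When/Then keyword, which is how Cucumber matches steps.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StepMatcher
{
    // The three patterns from the attributes above, as plain strings.
    public static readonly string[] Patterns =
    {
        "^my location is \"(.*)\"$",
        "^I search for.*$",
        "\"(.*)\" is listed in the results"
    };

    // Returns the first capture group of the first pattern that matches,
    // an empty string if that pattern has no capture group,
    // or null if none of the patterns match the step text.
    public static string FirstCapture(string stepText)
    {
        foreach (string pattern in Patterns)
        {
            Match m = Regex.Match(stepText, pattern);
            if (m.Success)
            {
                // Groups[0] is the whole match; Groups[1] is the first (.*).
                return m.Groups.Count > 1 ? m.Groups[1].Value : "";
            }
        }
        return null;
    }
}
```

Passing in the step my location is "Rollinsville" gives back Rollinsville, and "Crater Lakes" is listed in the results gives back Crater Lakes. The search step matches but captures nothing, so it returns an empty string. The "willing to travel" step returns null, because none of the three patterns shown here covers it.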

All of which may or may not read easily for you, depending upon your experience with regex. Mine is fairly limited. I’m pretty excited about the possibility of executable specifications, BDD and testing applications this way and plan on trying it out in anger over the coming months.

And this is why it suddenly matters that I suck at regular expressions.


  1. You don't have to know too much of regular expressions to be good with Cucumber. In fact, being too good at them can lead you to write overly complex expressions no one else on your team can maintain. I just posted on the few regular expression patterns you need to be effective with Cucumber:

  2. Tools like Rubular and RubyMine make Cucumber much easier: