Sunday, 14 June 2015

A failure to do BVA

Boundary Value Analysis ...
... is a simple test technique.  It is taught on every introductory software testing course.  The theory is that you split input data into sets of valid and invalid input and then test at the boundary between valid and invalid data.  An easy example: a function accepts an integer between 5 and 10.  Ignoring the extremes of the integer data type itself, you would use the following tests:
  • lower bound
    • 4 - invalid
    • 5 - valid
    • 6 - valid
  • upper bound
    • 9 - valid
    • 10 - valid
    • 11 - invalid
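The boundary checks above can be sketched as a quick test script. This is a minimal illustration, assuming a hypothetical function `accepts()` that should return True for integers 5 to 10 inclusive:

```python
# Hypothetical function under test: should accept integers 5..10 inclusive.
def accepts(n: int) -> bool:
    return 5 <= n <= 10

# One value either side of each boundary, plus the boundaries themselves.
cases = {4: False, 5: True, 6: True,    # lower bound
         9: True, 10: True, 11: False}  # upper bound

for value, expected in cases.items():
    assert accepts(value) == expected, f"boundary test failed at {value}"
print("all boundary tests passed")
```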
You obviously also test values in the middle of each set, such as 3 or 7.  Assuming that the implementation of the function approximates the spec, this should be a reasonable set of tests to run.  Let's try a real-world example:

A credit card company has a system that generates the card security code (CSC), the last 3 digits on the back of your card, and a system that checks during a card-not-present transaction (like when you buy something online) that the CSC is valid.  A CSC has the following properties:

  • 3 digits long
  • has a min value of 001
  • has a max value of 999
Is a CSC with the value 000 valid or invalid? Not sure? Well, it's a test case you should try.
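The mismatch at the heart of this story can be sketched in a few lines. Both function names and the padding logic are hypothetical, but they show how a generator and a checker can disagree on the 000 edge case:

```python
# Sketch of two systems with the same spec but different implementations.
def generate_csc(seed: int) -> str:
    # Pads to 3 digits, so seed values divisible by 1000 yield "000".
    return f"{seed % 1000:03d}"

def csc_is_valid(csc: str) -> bool:
    # Enforces the spec strictly: 3 digits, min 001, max 999. 000 is rejected.
    return len(csc) == 3 and csc.isdigit() and 1 <= int(csc) <= 999

csc = generate_csc(2000)       # the generator happily issues "000" ...
print(csc, csc_is_valid(csc))  # ... and the checker then declines it
```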

I recently got a new credit card to replace my older ones, which were due an interest rate rise.  The cards arrived and one had a CSC of 000.  I thought no more about it, apart from: wow, that's going to be easy to remember.  Tonight my wife needed to make a purchase online.  Since it was for work, I thought we would buy it on the credit card so that when she was reimbursed the money could be applied directly to the credit card.

To my concern the transaction was declined.  I checked online that there was ample credit for the purchase.  There was.  I called the company and asked why the card was declined.  Sanjay (the call taker) advised me that I had mistyped the card security number.  I mentioned that there was NO chance of that since it was so easy to remember (000).  He put me on hold.

Yes Mr Yates.  There is a problem.  The CSC 000 is considered invalid by our system.  As a security precaution we have cancelled all of your cards.  We are sending you new ones in the post.
Excuse me! I replied.  Why is that number considered invalid when it was one of your systems that generated it and printed it onto a card?  Surely this is a boundary value that would have been tested?
Sanjay was very apologetic and credited me £25 by way of an apology (so that's the purchase paid for) and allowed me to use my wife's card to complete the purchase before he cancelled all of the cards to process the request for new ones to be issued.

We all make mistakes, but there is a compound failure here.  Should the value 000 being fed into the card processing system really cause all the cards associated with an account to be blocked?  I'm not saying who the company is, but I wonder: if someone else tried a card-not-present transaction using a CSC of 000, would all their cards be blocked as well?  I'm sure it was never tested, as the CSC generation system should never have issued a card with a CSC of 000.

So what have we got:
  • 2 systems that share the same spec of what should be valid and invalid, but have different implementations.  One system considers the edge case 000 valid and the other invalid. 
  • A system that doesn't recover from a card-not-present transaction having a CSC of 000, instead defaulting to the 'safest' behaviour of blocking all cards associated with the account. 
  • Potential opening for a test consultant? 
So I am a bit annoyed and inconvenienced, and I acknowledge that the chance of a card being issued with a CSC of 000 is 1/1000, but if a simple test case had been written this wouldn't have been an issue.
It also means I now have a great example when teaching boundary value analysis.

Friday, 24 April 2015

Difference between load and stress - using a metaphor

Load testing and stress testing a component are two different test techniques that often get confused.  Here is an analogy, which I have modified from a conversation I had with James O'Grady.

A load test is driving the car for 874 miles at an average speed of 60mph, in 5th gear, while using the air conditioning, cruise control and CD player.  Using lots of the capabilities of the car at expected limits for a length of time.  During and at the end of the journey we would expect the car to still be operational and all the dials on the dashboard to be reading nominal values.  A stress test is a completely different type of test.

In a stress test we want to push the system beyond its limits.  Often the limits will not be clear, and so the test becomes exploratory or iterative in nature as the tester pushes the system toward them.  If we reuse the driving analogy, we might start the same journey but now drive at 70mph in 3rd gear.  Initially we think this might be enough to stress the car.  After 60 minutes we increase the stress by removing some of the car's oil and deflating the tyres.  Now some of the dashboard lights are showing us that some of the car's components are stressed.  We then remove some of the coolant fluid and remove a spark plug.  Now the car is seriously under stress.  All the lights are on and eventually the car gracefully stops operating and we are forced to steer to the hard shoulder.  Once safe, we refill all the fluid and oil, re-inflate the tyres and refit the spark plug.  Now we are able to restart the car and resume our journey driving properly.

A stress test pushes the system beyond the limits it is designed to run at, either by restricting resources or by increasing the workload (or often both).  This is done until the system either gracefully shuts down or restricts further input until it is no longer under stress conditions.
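The iterative "push until it degrades" shape of a stress test can be sketched as a toy loop. The 500 req/s limit and the doubling ramp are purely illustrative, and `system_handles` stands in for observing the real system:

```python
# Toy sketch of an iterative stress test: keep increasing the workload
# until the system degrades, then report the bracket you found.
def system_handles(requests_per_sec: int) -> bool:
    return requests_per_sec <= 500   # stand-in for the real system's limit

load = 100
while system_handles(load):
    load *= 2                        # increase the stress each iteration

print(f"degraded somewhere between {load // 2} and {load} req/s")
```

In a real stress test you would of course refine the bracket further (and also restrict resources, not just raise the workload), but the exploratory ramp-and-observe loop is the core of the technique.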

Both tests are heavily contextual, as they rely on a deep understanding of how the software will be used in the wild.  Will a customer use the software for a long period of time under a load condition, or do they just use it in short bursts?  This question is even more important when you consider software built in the cloud.

If your software is built in the cloud and you are re-deploying every 2 weeks, then your view of load and stress testing will be different to testing an on-prem application, as the operational realities of using that software are contextually different.

Wednesday, 1 April 2015

I am a runner ...

I am still running.  I had a few weeks off while I recovered from chesty coughs, and had to repeat a few weeks to regain fitness after the cough, BUT.

I am a runner!

I know because Laura (the 'voice' of the NHS Couch to 5K podcasts) said I was.  She gives you this reward at the end of the last run of week 6 (25 mins of continuous running) and it brings such a wave of emotion.  6 (or in my case a few more) weeks of hard physical exercise and mental fortitude to keep going finally pays off.  I am a runner.  I can run.  

I still have my goal of a 30 min 5K to achieve and I know to do that I have to build more pace and more stamina into each run.  However I have another 9 runs to do that in and it all feels achievable.  

Earlier in the course there are certain runs that fill you with horror as the length of time spent running is cruelly ramped up: the 3 min run in week 3, the 5 min run in week 4 and, the largest of them all, the first 20 min run at the end of week 5.  But I have conquered them and there are no more left, just a gentle increase in duration till we hit 30 mins.  I feel a little like Frodo after throwing the ring into Mount Doom.  The main obstacles have been completed.  Sure, it is still a long way home to the Shire.  But I've got this far; I can keep going to the end.

I have also found out that sharing my progress has inspired two others to start running.  One is about to start week 4 and the other week 1.  I never EVER thought that me doing exercise would inspire someone else.  I was always the one that needed the inspiration to do anything physical.  

I've lost weight, my belts are all on the last hole and I need new trousers.  My shirts fasten around the collar and suit jackets are no longer straining the buttons.  This feels terrific.

If I ever meet Laura I owe her a drink.  The podcasts have tangibly changed my life and made me healthier where gyms and diets have failed.

Just need to keep running, finish what I started and do a sub-30 min 5K run.  But this is no longer a pipe dream.  It is realistic; the hard work is over, I just need to keep going.

Monday, 26 January 2015

Keep on Running

10 runs into the year and I am surprised that:

  1. I am still running even though
  2. I was surprised with how incredibly unfit I was and that surprises me that
  3. I am actually getting better
The Couch to 5K programme that I am following now has me running for 3 minutes.  180 seconds of running doesn't seem a lot, but 10 runs ago I couldn't go for 60 seconds, so this is a massive improvement.  My last run, detailed here, was the first run that I didn't do as a loop.  I ran from the Mountbatten Centre through to Cosham centre.  Although not the greatest distance, mentally it felt great to start in one place with a certain destination in mind.  At this point in my training plan this has been very helpful.

As I said when I started this I have been gathering statistics on every run and averaging them out over each week of the C25K programme.  I've decided to make this public in case anyone is interested.  Apart from the raw data I have learnt a few other things:

  1. Running with spectacles, in the rain, while getting too warm means you spend way too long trying to wipe away the rain and haze.
  2. Hills are hard, but only in one direction
  3. Never judge a run by your heart.  I can feel that a run has gone badly, but the reality of the stats shows that the run was awesome
  4. Running makes me feel so much better

Sunday, 4 January 2015

1st run of the year

After my New Year's resolutions I had to commit and actually run.  Naturally, being a nerd, this activity required tracking and monitoring.  I spent quite a bit of time working out how to track my run, what I was going to listen to and where I was going to run.

Tracking my run.
I have started using Runkeeper on my phone for this task.  It tracks my run via the built-in GPS, gives me details of my progress every 5 mins and then saves all the stats of my run to the cloud.  I did consider a Fitbit or Jawbone device, but this app was free and seems to do a fine job.  The app also allowed me to design routes online and then view them on my phone.  Quite handy to see in advance where I was going to be running and how far that was, rather than just running aimlessly.

What to listen to.
Instead of just picking a random running album I decided to use the free NHS Couch to 5K podcasts.  Each one uses intervals of running and walking to build up stamina over a period of 9 weeks, by the end of which I should hopefully be able to run 5km.  I transferred the podcasts to Google Play and then stream them to my phone.  Having someone tell you when to run and when to walk was quite helpful.

So no need for any new toys.  Amy had already bought me some trainers and I have clothes that I can run in.  However, since my phone was going to be my tracker and music player, I invested in a Karrimor arm band to house my phone and a pair of runner's headphones.  Both were half price.  The headphones were very comfy, and the arm band, although too tight to fit on my upper arm, kept my phone safe on my lower arm.

Wednesday, 31 December 2014

New years rulins

So it is another new year and, as is customary, I wanted to make some resolutions.  Reading my news feed on Twitter I saw a re-tweet from Lisa Crispin (@lisacrispin) pointing me toward this page, which is a list of new year's rulins from a musician called Woody Guthrie.  I had never heard of him before, but I liked his list and how, through reading it, I started to see a part of his life.  So taking his lead (and his list) I have made my own, heavily based on his, but adding my own thoughts and removing some of his less transferable entries.  I doubt this will be the year, unlike all others, that I keep to them all.  But at least I can try.

Wishing everyone a happy and fulfilling 2015



Friday, 25 April 2014

Why No Test Cases

The test team I lead do not use test cases.  We believe that testing is an active, changing exercise, and that rigid test cases do not support this rate of change.  I will give my arguments for not using test cases before explaining how we do plan and track our testing.

What is a test case?

A test case defines a process that will show if the software you are testing displays a particular desirable attribute. It consists of 3 main parts:
  1. Starting condition
  2. Method
  3. End condition
and may have other parameters:
  1. Priority
  2. Owner
  3. Status (open, in progress, blocked, closed)
  4. Weight
A low-level test plan is a collection of these test cases for a particular module or capability of the software being tested.  The test plan is reviewed and approved by stakeholders, rather than each test case being reviewed individually.
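The template above can be modelled in a few lines. The field names simply mirror the list of parts and parameters; nothing here is a standard format:

```python
# Minimal model of the test case template described above.
from dataclasses import dataclass

@dataclass
class TestCase:
    starting_condition: str
    method: str
    end_condition: str
    priority: int = 3
    owner: str = ""
    status: str = "open"   # open / in progress / blocked / closed
    weight: int = 1

# A low-level test plan is then just a collection of these:
plan = [
    TestCase("card issued with CSC 000", "attempt an online purchase",
             "transaction is accepted", priority=1, owner="tester"),
]
print(len(plan), plan[0].status)
```

This flatness is exactly the problem the rest of the post argues against: every entry in `plan` looks interchangeable, whatever its real risk, complexity or duration.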

Reasons for change

Not all test cases are created equal

To paraphrase George Orwell, "all test cases are created equal, but some test cases are more equal than others".  Each test case differs from the others in terms of its:
  • Risk
  • Complexity
  • Duration
Tests, when executed, reduce the risk of faulty software being shipped to the customer.  From the customer's perspective there are different levels of risk, some acceptable and some not.  The risk of a small input validation error within a utility program might be just annoying, but other errors could cause a loss of money within a business, or even loss of life for aviation or medical software.  Different tests mitigate different amounts of risk.

The complexity of a test case can also vary.  Modern software is rarely executed on its own and will likely have many integration points.  Configuring, executing and monitoring the software during the test can be a complex task, though some test cases are a lot simpler.

Tests not only take time to execute but can also take considerable time to prepare the required test configurations and test harnesses.  As well as the time required by the tester to complete these tasks, longevity or workload tests can then take an extended period of elapsed time to execute.  Validating the test execution data once the test has run adds further to the duration of the test case.

Some tests will have a natural ordering between them.  Quite often a test will build on top of previously run tests, and will require those earlier tests to pass before it can be run.  

The above points show how varied a set of tests can be.  Condensing these many varied tests into a single unified 'test case' template hides the differences between them.  Once in this form they are counted as if they were all identical in duration, risk and complexity.  Boiling this complexity down into a single test case template just hurts the testing.

Preparatory work

Any test case will require some degree of preparatory work before the 'actual test' can get underway.  This might be configuring automation, defining workloads or developing a test harness to run the test.  Is this work part of the testing process, or just a necessary delay before the real work starts?  Fellow testers seem to fall into one of two main schools of thought.  One group see prep work as a necessary evil: work that must be done but isn't part of the actual testing.  The other group think that as soon as you start to interact with the new parts of the software, the process of testing has begun.  Either way, how do you classify this work?  It doesn't fit into the test case template at all, and yet the test cases are dependent on it being done.  When a team has 10 test cases remaining, you don't know what work needs to be done to execute those tests.  They might need 3 days of prep work first.

Once the prep work is completed you are left with a set of test cases and a period of time to complete them in.  This turns the test case into a metric and ...

Someone always wants to track them

There are x test cases left to do and y amount of time left to do them in.  Simplistically this is modelled by a straight-line graph.  The model assumes that each test can be executed in the same amount of time.
[Figure: linear burn-down of test cases over time]

However, as we have said, not all test cases are equal, so to assume that we can execute them in a linear fashion is short-sighted.  Also, as test work progresses, defects are raised which will require recreation and verification, and this takes time, time that cannot then be spent executing tests.  Responding to this, the chart is often re-drawn as an s-curve.


Progress is charted against the graph, and teams are either congratulated or receive 'management focus' depending on the state of the chart.  This misses the point, and it's largely the fault of the test cases.  Progress through a list of predefined tests means nothing.  For the progress to be meaningful you need to assume that the quality of the tests is high and that the tests mitigate the majority of the risk for the customer.  Even if you can make that assumption, progress through the test cases isn't enough to judge progress; you also need to consider:
  • How many defects testers are finding
  • The spread of defects across the product
  • What tests have actually been done so far (have we just done the easy ones?)
  • What if we have completed 90% of the test cases and found no defects?
Without this additional information the complexity of testing is abandoned in favour of an overly simplified metric which is too narrow to be of any real use.

They don't react well to change

Testing is an exploratory sport.  An experienced tester working through any set of test cases will question their approach based upon progress through the test cases, prior defects and often just a gut feeling.  Regardless of how much thought and effort is applied to the formulation of the test plan, the planned testing and the actual testing will be different.  This is because the tester is learning about the software and how it was implemented as they execute the tests.  A single test case might expose a bug which is part of a bug cluster within a piece of code.  To fully examine the area, new test cases will be created and executed on the fly.  This ad-hoc, off-piste or exploratory testing should result in further test cases being written and added to the plan.  Adding extra tests often infuriates managers trying to track progress, but the new tests are driving further quality into the product; their value isn't in the test case being written but in it being executed and the defects it finds.  

Test approaches such as scenario-based, use-case or exploratory testing do not lend themselves to a formal test case report.  They often do not have a particular method or starting condition, but instead tend to rely on experienced, confident testers examining the product in the same way the intended customer would.  These approaches are better tracked using a time box or a complexity rating like duration or story points.  

Test cases seem like a good idea.  In reality, however, they are too simple a model to use in practice.  A single test case can define the intent of a test, but the model breaks when used to suggest progress or the quality of the product.

I explained this to a junior member of my team and drew parallels between some exploratory testing he wanted to do and a prototyping story that development were planning.  Neither had a clear plan of implementation, but both had allotted time and a focus on learning how to achieve their aim by 'working with the software'.

This was when an alternative became clear.  This is just software engineering.  Can't we measure test work in the same way we measure software development?  Wouldn't that make things clearer?

What's the alternative?

Test work is just engineering work

If you stop thinking about testing as the execution of test cases, and instead as another type of software engineering work, then an alternative approach becomes obvious.  Testers still write code, examine specifications and solve problems in the same way as a developer.  We don't track developers by the number of changesets they produce, so why not treat testers in the same way and use the same tracking artefacts that are used in development?

We now use test stories for our system test work.  No test cases in sight.

A test story is similar to a development story.  It outlines the objective of the work, priority, owner and complexity.  The difference is that the objective of a test story can be more deconstructive than that of a development story: where a dev story constructs something new, a test story deconstructs it to reveal defects.

Test stories can be planned for an iteration and split into multiple child tasks that describe each individual unit of work.  The test story is given an estimate of the time it might take, a priority describing how important it is that this test is executed, and a story point rating to describe how complex this test is to implement and execute.
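A test story with its child tasks, estimate, priority and story points can be sketched in the same way as the test case template earlier. The structure and example values are illustrative, not our actual tooling:

```python
# Sketch of a test story, as described above: objective, estimate,
# priority, story points, child tasks and linked defects.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TestStory:
    objective: str
    estimate_hours: float
    priority: int
    story_points: int
    tasks: List[str] = field(default_factory=list)
    defects_found: List[str] = field(default_factory=list)

story = TestStory(
    objective="Stress the checkout flow beyond its rated load",
    estimate_hours=16, priority=1, story_points=5,
    tasks=["build workload harness", "run ramp test", "analyse results"],
)
# Defects found during execution are linked back to the story:
story.defects_found.append("DEF-123: crash at 2x rated load")
print(len(story.tasks), len(story.defects_found))
```

Note that prep work ("build workload harness") sits inside the story as a planned task rather than being invisible, which is the point of the change.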

Handling defects

Defects found during the story execution are linked to the work item to show the 'fruit' of that story.  Development and test stories can be associated to show the testing that will be applied to new capability.

Defects that are found will often require recreating to provide further diagnostics, and require verification once a fix is supplied.  This work needs to be planned and committed to an iteration just like any other piece of work.  We have created queries to total the amount of time spent 'working' these defects, which helps to explain delays in test progress when testers are working on defects.

Showing progress
We can now show our progress by explaining the work we have committed to in an iteration and our burn-down throughout the iteration.  Dropping the simple metric of test cases prompts people to ask more interesting questions to determine the quality of the project:
  • What types of testing have we done?
  • What defects have we found?
  • How much time do we need to complete the high priority work?
All more valid and useful than a single percentage of test case completion.

We have been using test stories as a replacement for test cases in the system test team for a few releases.  Using them has given us the following benefits:
  • We can see the total amount of test work that needs to be done, including prep, defect recreation and verification as well as the actual testing, so we can plan our iterations a lot better.
  • Tracking is a lot easier, as we can understand the duration as well as the complexity of the work rather than relying on simple counting measures.
  • We are starting to ask better questions about the state of the test project.

We are still not perfect.  We are still refining our process to estimate tasks better and to ensure we don't fall into the trap of just counting test stories.  Overall, though, we have seen an improvement in our efficiency.  Who needs test cases?  Not us!