
Tuesday, 20 September 2011

Preconditions

For a few hours every day I have to test a Death Star-like application. I know there is nothing special about that, but this DS is a naughty one. It has billions of settings, and because it runs in a shared environment, it often happens that something was changed in the CMS shortly before testing and QA does not know about it. So when we run the tests we always get hundreds of false positive results. (That is better than a false negative, but it is annoying.)

So on a quiet day I decided to create a Precondition framework.
It has two main goals:

  1. Make it possible to add preconditions to every test we have, in a simple and common format.
  2. Make it possible to evaluate those preconditions before the test run and skip the tests with a message if a precondition fails.
The framework has to be somewhat independent of the test framework (at least the first part), and could be independent of the actual level and type of testing (unit, integration, UI, API...).

So what does it look like? Like a Thermozodium esakii.

The marking part is annotation based. You can annotate your test methods and your test classes with one @Preconditions, which holds one or more @Precondition entries. The @Precondition has the following parameters:

  • id = the unique id (at class level) of this precondition; it can be used for linking.
  • type = any value from the ConditionType enum (to collect the precondition types in one place).
  • args = the arguments required for the evaluation.
  • skip = skips the evaluation for a given value of a given test parameter (see later).
  • description = a short description which will appear in the message of the SkipException.
You can mark your class fields with a @TestParameter annotation, which links the field to a @Precondition so that the field's value can be checked against the args of that precondition (see the examples below).

The evaluation is triggered by a TestNG listener class which implements IInvokedMethodListener. In its beforeInvocation method it gathers the required annotations, calls the class responsible for the evaluation, and throws an exception if the precondition is not met.

So you can define any type of condition: you only have to implement it in the *.conditions package (or elsewhere), create an enum value for it, and add it to the beforeInvocation method of the listener (it's a huge else-if... no better solution yet).
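One possible way around the huge else-if, as a minimal sketch and not what the framework does today: each enum constant carries its own evaluator, so adding a condition type means adding one constant. The evaluator signature, the alwaysMet constant and PreconditionNotMetException are assumptions made up for this illustration; in the real listener the check would presumably run inside beforeInvocation and throw TestNG's SkipException instead.

```java
import java.util.function.BiPredicate;

final class ConditionDispatchSketch {

    /** Each constant carries its own evaluator, so no else-if chain is needed. */
    enum ConditionType {
        // Met when the actual value equals the first expected argument.
        parameterCheck((actual, args) -> args.length > 0 && args[0].equals(actual)),
        // Hypothetical second type, only here to show how a new type is added.
        alwaysMet((actual, args) -> true);

        private final BiPredicate<String, String[]> evaluator;

        ConditionType(BiPredicate<String, String[]> evaluator) {
            this.evaluator = evaluator;
        }

        boolean isMet(String actual, String... args) {
            return evaluator.test(actual, args);
        }
    }

    /** Stand-in for TestNG's SkipException (assumption: keeps the sketch dependency free). */
    static class PreconditionNotMetException extends RuntimeException {
        PreconditionNotMetException(String message) {
            super(message);
        }
    }

    /** Throws if the condition of the given type is not met for the actual value. */
    static void check(ConditionType type, String description, String actual, String... args) {
        if (!type.isMet(actual, args)) {
            throw new PreconditionNotMetException("Precondition not met: " + description);
        }
    }
}
```

Calling ConditionDispatchSketch.check(ConditionType.parameterCheck, "field has the required value", actualValue, "YouCanDoIt") would then either return quietly or throw with the description in the message.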

Here are a few annotation examples (using TestNG):

 public class MyGreatTest{  
   /**
   * The test field which should have a specified value
   */
   @TestParameter(conditionID="testParamCondition")  
   public String myTestParameter;  

   /**
   * The field gets its value here
   */
   @BeforeMethod  
   public void setUp(){  
     //... do something  
     myTestParameter = someMethod();  
   }  

   /**
   * A precondition which checks the value of a single parameter (of course the check itself has to be implemented beforehand)
   */
   @Preconditions(@Precondition(type=ConditionType.parameterCheck,args={"YouCanDoIt"},id="testParamCondition",description="To check the field has the required value"))
   @Test  
   public void firstTest(){ 

   }  

   /**
   * This example is one of our most common ones. Based on the values of the parameters of the test method we check something in the CMS, and if the result is an empty set we skip the test. When needed, we can skip the evaluation itself by defining the skip argument. It contains the position and the value of the test parameter we want to check. So in this case we define: if the first parameter (which is firstParam here) has the value 'Atlantis', skip the checking of this condition.
   */
   @Preconditions(@Precondition(type=ConditionType.checkSomethingInACMSByXpath,args={"/universe//planet//region[@='Aqua']"},skip={"1","Atlantis"},description="This test requires an Aqua type region, except when the first parameter has the value: Atlantis"))
   @Test
   public void secondTest(String firstParam, Integer secondParam){

   }

 }  
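The skip rule of the second example can be sketched like this. It is only an illustration under assumptions: the annotation definitions are trimmed down (no type parameter), the position is taken as 1-based because the example calls "1" the first parameter, and the class and method names (SkipRuleSketch, evaluationSkipped, preconditionsOf) are made up for this sketch.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

final class SkipRuleSketch {

    /** Trimmed-down @Precondition; only usable inside @Preconditions. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target({})
    @interface Precondition {
        String id() default "";
        String[] args() default {};
        String[] skip() default {};
        String description() default "";
    }

    /** Container annotation for test methods and test classes. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.METHOD, ElementType.TYPE})
    @interface Preconditions {
        Precondition[] value();
    }

    /** Hypothetical test class mirroring secondTest from the example. */
    static class Sample {
        @Preconditions(@Precondition(skip = {"1", "Atlantis"}, description = "requires an Aqua type region"))
        public void secondTest(String firstParam, Integer secondParam) {
        }
    }

    /** True when skip = {position, value} matches the actual test parameters. */
    static boolean evaluationSkipped(Precondition precondition, Object... testParams) {
        String[] skip = precondition.skip();
        if (skip.length != 2) {
            return false; // no skip rule defined, always evaluate
        }
        int index = Integer.parseInt(skip[0]) - 1; // assumed 1-based position
        return index >= 0 && index < testParams.length
                && skip[1].equals(String.valueOf(testParams[index]));
    }

    /** Reads the @Precondition entries of a method via reflection. */
    static Precondition[] preconditionsOf(Class<?> type, String methodName, Class<?>... paramTypes) {
        try {
            Method method = type.getDeclaredMethod(methodName, paramTypes);
            Preconditions found = method.getAnnotation(Preconditions.class);
            return found == null ? new Precondition[0] : found.value();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such test method: " + methodName, e);
        }
    }
}
```

With these pieces, a listener could first ask evaluationSkipped for each precondition and only call the real (and possibly slow) CMS check when it returns false.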

Why is it good for us? Of course you could check the precondition in the before method and skip the test if it does not match... But if you have 1000+ tests, it is useful to separate your tests from the precondition checking: you don't have to change the code in a lot of places, and you can easily skip the evaluation by removing the listener for the actual run.

Wednesday, 7 September 2011

Quality measurement part 2

Hopefully I won't tell you anything you don't already know. Why then am I writing these things down? Just for the writing itself, and because of a little hope that I might say something useful.

So how do we want to generate our numbers? (I hope that in time, once we can see the scales, I can convert these numbers into something funnier, like rock bands, action figures or poets.)

We live on the shady island of Java, which determines a few things. Sonar is our system for displaying metrics. It is cool, as I wrote before, and I don't want to adore it again. Sonar uses PMD and FindBugs for static analysis (and has many, many other metrics like LOC, Complexity, LCOM4...) and can use JaCoCo and Cobertura (and others) for code coverage. It also has a plugin for Jira which can store the number of tickets based on a custom filter, grouped by their severity. So we have almost all the pieces in place.

Almost...

Sonar has a plugin called TotalQuality... Sounds like it does what we need. You can configure it and change the calculation, but you cannot extend its sources... well, you can extend it if you are clever enough (I'm not...). It does not use the JaCoCo plugin (IT code coverage) or Jira... which is sad.

So I've created two requests in their Jira, and I pray to God, Emilio and Martin to implement them...

Friday, 2 September 2011

Coverage to bring them all

After the sweet success with JaCoCo (the project was called off, but I brought everything I could out of it), I decided to jump to the next level.

My old problem with integration test coverage (we use TeamCity) is that it eats a lot of memory... One of our projects has 4 war files which have to be deployed in an embedded forked container (Cargo), and the regression suite has 3000 tests, lots of them generated with a TestNG DataProvider (always use the Iterator variant).
The build agent has 6GB RAM but it is not enough. I could reserve 4GB but it always died with OOM... So we did not have coverage (and static analysis results) for weeks...

So I decided to use JaCoCo here as well. Not because it uses less memory, but because it can be used via TCP. And it is awesome!

We can separate the test run from the container of the application under test and still gather the coverage easily.
On our development server I've set up one of our Tomcats to run with JaCoCo (I put a line into catalina.sh, even I could do it. The JaCoCo team says it won't eat too much memory, so you can run it like this).
To gather the results we modified one of the JaCoCo sample codes (provided by the developers) and created a simple jar from it. We pass it the IP and the port of the JaCoCo agent, the file name used to save the coverage file, and a flag to configure the data flush (keep the results or flush them). So after the failsafe plugin, in the post-integration-test phase, we can call it with a simple exec plugin... And we get the exec file just as if we had started the container with the Maven build.

OK, there is one addition. Before starting this Maven build I've set up a dependency on our regular remote deploy TeamCity configurations....

But there were problems....


  • One of our projects has a clustered environment... no problem, we've modified the data collector to accept a list of IPs and to append the last segment of each IP to the data file name property. Once we had the exec files from all head-ends, we used the antrun plugin to merge them into one. (Unfortunately the JaCoCo Maven plugin does not support this feature yet.)

So it's solved....
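The file naming for the cluster can be sketched roughly like this. It is only an illustration: the "." separator between the file name and the IP segment, and the names ExecFileNameSketch, lastSegment and fileNamesFor, are assumptions and not taken from our collector.

```java
import java.util.ArrayList;
import java.util.List;

final class ExecFileNameSketch {

    /** Returns the part after the last dot, e.g. "11" for "10.0.0.11". */
    static String lastSegment(String ip) {
        int dot = ip.lastIndexOf('.');
        return dot < 0 ? ip : ip.substring(dot + 1);
    }

    /** One coverage file name per head-end, built from the shared data file name. */
    static List<String> fileNamesFor(String dataFile, List<String> ips) {
        List<String> names = new ArrayList<>();
        for (String ip : ips) {
            // Assumed separator: a dot between the file name and the IP segment.
            names.add(dataFile + "." + lastSegment(ip.trim()));
        }
        return names;
    }
}
```

Every head-end gets its own exec file this way, so the later merge step has a distinct input per node as long as the last IP segments differ.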


  • One of our projects has a client-side application (this is used by the tests).... no problem, use the misunderstood config of the failsafe plugin... now we have a JaCoCo agent in the failsafe's JVM (we have to fork failsafe to flush the results at the end of the tests).... And in post-integration-test do the merge...


Now only one thing's left.... use the results :)

Coverage to rule them all

There are two things on my mind. Let's start with thing B:

I can't say it was love at first sight... I knew immediately that she was something new and good and effective, but she was so strange and uncommon... so I gave up after a few dates. But a few weeks ago I had to meet her again. Mizz JaCoCo... Previously I was always with Cobertura. She is simple, easy, a little mysterious, but to be honest, really hard to use for what I had to do...

To understand the situation I have to introduce the other party member: Sonar. It is unbelievable, free and knows a lot. But it is tied to JaCoCo if you have to store both integration and unit test coverage for a project. And we had to. So I downloaded JaCoCo again and read a few examples on the Sonar page and on its documentation site. Maybe I'm stupid, but something was wrong with them. I couldn't make it work.... I read them again and again and again, and I realized that I was stupid. I was following the post too blindly. JaCoCo is an on-the-fly instrumentation tool. It instruments anything in the JVM. Why the heck do they give the bloody -javaagent to the failsafe plugin? It was there in every example, but it is nonsense. And I realized it is meaningless (not every time, but get over it); I should have passed it to the Cargo container... (now I know that if I had not forked the container, and had used another Maven process to gather the coverage, I could have reached the same result), but I can't understand why they hide this little piece of information from the readers. Or am I the one-and-only idiot who can't figure this out immediately? (I know, I know...)

But after the revelation I could create a simple Maven build to:
1. compile the project
2. instrument it with Cobertura
3. run unit tests against it
4. package the application without Cobertura
5. deploy it to Cargo (as soon as the Jetty plugin can be forked it can be used as well)
6. run the integration tests against it
7. gather the coverage
8. install it to the local repo (end of the install phase, or we could run to the verify phase... who cares :) )
and run sonar:sonar to:
1. analyse the coverage files
2. analyse with FindBugs and so on...
3. put the results into Sonar

And it was eeeaaasssy.
So beware! Don't be stupid! Give JaCoCo to the container...

Thursday, 25 August 2011

Quality measurement part 1

My Colonel asked my Regimental Commander who asked my Master Chief to find out how to measure the Quality of Software. I know it is not unique, and there are hundreds of pages and calculations out there in the Universe, but my mind collapsed when it tried to find out how.

I went through all phases of protesting: laughing, crying, bargaining...
We had to create a machine which eats numbers and spits out a number, a beautiful, magical number which is comparable, descriptive, short and tells everything about quality.

At one of these stages I tried to make fun of it, to show its incongruity. I created metrics like:

  • # of [public methods | lines of code | methods ] / years of experience of the team
It is definitely about quality: it shows the years-of-experience index of each method... (of course presupposing that experience is related to quality)

But in the end we had to come up with something else that we could show to the Colonel. The problem is that we can measure billions of things, from the architecture to the number of open tickets, the number of exceptions in the logs, the Customer Support calls, or whatever we can imagine. But we would not reach the Quality, only parts of it, a few of its aspects. I can imagine code which has low complexity and is well commented, but is poorly designed, or has hundreds of open tickets in its issue tracking system, or has a UI with a bad user experience, and so on...
What I said to my superiors was that we cannot produce the number our Colonel wants, because it is impossible. We have to identify the measurable factors of quality, gather the numbers, create one number from these numbers, and define where this number comes from and what it means (definitely not the Quality).

We've found these main factors:
  • Results of static and dynamic code analysis
  • Bugs and feature requests in the tracking system
(hopefully the open tickets can cover the user experience area, the fitness for purpose, and anything else that we cannot get from a tool)

We must say from the very beginning that every project has its own number, and that we cannot compare or confront these numbers between projects. And this number itself still does not mean anything; only its change over time does.
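To make the idea of "one number from many numbers, watched over time" a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the weighted average, the factor names and the weights are made up, and it is not the calculation we ended up with.

```java
import java.util.Map;

final class QualityIndexSketch {

    /** Weighted average of the factor scores; every weighted factor must have a score. */
    static double index(Map<String, Double> scores, Map<String, Double> weights) {
        double weightedSum = 0;
        double totalWeight = 0;
        for (Map.Entry<String, Double> weight : weights.entrySet()) {
            Double score = scores.get(weight.getKey());
            if (score == null) {
                throw new IllegalArgumentException("Missing factor: " + weight.getKey());
            }
            weightedSum += score * weight.getValue();
            totalWeight += weight.getValue();
        }
        if (totalWeight == 0) {
            throw new IllegalArgumentException("At least one non-zero weight is required");
        }
        return weightedSum / totalWeight;
    }

    /** The only value worth reporting: how the index of ONE project moved. */
    static double change(double previousIndex, double currentIndex) {
        return currentIndex - previousIndex;
    }
}
```

The point of the change method is the rule above: the index of a project is only ever compared with an earlier index of the same project, never with another project's number.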

And when we reached this point, I found it interesting: it is feasible, it shows something, it is understandable (mostly, and abstract enough) for the business, and acceptable to engineering.

Great.

What do you think?

The details come in the 2nd part...