Tuesday, December 30, 2008

C/C++ memory management - program insanities listed

This is a continuation of my entry from several days ago.

There are different kinds of memory-related issues, and they usually end in a runtime error. Let's try to define the categories (each one is illustrated in the sketch after the list):
- memory leaks - memory is allocated but never freed, so the program 'grows' in memory usage roughly in proportion to how long it runs, which usually makes it unresponsive over time. This one is quite difficult to diagnose without memory debugging monitors at your disposal.
- dereferencing nulls - we try to access memory behind a pointer which was never allocated, which leads to an access violation runtime error right away. Fortunately this one is fairly easy to diagnose, as the cause is very close to the symptom.
- memory corruption - somewhere in our program's memory there is a region which doesn't contain the data we expect it to - either because the memory was never initialized, or because a write on a neighboring region went out of bounds. The program now contains a 'mine': as soon as we step on it (by reading from the corrupted region), incorrect data gets fed into the program and it starts behaving inconsistently. This usually leads to a runtime crash, preceded by strange and nondeterministic behavior in areas directly or indirectly dependent on the data read from the corrupted region. As with the first category, memory debuggers come in handy here as well.
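
To make the categories concrete, here is a minimal C++ sketch of my own (the function names and data are just for illustration), with one instance of each:

```cpp
#include <cstring>

void leak() {
    int* data = new int[1000];   // allocated...
    data[0] = 42;                // ...used...
    // ...but never freed: the missing 'delete[] data' makes every call leak
}

void null_dereference(int* p) {
    // if the caller never allocated anything behind p (p == nullptr),
    // this write fails with an access violation right away
    *p = 7;
}

void corruption() {
    char name[8];
    // writing 13 bytes into an 8-byte buffer is undefined behavior:
    // it may silently overwrite whatever lives next to 'name' in memory,
    // planting a 'mine' that explodes only when the neighbor is read later
    std::strcpy(name, "way too long");
}
```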

Sunday, December 28, 2008

Very interesting website for people in software testing business

I would like to recommend a website I've just found: Randy Rice's Software Testing Site - A Resource for Software Testing Training and Consulting.

It has a lot of valuable content that software testing and QA professionals can benefit from.

Saturday, December 27, 2008

Software Economics

I found an interesting entry on stackoverflow.com which lists resources for Software Economics (not only Barry Boehm's book).

Check it out: Economics of Software Development

Wednesday, December 24, 2008

C/C++ memory management - responsibility which comes with power

When Java popped up - designed as an improvement over the then-prevalent C/C++ family - we didn't have to wait long for announcements of the soon-to-be-seen death of the C family. Java offered platform portability and internal memory management, and got rid of templates - a construct complicated enough that many developers got lost in it completely and avoided it at any cost. Java being a wonderful language, I still felt a bit attached to the last language trying to get as close to the processor and memory as possible. Luckily people started embedding software into nearly everything, virtual machine and garbage collection costs became easily translatable into cents and dollars per device, and C and C++ didn't go away. I could still put in my CV that I speak 'the last language for adults' and hope to get more out of it than just an indication of how old I must be.

But with being treated by the compiler as an adult comes responsibility: you have to manage things which your JVM parent would otherwise manage for you. If you act like a child, your applications will crash in the middle of document creation, behave in a non-deterministic fashion, slow down and potentially make the whole operating system unresponsive.

For anybody who has developed something in C or C++ it is probably obvious already what all those symptoms usually mean: incorrect memory/resource management. Memory is not freed or is double-freed, is read before it is allocated or initialized, or is written out of bounds.

If we are lucky the application crashes right away (with an access violation), but in most cases the 'patient starts showing symptoms long after (in instruction time) the initial infection'. That quite often leads to a great deal of frustration, as the causes are very difficult to trace back.
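
As a taste of how far apart infection and symptom can be, here is a tiny sketch (my own example, not from any particular project) combining an uninitialized read with a double free:

```cpp
#include <cstdlib>

int infected() {
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    if (p == nullptr) return -1;
    int result = *p;   // infection #1: read before any write - garbage value
    std::free(p);
    std::free(p);      // infection #2: double free corrupts heap bookkeeping;
                       // the crash may surface much later, in unrelated code
    return result;
}
```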

To be continued...

Tuesday, December 23, 2008

Test Management - what makes you good at it?

So you are a test manager and you are not sure whether you are good at it?

Let's think about it. First of all, what does your employer expect you to do? That's quite simple: lead a team of testers and realize some testing strategy (which is usually defined simply as 'find as many defects as possible' or 'assess the product's overall quality').

So first, you had better be good with people and disciplined enough to stick to a sensible testing process (which you can create yourself). If you want a reasonable testing process, you should approach the task in a structured and measurable manner (you need to make sure you tested everything, and you need to be able to provide some interpretation of the results). This is where the test strategy, test plan etc. come in handy. As you proceed into testing you should be generalizing observations, forming theories from symptoms, and actively building metrics and measurements which quantify the condition of the software.

The next important thing: you had better be thick-skinned. You will have to fight with the development team to get software to test on time; developers will not be happy that you found another input combination that crashes the application, or that you 'broke their deadline' by rejecting the product after simple acceptance testing. Rationally everybody knows that the messenger who brings bad news is just a messenger, but nobody likes him anyway. Your goals are somewhat opposite to those of the development manager (she is supposed to make the application work and you are supposed to break it (yes, yes, I know - just to show it was broken in the first place)), so it's easy to get into conflicts. If you have a deep need to please everybody, this is not the job for you.

And finally, you should be a mega-tester - the more hacking attitude the better (after all, it takes a lot of skill to act like a monkey at the computer). You will need all the traits the future users of the product will have: unbiased thinking, the creativity to move against intuition, creative ways to stress the application, etc.

If you have these skills, I believe you are well on the way to breaking many applications before they get broken after they leave your shop. Good luck.

Monday, December 22, 2008

Root Cause Analysis - practical approach

Check out the article by David M. Russell: http://www.daivrussell.com/Fishboning.pdf. It's a very good and pragmatic introduction to the method.

Quality Enabled Infrastructure - Bare Essentials

First of all, software quality seems like a very good idea. Fewer defects or misinterpreted requirements mean less rework, less maintenance, fewer lost opportunities.

And there is a large range of techniques, technologies and products which can help increase quality.

Shouldn't we use them all before we release the product?

The answer is simple: NO.

If you work on something which will be used only by you and two of your friends, I would even suggest accepting that it crashes from time to time, as long as it provides valuable results most of the time. Not to mention that a toolset for quality improvement can cost hundreds of thousands to buy. All it would give you is additional cost: time spent implementing it, learning it, using it, and interpreting its results.

On the other hand, if you are about to deploy your software to some hardware component which will leave your shop embedded in cars driving in all directions at an average speed of 50 miles per hour, you may want to make sure you will not need to call all those cars back for a software update. Then the more the merrier on the software quality boosters shelf.

Obviously there is a whole spectrum of software shops in between. How should they choose what to apply?

The answer isn't simple here, but I'm a believer in pragmatic approach to software quality which for me means: provide foundations and extend where and when it makes sense.

Foundations being:
* configuration management system
* facility to manage requirements (it can be Excel or even plain text file)
* facility to manage defect reports, enhancement requests
* some kind of quality indication (it can be as simple as the plain ratio of the number of defects to the number of requirements)

Analyzing those on a biweekly basis gives you information you can use to figure out what works and what doesn't, and whether applying some technique or technology would free or tie up your resources - and thus stretch or relax your constraints, be it budget, time etc. And then you can act on that information.

Thursday, December 18, 2008

Oral hygiene - word about automation

How do you make sure you still have teeth in your thirties? You look after them from day one. While your parents are catching up on sleep after several heavy 'mommy, my tooth is coming' nights, you should be starting to clean your teeth. Well, your parents should help teach you that.

Is it enough to know how to brush your teeth? Not really - far more important is to know why you should be doing it. Is it sufficient to brush them consistently? Still not - this special ritual has to be integrated with the other rituals you have. And what if you don't have any rituals? Well, if you don't sleep every 24 hours, or eat at least once a day, then the state of your teeth is not your biggest problem.

So we condition ourselves to brush our teeth right after waking up, or right after eating, or right before leaving for school. After some time it costs us more effort not to do it than to do it. What we are doing here is automating the process.

That takes off a lot of hassle - like checking whether the teeth need cleaning in the first place, figuring out where to fit it into our busy schedules, etc.

Additionally it's the only way to do anything consistently for a longer time period (try exercising every now and then and check your average).

The same goes for your project hygiene - if you want something done consistently, you have to make sure the required activities are integrated into your daily routine. Automation is the key. Fortunately computers come in handy there, as there is nothing they do better than sticking to the clock and doing repetitive tasks.

And believe me, if you want your project clean you need to brush it every single day. Set up your tasks, schedule them, and make sure the results are delivered every morning.

Soon enough a missing report in your mail will feel like dirty teeth - and you will still be in the software project in your thirties.

Wednesday, December 17, 2008

Programming Languages Popularity Index

Ever wondered which programming languages are hip these days? Or what the trends are? Or whether C++ is being replaced by Java? I've just found this website from Holland which calculates an index for exactly that.

Check it out: TIOBE Programming Community Index

The way it is calculated is very interesting:
"The TIOBE Programming Community index gives an indication of the popularity of programming languages. The index is updated once a month. The ratings are based on the number of skilled engineers world-wide, courses and third party vendors. The popular search engines Google, MSN, Yahoo!, and YouTube are used to calculate the ratings."

And a wonderful message: C and C++ are not going anywhere. :)

Code Duplication Removal ROI Calculator

I found an interesting website which provides the visitor with a nice return-on-investment calculator: http://www.semanticdesigns.com/Purchase/CloneCalc.html

The basis for the estimate is Brenda Baker's paper "On Finding Duplication and Near-Duplication in Large Software Systems." You can find it, for example, on IEEE Xplore (http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=514697&isnumber=11405).

Bottom line: on average 13%-20% of the code can be removed (as it is a result of code duplication), making maintenance of the project significantly cheaper. Assuming that the cost of code maintenance is proportional to the size of the code - up to 20% cheaper.

So if you are spending $70,000+ a year per software engineer and have 5 of them maintaining the code, that is $350,000+ a year in maintenance. Invest $5,000 in a code duplication removal product: at the optimistic 20% end that is $70,000 a year saved, so you get your money back in less than a month; even at a conservative 10% you end the year about $30,000 up in the blacks ($35,000 saved minus the $5,000 spent). On average, of course. Not bad.

Monday, December 8, 2008

Test cases and test clusters - partition inputs to save your time

Imagine that you have just finished writing your cool new feature. Now, what is the simplest check that the code really does what it is supposed to? Run it on a few simple inputs and check whether the outcome matches? Great technique. It has a limitation though - it only checks those specific cases (please see my last entry on that as well). How can we extend it and test whole classes of inputs? Well, there is a simple and elegant technique called equivalence partitioning which is about exactly that.

So what's the idea? We divide our input vectors based on the outcome they produce (different inputs go through different paths etc.), then work on representatives and assume that the rest of the inputs from a given class will behave similarly. For example, if we play with a function which calculates the absolute value, we have two partitions - positive numbers and negative numbers - and we need to test at least one of each to check whether our function works.
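
A minimal sketch of what that buys us, using std::abs() as the function under test (the choice of representatives is mine and arbitrary - any member of a partition would do):

```cpp
#include <cassert>
#include <cstdlib>

int main() {
    // one representative per partition stands in for the whole class
    // of inputs assumed to take the same path through std::abs()
    assert(std::abs(7)  == 7);    // representative of the positive partition
    assert(std::abs(-7) == 7);    // representative of the negative partition
    return 0;
}
```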

Given that the partitions are defined correctly (outcomes are similar for any two inputs from the same cluster) and completely (every input belongs to some cluster), we have a very powerful tool for testing software the easy way. We have found how to reduce the number of tests to the minimum while still covering all possible test scenarios. Nice job.

Unfortunately, in life, if a problem just got much simpler it usually means you delegated it somewhere else.

The task of figuring out the partitions is now where we will be spending our time. It's not trivial (more like very, very far from trivial) to figure out what the clusters are and where their boundaries lie - but it's still a very effective way to move from chaotic to structured testing.

How can you do it? By analyzing your code, most likely. You can derive sets of equations and reduce them by fixing some of the unknowns once the problem gets too complex (a top-down approach), or just run a test case through the debugger and build on it while stepping through decision points (bottom-up). Or any approach in between, really.

And one last note - as always, interesting things happen near or on the boundaries - that is where your inputs start to behave, well, let's call it unexpectedly. Make sure you have your guards there, as in the sketch below.
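
A short follow-up sketch of those guards, for the same std::abs() example (INT_MIN being my pick of an 'unexpected' boundary):

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>

int main() {
    // the boundary between the two partitions:
    assert(std::abs(0) == 0);
    // and the boundary of the type itself: -INT_MIN does not fit in an int,
    // so std::abs(INT_MIN) is undefined behavior - exactly the kind of
    // surprise that lives on the edges of a domain
    // assert(std::abs(INT_MIN) >= 0);   // do NOT count on this one
    return 0;
}
```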

Wednesday, December 3, 2008

User always chooses the wrong path - on data flowing through your code

Software development is so much fun

Developing code is a very cool activity, because of how creative and limitless it is, and because of its instantly rewarding nature. You figure out what needs to be done - you model it, write it down, and execute it almost immediately. Kaboom! A shiny new '4' is produced in your console as the answer to your '2' and '2' input. You just did an amazing piece of integer adding. You can move on to the next adventure.

And then software development suddenly is mundane

But then, can you really? I would suggest trying the inputs '3' and '2' as well, to make sure that your algorithm is adding (we expect '5') and not multiplying (which would get us to '6', and a bit of frustration when trying to claim money back from our savings account), or even better '3' and '4', to make sure it doesn't just add '2' to the first argument. For some time this can be a cool experience - you are still playing the 'what could I have missed' game - but long before we get to '2147483647' + '1' the boredom sneaks in and just kills all the fun.
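
For the record, a throwaway sketch of that 'what could I have missed' game, assuming a trivial add() of our own:

```cpp
#include <cassert>
#include <climits>

int add(int a, int b) { return a + b; }

int main() {
    assert(add(2, 2) == 4);   // consistent with adding... and with multiplying
    assert(add(3, 2) == 5);   // rules out multiplication (that would give 6)
    assert(add(3, 4) == 7);   // rules out 'always add 2 to the first argument'
    // and the case nobody reaches before the boredom kicks in:
    // add(INT_MAX, 1) overflows a signed int - undefined behavior
    return 0;
}
```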

What makes the whole thing worse is that you are locked into what you know about the code you've just written, and you haven't managed to develop your brain significantly since then, so you are stuck in the same mindset, with all its limiting consequences.

The bad news is that we have to do it and there is no way around it; the good news is that there are tools and techniques available to shortcut it. Today I would like to talk about one of them, called 'data flow analysis'.

Data flow analysis - what is it about?

So what is it? It is a static code analysis, in the sense that we don't run our software to get results; instead it tries to mimic the potential paths through the code, in search of specific path or data patterns which are for some reason interesting. When you think about it, this approach is much more powerful than just testing some of the paths - here we have all of them analyzed.

The idea is: let's assume we can collect all possible paths through our application, and then define the subset of 'something went wrong' paths. An extremely powerful idea - if it were entirely realizable, it would be equivalent to testing all possible inputs and conditions. And the world would be a different place, one where software is cheap. And a big part of software developers would be selling coffee in Starbucks.

Data flow analysis - how useful is it?

Unfortunately for software users, and fortunately for software developers and the wanna-be-actors currently selling coffee, it is not entirely possible. The calculus required is too complicated, the space of all possible paths and inputs too big to control. Does that mean it's useless? Not at all.

It's actually extremely useful and commonly used - the trick is to limit the 'all paths' set by imposing a maximum path length, a bound on the possible transitions within a path, etc. And we still get extremely valuable results - these algorithms find transitions and path segments you usually don't anticipate: simple setups which lead to exceptions raised from standard functions, rare paths which leak memory and resources, weird user scenarios which pump your collections with excess data. Running your code through such algorithms is actually an eye-opening experience - there is so much you hadn't anticipated while collecting your 'man of the hour' creativity reward earlier that day.
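
To show the kind of path such algorithms catch, here is a small made-up example of a rare path leaking memory - no happy-path test would ever exercise it, but a data flow analyzer enumerates it like any other:

```cpp
#include <stdexcept>

void process(bool rare_condition) {
    int* buffer = new int[1024];
    if (rare_condition) {
        // on this path 'buffer' is never freed - a leak a data flow
        // analyzer reports even though normal runs never come here
        throw std::runtime_error("unexpected input");
    }
    // ... normal work on buffer ...
    delete[] buffer;
}
```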

And remember - no matter how weird and rare these paths seem, they are the very ones people will follow as soon as they start using your software. Users are vicious when it comes to using our software - they don't add '2' and '2'; they just keep adding whatever they fancy, with no respect for the inputs we test against. Unless you know how to change this behavior, data flow analysis can help you 'prove' your software (in a limited but still powerful way).

Use it.