Saturday, 7 March 2009

It's not unusual

They say that some people see the glass half full, some see it half empty. But most programmers don't see the glass at all; they write code that simply does not consider unusual situations. They are neither optimists nor pessimists. They are not even realists. They're ignorists.

When writing your code, don't consider only the thread of execution you expect to happen. At every step, consider all of the unusual things that might occur, no matter how unlikely you think they'll be.

For example:

Errors

Any function you make a call to may not work as you expect.
  • If you are lucky, it will return an error code to signal this. If so, you should check that value; never ignore it.
  • The function might throw an exception if it cannot honour its contract. In this case, ensure that your code will cope with an exception bubbling up through it. Should you catch the exception and handle it, or allow it to pass further up the call stack? If you allow it to rise upwards does your function leak resources or leave the program in an invalid state in the process?
  • Or the function might return no indication of failure, but silently not do what you expected. You ask a function to print a message; will it always print it? Might it sometimes fail and consume the message?
Always consider errors that you can recover from, and write that recovery code. Consider also the errors that you cannot recover from, such as storage failures or running out of critical resources. Even if you can't recover in these circumstances, write your code to do the best thing possible; don't just ignore the failure.
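To make the points above concrete, here is a minimal Python sketch. The names write_atomically and save_settings are hypothetical, invented for this illustration. The first function lets an exception pass up the call stack without leaking its temporary file or damaging the existing file; the second catches the failures it can sensibly report and tells its caller what happened.

```python
import os
import tempfile


def write_atomically(path, text):
    """Replace `path` with `text`, or raise and leave `path` untouched."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so a failure cannot corrupt `path`.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        # Let the exception rise, but do not leak the temporary file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def save_settings(path, text):
    """Return True on success; report the failure and return False otherwise."""
    try:
        write_atomically(path, text)
    except OSError as exc:
        # A failure we can describe: say so, rather than swallow it.
        print(f"could not save {path}: {exc}")
        return False
    return True
```

The caller of save_settings now has a return value to check, which is only useful if it actually checks it.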

Threading

Too many programmers try to work in an idealistic single-threaded bliss. However, the world has moved around them to a complex, highly threaded environment. Reasoning about what might happen in your code in the presence of multiple threads is much harder.

Unusual interactions between pieces of code are a staple here, and it's hard to enumerate every possible interweaving of code paths, let alone reproduce one particular problematic interaction more than once.

To tame this level of unpredictability, make sure you understand basic concurrency principles, how to ensure mutual exclusion in code blocks and how to decouple threads so they cannot interact in dangerous ways.
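As one small illustration of mutual exclusion, here is a sketch in Python using threading.Lock from the standard library. The Counter class and count_in_parallel function are made-up names for this example. The increment is a read-modify-write sequence, so the lock ensures that only one thread at a time runs it.

```python
import threading


class Counter:
    """A counter that several threads can increment safely."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        # Read-modify-write is not atomic, so guard it with the lock.
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


def count_in_parallel(thread_count, increments_each):
    """Increment one shared Counter from several threads; return the total."""
    counter = Counter()

    def work():
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for every worker before reading the result
    return counter.value()
```

With the lock in place, count_in_parallel(4, 10_000) returns 40000 every time. Without it, the answer would depend on how the threads happened to interleave.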

Understand mechanisms for the safe creation and destruction of objects in threads, and the correct primitives to reliably and quickly pass messages between thread contexts without introducing race conditions or blocking the threads unnecessarily.
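One way to decouple threads is to give each piece of state a single owning thread and have everyone else send it messages. The sketch below assumes Python's standard queue.Queue as the message-passing primitive; sum_on_owner_thread is a hypothetical name for this example. Only the owner thread touches the running total, so the total itself needs no lock.

```python
import queue
import threading


def sum_on_owner_thread(chunks):
    """Sum numbers produced by many threads on a single owner thread."""
    inbox = queue.Queue()  # thread-safe channel between the threads
    done = object()        # sentinel message: no more numbers will arrive
    result = []            # written by the owner, read only after join()

    def owner():
        total = 0          # state owned by this thread alone
        while True:
            message = inbox.get()
            if message is done:
                break
            total += message
        result.append(total)

    def producer(chunk):
        for number in chunk:
            inbox.put(number)

    owner_thread = threading.Thread(target=owner)
    producers = [threading.Thread(target=producer, args=(chunk,))
                 for chunk in chunks]
    owner_thread.start()
    for thread in producers:
        thread.start()
    for thread in producers:
        thread.join()
    inbox.put(done)        # sent only after every producer has finished
    owner_thread.join()
    return result[0]
```

The producers never wait on the owner's work and never see its state; the queue is the only point of contact between them.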

Shutdown

We make great plans for how to construct our systems: how to create all the objects, how to get all the plates to spin, and how to keep those objects running and those plates spinning.

Less attention is given to the other end of the lifecycle: how to bring the code to a graceful halt without leaking resources, locking up, or going wrong.

Shutting down your system and destroying all the objects is especially hard in a multi-threaded system. Objects that depend on inter-thread interactions can be particularly prone to subtle destruction problems: deadlock, dependency inversion and the like.

As your application shuts down and destroys its worker objects, make sure you can't leave one object attempting to use another that has already been deleted. Don't enqueue threaded callbacks that target objects which have been deleted on another thread.
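Here is a minimal sketch of one shutdown discipline, in Python. The Worker class is hypothetical and is only one of many possible designs. It refuses new jobs once shutdown has begun, finishes the jobs already queued, and then waits for its thread to end. It assumes close is called from a thread other than the worker's own.

```python
import queue
import threading


class Worker:
    """Runs submitted jobs on one background thread until closed."""

    _STOP = object()  # sentinel that tells the thread to finish

    def __init__(self):
        self._jobs = queue.Queue()
        self._lock = threading.Lock()
        self._closed = False
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def submit(self, job):
        """Queue a callable; return False if the worker is shutting down."""
        with self._lock:
            if self._closed:
                return False  # never enqueue behind the stop sentinel
            self._jobs.put(job)
            return True

    def close(self, timeout=5.0):
        """Stop accepting jobs, run those already queued, wait for the thread.

        Returns True if the thread finished within `timeout` seconds.
        """
        with self._lock:
            if not self._closed:
                self._closed = True
                self._jobs.put(self._STOP)
        self._thread.join(timeout)
        return not self._thread.is_alive()

    def _run(self):
        while True:
            job = self._jobs.get()
            if job is self._STOP:
                return
            try:
                job()
            except Exception as exc:
                # One failing job must not kill the worker; report it.
                print(f"job failed: {exc!r}")
```

Because submit and close share a lock, a job can never land in the queue after the stop sentinel, so nothing is left waiting on a thread that has already gone. Calling close a second time is harmless.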

Make sure that you thoroughly test your code for destruction issues as well as construction issues. Test it in an environment where you can control threading issues with mock objects, but also cover it with integration tests that glue real parts of the system together — the particular timings and order of operations there may highlight other shutdown problems.

The moral of the story

The unexpected is not the unusual. You need to write your code in the light of this.

It's important to think about these issues early on in your code development. You can't tack on this kind of correctness as an afterthought; the problems are insidious and run deeply into the grain of your code. Such demons are very hard to exorcise after the code has been fleshed out.

Writing good code is not about being an optimist or a pessimist. It's not about how much water is in the glass right now. It's about making a water-tight glass so that there will be no spillages, no matter how much water the glass contains.
