Tuesday, 16 June 2009

C++: #include <rules>

My blood is boiling. I'm seething. I'm going to go mad. Is that an overstatement? Possibly. But it's not far off...

You see, you can get so used to doing things the Right Way that when you stumble across someone doing it the Wrong Way it comes as quite a shock. And a frustration.

Lately I've been working on a C++ project with appalling include file discipline. It's embarrassingly bad. There is a well-known gentleman's agreement over include files. A #include <etiquette> if you like. Doesn't everyone know about this?

(Of course, many programmers will cite the fact that C and C++ require such "good practice", rather than ENFORCE it, as a weakness of the language. Perhaps they are correct, but that's a different story.)

The most basic of the #include <rules> are:

1. A header file must be self-contained and complete.

#include-ing it CANNOT produce a build error.

It does not require you to include more files first. Anything it references is either forward declared (if possible) or #included in that file.

If this is not the case, the user has to jump through innumerable hoops to work out exactly where the undefined bits of code live, and so what other files must be included first. Naturally, when those files do not include cleanly the task recurses painfully.

It's annoying. It's wrong. It makes extra work, reduces the code's self-understandability, and opens the door for errors (for example: the wrong files may be included; and it is possible to write header files that behave differently when different sets of include files are brought in beforehand).
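As a minimal sketch of rule 1, here is what a self-contained header might look like. The file name widget.h, the guard name, and the classes Widget and Logger are all made up for illustration. The header #includes what it uses by value and forward declares what it only points at.

```cpp
// widget.h (hypothetical file name, for illustration only)
#ifndef EXAMPLE_PROJECT_WIDGET_H
#define EXAMPLE_PROJECT_WIDGET_H

#include <string>   // std::string is stored by value, so its full definition is required
#include <utility>  // std::move

class Logger;       // only used through a pointer, so a forward declaration is enough

class Widget {
public:
    explicit Widget(std::string name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }

    // Storing a pointer to an incomplete type is fine; no Logger header is needed here.
    void attach(Logger* logger) { logger_ = logger; }
    bool has_logger() const { return logger_ != nullptr; }

private:
    std::string name_;
    Logger* logger_ = nullptr;
};

#endif // EXAMPLE_PROJECT_WIDGET_H
```

One cheap way to keep yourself honest is to make the header the very first #include in its own .cpp file. If it secretly depends on something included earlier elsewhere, that file stops compiling.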

2. Include files internally prevent problems from multiple inclusion

The canonical format for a header file is:
#ifndef SOME_WELL_CHOSEN_RANDOM_UNIQUE_NAME
#define SOME_WELL_CHOSEN_RANDOM_UNIQUE_NAME

// any required #includes go here

// header file contents go here

#endif
That "unique" name should be well chosen, and usually based on the name of the include file in question. It should also include the name of the project, and possibly the name of the subsystem, too, in order to avoid conflicts with header files you might be importing from any third-party libraries.

Many compilers provide #pragma once as a helpful way to write the same thing. However, this is NOT a standard C or C++ feature. It's cute, but it means that your code is not portable. Does this matter to you? It really ought to. The best advice is to use #pragma once AND the standard include guards together.

If you do not do this, then multiple includes of a header file will almost certainly generate build errors as the compiler sees re-definitions of the same code constructs.
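To see the guard doing its job, here is a small sketch that fakes two inclusions of the same header by pasting its contents twice into one file. The header name point.h, the guard name, and the Point type are hypothetical.

```cpp
// ---- first "#include" of a hypothetical point.h ----
#ifndef MYPROJECT_GEOMETRY_POINT_H
#define MYPROJECT_GEOMETRY_POINT_H

struct Point {
    int x;
    int y;
};

inline int coordinate_sum(const Point& p) { return p.x + p.y; }

#endif // MYPROJECT_GEOMETRY_POINT_H

// ---- second "#include" of the same header ----
// The guard macro is already defined, so the preprocessor skips everything
// up to the matching #endif and the compiler never sees a second Point.
#ifndef MYPROJECT_GEOMETRY_POINT_H
#define MYPROJECT_GEOMETRY_POINT_H

struct Point {
    int x;
    int y;
};

inline int coordinate_sum(const Point& p) { return p.x + p.y; }

#endif // MYPROJECT_GEOMETRY_POINT_H
```

Delete the guard lines and the second copy becomes a redefinition of Point, which is exactly the build error described above.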

Of course, some (very few, comparatively) headers are designed for multiple inclusion (e.g. by defining types based on some preset #define value that you establish prior to the #include). You CAN still define include guards for these headers using a little preprocessor ## macro string gluing.
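For the curious, here is a rough sketch of what such a multiple-inclusion header can look like. The body of a hypothetical pair_template.h is pasted inline twice to stand in for two #includes, and every EXAMPLE_ macro name is invented for illustration. It only shows ## gluing together the generated type names; it does not attempt per-variation include guards.

```cpp
// Two-level paste so the arguments are macro-expanded before ## glues them.
#define EXAMPLE_PASTE2(a, b) a##b
#define EXAMPLE_PASTE(a, b) EXAMPLE_PASTE2(a, b)

// ---- first inclusion: the includer sets the parameters beforehand ----
#define EXAMPLE_ELEM_TYPE int
#define EXAMPLE_ELEM_NAME Int
// (body of the hypothetical pair_template.h starts here)
struct EXAMPLE_PASTE(PairOf, EXAMPLE_ELEM_NAME) {   // becomes PairOfInt
    EXAMPLE_ELEM_TYPE first;
    EXAMPLE_ELEM_TYPE second;
};
#undef EXAMPLE_ELEM_TYPE   // the header cleans up so it can be included again
#undef EXAMPLE_ELEM_NAME
// (body ends here)

// ---- second inclusion with different parameters ----
#define EXAMPLE_ELEM_TYPE double
#define EXAMPLE_ELEM_NAME Double
struct EXAMPLE_PASTE(PairOf, EXAMPLE_ELEM_NAME) {   // becomes PairOfDouble
    EXAMPLE_ELEM_TYPE first;
    EXAMPLE_ELEM_TYPE second;
};
#undef EXAMPLE_ELEM_TYPE
#undef EXAMPLE_ELEM_NAME
```

In C++ a template would normally do this job better; the sketch is only meant to show the shape of the trick.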

Objective-C provides an interesting alternative to #include guards or #pragma once: the #import directive. This states at the include site that if the file has already been included, do not include it again. Otherwise go ahead and include it now.

It's cute, but it is just plain wrong. The calling site is NOT the place to specify if a file should be included one time only. This is a part of the contract provided and required by the include file, and so should be stated and enforced there. Also, import and include can be freely interchanged on the same files. The user should not be able to break the contract by accidentally including rather than importing the header.

Precompiled header files are another source of weirdness that I don't have the time to moan about properly here. They bring their own set of potential misuses.

3. Define things in ONE FILE ONLY

Do not have multiple #include files with different declarations of the same name.

This is a violation of the ODR (One Definition Rule). If the files define specific variations that are needed for different build configurations or targets, then still put them all in the same file. DO NOT leave it up to the #include-r to work out which of the files to use themselves. Any silly #ifdef determination logic should be INTERNAL to the header file, not external at the include-site.

It's obvious what will happen if you ignore this rule. Someone will accidentally include the wrong header version somewhere, and cause errors. Probably subtle, odd, and hard to track down runtime errors, too.
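A minimal sketch of keeping the selection logic inside one header, assuming a hypothetical platform_config.h and invented constant names. The includer writes a single #include and never has to choose between per-platform files.

```cpp
// platform_config.h (hypothetical): one file, all variations, selected internally
#ifndef EXAMPLE_PROJECT_PLATFORM_CONFIG_H
#define EXAMPLE_PROJECT_PLATFORM_CONFIG_H

#if defined(_WIN32)
constexpr char kPathSeparator = '\\';
constexpr const char* kPlatformName = "windows";
#elif defined(__APPLE__)
constexpr char kPathSeparator = '/';
constexpr const char* kPlatformName = "apple";
#else
// Fallback for everything else; an assumption made for this sketch.
constexpr char kPathSeparator = '/';
constexpr const char* kPlatformName = "generic";
#endif

#endif // EXAMPLE_PROJECT_PLATFORM_CONFIG_H
```

The alternative, with platform_config_win.h and platform_config_mac.h both declaring kPathSeparator, leaves every include site one typo away from the wrong definition.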

4. Differentiate "public" interface header files from implementation files

Many projects contain subcomponents with a small number of "public" header files defining their interface, and many internal .h files for implementation classes.

Differentiate these files.

This should be done by placing them in very different file locations. This will prevent a subcomponent interface user from accidentally including an internal header and using it as if it's part of the public interface.
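One possible layout, sketched with an invented subcomponent called audio_engine (the directory and file names are assumptions, not anything from the project in question):

```text
audio_engine/
    include/
        audio_engine/
            engine.h      public: what clients are allowed to #include
            stream.h      public
    src/
        engine.cpp
        mixer.h           internal: never on a client's include path
        mixer.cpp
```

If only audio_engine/include is added to client include paths, an attempt to #include "mixer.h" from outside the subcomponent simply fails to find the file.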



There are more good practice rules than those. But these are the most basic and important ones.

The project I've been working on breaks these rules all over the place and it makes working with the code really, really complicated. Yes, I'm moaning. But we should know better than this by now. And we really should have developed sharper standard tools for this by now.

The greater the size of the project, the harder these problems are to fix. I'd love to dive in and sort the whole thing out in the current project, but I fear it will take a very, very long time. It's particularly hard as it's a large codebase that builds on several platforms with a number of compilers.

4 comments:

rogero said...

I share your pain...
it seems so obvious I can't understand why otherwise sane developers get this wrong :-)

the_exile said...

I feel your pain too.

I wonder what your subconscious was thinking when you typed ODF!

I was musing the other day (as I was extolling the virtues of forward declarations) on the megalithic header that includes all the other headers in a magic order and then is included by each source file. Why was that ever thought to be a good idea? I suspect Microsoft must take some blame with stdafx, but I've seen it in plenty of non VC++ code over the years. Not recently though - if I did I suspect my blood would start to boil too.

Pete Goodliffe said...

Heh. I have open office on the brain, clearly! (Just been writing my next column).

Don't get me started on stdafx. Just don't.

We have a stdafx that doesn't compile by itself. Sigh.

Hitesh Kumar said...

Header file in c++
Great keep sharing