Writing and reading code
You’ve probably heard that a developer of an established software project writes an average of 100 lines of code (loc) a day. I can say that this applies to me. So if you write 100 loc per day, how many do you read? I would estimate that the amount of time you spend on reading and understanding code is significantly more than on writing code. You probably also spend quite some time debugging code.
If you spend so much more time on reading and debugging code than writing code, shouldn’t you put more effort in writing clean and debuggable code? The Samba codebase is pretty old, more than 15 years now. I would say we have some experience with bad code and we have started to write much better and cleaner code, because we have wasted so much time trying to understand and debug code. However there is still room for improvement. Lets take a look at the following C code snippet.
if (!a) { return; } if (!b) { return; } if (!c) { return; } if (!*d) { return; }
Can you guess from the code above what types of variables a, b, c and d are? The answer is no? Ok, lets take a look at the following code:
if (a == NULL) { return; } if (b == 0) { return; } if (c == false) { return; } if (d[0] == '\0') { return; }
If you look at the code now, you can probably guess what types they are. Well not exactly which type, but in which superset they are. ‘a’ is a pointer, ‘b’ is an integer type, ‘c’ is a bool and ‘d’ is a string (char array). If you write code the way shown above, you don’t have to scroll up to find out as which type the variable is defined. Most of the time it is enough to know what type of superset you are checking and why.
Think about this: If you spend just a bit more time on writing clean code now, you will spend less time on reading, understanding and debugging the code later if you have to find a bug.
Lets look at some more best practices we do in the Samba code:
bool ok; int rc; /* * bool return codes should always have the name 'ok' or * start with 'is_' or 'do_' */ ok = fn_returning_a_bool(); if (!ok) { return; } /* We use rc or ret for an integer return code */ rc = do_something(); if (rc < 0) { return; }
You can see that we have variables for the return codes and check them with an if-clause. The reason is that it is easy on the eyes and in a debugger you can simply print the return code variable. If you write it like this: if (do_something() < 0) You have a hard time in the debugger to find out the actual return code. You have to step into and through the function to get it. We allow the !ok syntax for bool types, cause ok is by definition in our code a bool.
To be continued ...
Can you guess from the code above what types of variables a, b, c and d are? The answer is no?
The answer is “why should I care?”.
Better yet: why the hell the variables are named “a”, “b” and “c”?
if (c == false)
it’s not clear enough: you should write
if ((c == false) == true)
or, maybe
if (((c == false) == true) == true)
I always wish there were a compiler switch for C and C++ compilers to enforce real boolean conditions.
Then if (!a) would always mean that a is bool, if (pointerVar) would result in a compiler warning or error.
I think the code would be even more readable if the variables were declared as they are used. Then you can also easily check that all variables are initialized before they are used, find which ones are unused, etc.
/* bool return codes should always have the name ‘ok’ or start with ‘is_’ or ‘do_’ */
bool ok = fn_returning_a_bool();
if (!ok) {
return;
}
/* We use rc or ret for an integer return code */
int rc = do_something();
if (rc < 0) {
return;
}
“If you write code the way shown above, you don’t have to scroll up to find out as which type the variable is defined.”
If you have to scroll to find the type of a variable, you have already done two mistakes
1.) The function is too long (there are only very rare exceptions of functions which do not fit on a decent monitor).
2.) You have choosen the wrong IDE/editor, as it lacks a serious feature which helps reducing the error rate.
The fact that C allows for something else than bool in a condition is very unfortunate, thanks for giving a list of all the mess. The other three examples improve the code’s quality, it’s only the bool type, where the extra comparison does more harm than good.
Having “true == false” and “false == false” does not seem to bring any advantage in comparison with “true” and “false”. This is, what “c == false” and “c” have to be translated to by the human who wants to understand the code. If you do not have to translate it, your brain has already learned a pattern, but even than it is easier to learn a simpler pattern.
Code should be easily readable without having to use an IDE. It might be the case that you look at code in a web browser, cause someone posted you a snippet on a pastebin or you should review a patch (gitweb).
I think mixed code is horrible. You’re in the middle of reading code. You wonder what the type of the variable is and you have to look at it. You go to the beginning of the function and you don’t find it there. If variable definitions are all at the same place they are easier to find. If you have to spot them at a random location in the code it drives you crazy after some time.
int rc = do_something();
where is “do_something()” defined?
One thing i hate when browsing a foreign code base is to look at a code line like above,get curious at what the function does and have no idea of where to go look for the function.
It would be nice if the above line would be something like
/*
* do_something() is defined in ../../foo/bar.c
*/
int rc = do_something() ;
With above,i get to know where the function is defined and can easily go to where the function definition to look at its implementation if i care to know about it.
Adding a comment where the function is defined is really superfluous. Most IDEs support it and there are tools like ctags and cscope for exactly this task.
I though the point of the article was to easily parse code without using and IDE or related tools.To dismiss a comment on where a function is defined because “Most IDEs support it and there are tools like ctags and cscope for exactly this task” seem contradicting to me to the general premise of your article
If you are using an IDE,then most IDE will tell you the type of the variable when you mouse over them making your entire post moot.
If you comment each function call with the information where the function is declared the code will be harder to read again.
This is about writing good code not commenting code.
I agree that the four
if
statements are hard to read. But IMHO, the main problem is that the variable namesa, b, c, d
are not very meaningful. If you replace them by something more descriptive, you don’t need “real boolean coditions” to guess what type they have.This is an example, the variable names are irrelevant here. Variable names are a different topic.