UPDATE!
Due to the amount and breadth of feedback I’ve received regarding this post, I had to write a follow-up to clear the air a bit. After reading this post, continue over to meta-Dear NULL.
To not be or not not to be
This is a quick post; I promise it will not take long.
Dear NULL,
You’ve certainly made a mess of things, haven’t you?
I would like to know what you really are, but then again, you really aren’t anything. And at the same time, you’re comparable to anything.
Show a little existentialism and do something with yourself.
Sincerely,
Everyone
I find it funny, in a tragicomic sense, that we must study the semantics of a missing value and how it interacts with an actual value. Although irrelevant for this post, what I find even funnier is that mathematics (category theory, for example) provides type constructors for missing values. As if we could actually amplify nothing!
Please note that I am not making a case against NULL; I’m just making some observations.
Well anyways, here is what I stumbled on today that sparked this post:
bool IsShirtBlue(Shirt s)
{
    return s != null && s.ShirtColor == Shirt.Color.Blue;
}
(I’ve replaced type names to protect myself the innocent)
Given a shirt, I will tell you if it is blue as long as I’ve been given a shirt. Hmmm.
Obviously if the function returns true, you have a blue shirt. What if this function returns false? In that case, your shirt is not blue. Or you do not have a shirt, therefore it cannot possibly be blue. Well which one is it?
Are you laughing yet? If not, answer this question:
Does Bigfoot wear a size 32 men's shoe?
If you’re still not laughing, please remember that you must provide a Yes/No answer, and either answer carries the implication that Bigfoot does exist.
The general problem is that predicates work on subjects. In this example, the subject is Shirt and the predicate is Blue. On the surface it appears we always have a subject, but we do not. We simply have a reference to what could be a shirt. Our subject may not exist, therefore the predicate has nothing to modify (modify here is being used in a grammatical sense).
You might think I’m making a bigger deal out of this than it really is. You might be right. But that does not change the fact that NULL is nothing and anything at the same time. This is a paradox, and one that at least deserves some scrutiny.
The NULL anti-pattern
At this time I would like to introduce the NULL anti-pattern. It is quite simple really: it takes the shape of a function, and it looks something like this:
bool PredicateRegardingArgument(object p)
{
    return p != null && Predicate(p);
}
There are three ingredients in this anti-pattern:
- The function returns a Boolean value.
- The function has some parameter, p, comparable to NULL.
- The function tests some instance value of p.
I call this an anti-pattern because the function makes no distinction between an argument that fails to exhibit some behavior and an argument that has no value capable of exhibiting it. In other words, our predicate may not have a subject to modify.
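To make the ambiguity concrete, here is a minimal sketch built on the IsShirtBlue example. The Shirt class, its Color enum, and the Wardrobe holder class are assumptions filled in for illustration; only the body of IsShirtBlue comes from the snippet shown earlier.

```csharp
using System;

// Hypothetical stand-in for the Shirt type, which the post never shows.
public class Shirt
{
    public enum Color { Red, Blue }

    public Color ShirtColor { get; set; }
}

public static class Wardrobe
{
    // Same body as the earlier snippet: a null check fused with the predicate.
    public static bool IsShirtBlue(Shirt s)
    {
        return s != null && s.ShirtColor == Shirt.Color.Blue;
    }

    public static void Main()
    {
        var red = new Shirt { ShirtColor = Shirt.Color.Red };
        Shirt missing = null;

        // Both calls return false, so the caller cannot tell
        // "not blue" apart from "no shirt at all".
        Console.WriteLine(IsShirtBlue(red));      // False
        Console.WriteLine(IsShirtBlue(missing));  // False
    }
}
```

The two results are indistinguishable to the caller, which is exactly the missing-subject problem described above.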
It is not difficult to imagine a more dangerous scenario:
bool IsFileInUse(string path)
{
    // IsLocked is a hypothetical placeholder for the arbitrary check.
    return path != null && IsLocked(path);
}
(this is just arbitrary code)
When this function returns true it means the file is in use, and therefore must exist. But what does it mean if the function returns false? A file that exists might not be in use. But does a file that does not exist exhibit any type of interaction with its environment? I think not.
You could argue that we might simply skew our perspective a bit and accept that a single Boolean value can carry more meaning than simply true or false. That is too shallow an argument, however, and it certainly is not a solution.
We could fix this problem by introducing another Boolean value, which I will call NULL. Ha, I crack myself up sometimes.
There are two reasonable solutions to this problem, perhaps more. Both solutions are quite obvious and you’ve more than likely thought of them already. The first is a simple exception:
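bool IsShirtBlue(Shirt s)
{
    if (s == null)
    {
        // A missing shirt is an error, not a "false" answer.
        throw new ArgumentNullException(nameof(s));
    }
    // arbitrary; the comparison from the first example is used here
    return s.ShirtColor == Shirt.Color.Blue;
}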
The second is an instance method:
class Shirt
{
    public bool IsShirtBlue()
    {
        // arbitrary; the comparison from the first example is used here
        return ShirtColor == Color.Blue;
    }
}
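Here is a minimal sketch of how the two fixes behave from the caller's side. The Shirt class, its Color enum, and the ShirtChecks holder class are assumptions for illustration, and the "arbitrary" bodies are filled in with the color comparison from the first example.

```csharp
using System;

// Hypothetical Shirt type; the instance method is the second fix.
public class Shirt
{
    public enum Color { Red, Blue }

    public Color ShirtColor { get; set; }

    // No null check needed: the method cannot run without a shirt.
    public bool IsShirtBlue()
    {
        return ShirtColor == Color.Blue;
    }
}

public static class ShirtChecks
{
    // First fix: a missing shirt is an error, not an answer.
    public static bool IsShirtBlue(Shirt s)
    {
        if (s == null)
        {
            throw new ArgumentNullException(nameof(s));
        }
        return s.ShirtColor == Shirt.Color.Blue;
    }

    public static void Main()
    {
        var red = new Shirt { ShirtColor = Shirt.Color.Red };

        // false now always means "there is a shirt and it is not blue".
        Console.WriteLine(ShirtChecks.IsShirtBlue(red)); // False
        Console.WriteLine(red.IsShirtBlue());            // False

        try
        {
            ShirtChecks.IsShirtBlue(null);
        }
        catch (ArgumentNullException e)
        {
            // The missing subject surfaces as an exception.
            Console.WriteLine(e.ParamName); // s
        }
    }
}
```

Calling the instance method on a null reference still fails, with a NullReferenceException, but the failure happens at the call site rather than being folded into a false result.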
I never said this was going to be complicated.
A mountain out of a molehill
I have to imagine that bugs exist due to the NULL anti-pattern. This may be especially true in organizations that routinely use multiple languages, since there is not much consistency in how different languages treat NULL. If you know SQL, you know what I mean.
I might just be at the bottom of the molehill.
To avoid this: Never accept null. Never return null.
To help you: Don't use 'getters' - Tell don't ask, but sometimes ask. In this case shirt.isBlue() might work, or even shirt.isColour(Blue), but not shirt.getColour() == Blue.
shirt.isShirtBlue() is probably missing something?
null in Java or whatever isn't like SQL's poorly implemented version of a three-valued logic system, it's just a pain - so don't use it.
btw - exposing Maps has a lot of the same problems, so it's best not to do that either.
Posted by: James | April 20, 2011 at 01:54 PM
"I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years. In recent years, a number of program analysers like PREfix and PREfast in Microsoft have been used to check references, and give warnings if there is a risk they may be non-null. More recent programming languages like Spec# have introduced declarations for non-null references. This is the solution, which I rejected in 1965." - C.A.R. Hoare
http://qconlondon.com/london-2009/presentation/Null+References:+The+Billion+Dollar+Mistake
Posted by: Saul | April 20, 2011 at 03:16 PM
The problem you have in languages such as Java or C is that there is no type for a reference which cannot hold null. Various languages have attempted to address this: Scala has Option types (which are a watered-down approach); Nice has better option types [1]; Fantom has nullable types [2]; Whiley has union types [3]; etc.
[1] http://nice.sourceforge.net/manual.html#id2536072
[2] http://fantom.org/sidewalk/topic/369
[3] http://whiley.org/guide/typing/flow-typing
Posted by: Dave | April 20, 2011 at 04:32 PM
@James - "Never accept null." Ha ha, nice one. "Tell don't ask, but sometimes ask." Seriously?
@Dave - C# also has nullable types, and they may prove useful in this scenario.
Posted by: Patrick Dewane | April 20, 2011 at 07:45 PM
How can a function "not accept" null, especially in languages like Oracle SQL where an empty string is effectively identical to null (go ahead, try to insert a zero length string and see what you get)? Seriously, refusal to process nulls means what? Throw an exception or (more likely) simply output another null?
Posted by: Allan Peda | April 20, 2011 at 07:56 PM
Just use something like Haskell's Maybe, or Scala's Option (which works similarly although isn't as great).
You use a type constructor Maybe and wrap it around your actual type. For example, 'Maybe Bool'. Then, you can either return 'Just True', 'Just False', or 'Nothing'. Maybe is a monad and an applicative functor, so you can do something like '(isFileInUse path >> useFile path) [less than sign]|[greater than sign] fail "file does not exist"'
You might want to fix your HTML stripper, by the way, there seems to be no way to escape less than and greater than signs.
Posted by: Devyn Cairns | April 20, 2011 at 08:18 PM
@Patrick - @Allan - Never accept null - If in your program you decide that you simply won't use nulls across class boundaries, then the problem kind of goes away. If you simply don't null check, then as soon as you get a null, it means a software defect, i.e. you don't accept them. As soon as you pass a null to a method, it will fail with an NPE, but that's OK, because it now means a real bug. You now need to track down where the null came from, and make sure that whoever passed that null passes an object instead, or maybe refactor your inter-object collaborations a bit.
Tell don't ask, but sometimes ask. I'm assuming you know of 'tell don't ask', if you are writing about OO! I gave an example of where you might want to ask... hence sometimes ask! - read http://pragprog.com/articles/tell-dont-ask http://www.growing-object-oriented-software.com/toc.html
@Allan - If you read what I wrote, then you will see that I said 'never return null', so not sure how you can interpret that to be 'simply output another null'.
There are lots of patterns you can use to help you, including polymorphism, 'tell don't ask', null object, and other stuff that mean this is totally doable.
Posted by: James | April 20, 2011 at 08:18 PM
@James - Please do not confuse my satire for anything more or less than just that. Your "but sometimes ask" comment was just funny when preceded by "don't ask", funny in a grammatical sense. Check out my follow-up post (I've updated this post with a link). As far as "this is totally doable", I'm afraid you may have missed my point. I just found something I thought was funny and thought I would write about what I saw. Call it "observational humor". I appreciate your input!
Posted by: Patrick Dewane | April 20, 2011 at 08:45 PM
You're making things too complicated, for what reason I know not, perhaps to have something to write about.
For example, the first function IsShirtBlue you chose to interpret as "this is a shirt, is it blue?", and based on this interpretation you chose to discredit the notion of a NULL variable when it was supplied as an argument to the function accepting a shirt.
If you chose to interpret that function as "is this a blue shirt?" then your entire argument is moot. Supplying NULL would return false since it is not even a shirt, and if it is a shirt it would return whether it is blue or not.
When you construct a function you also design the contract under which that function is used. If you imply that it can not accept a NULL value because the function then wouldn't make sense, then don't blame NULL when you break that contract.
Posted by: Lukasz | April 21, 2011 at 11:32 PM
@Lukasz - Your concept of a function's contract is wrong. The contract represented by IsShirtBlue is (Shirt) -> (Bool). The name of the function is irrelevant. Supplying NULL to this function does not break its contract and that point was never made nor implied in my post.
The point is "IsShirtBlue" (or if you would prefer "IsThisABlueShirt", or if I prefer "ThisDoesNotMatter") is a vacuous truth when given a NULL shirt. In this case a return value of "false" will carry ambiguity since it does not differentiate between NULL and a shirt that is not blue.
Also, your implication about me making things complicated to have something to write about is quite wrong.
Posted by: Patrick Dewane | April 22, 2011 at 07:26 AM