Robust Clojure: The best way to handle nil
Large Clojure codebases can become nasty, just as in any other dynamic
language. Fortunately, Clojure isn't as problematic as some other
languages because it is partially inspired by ML. True, it doesn't have
static typing, but the way Clojure treats nil allows us to get very
close to the ML way.
1. Maybe
In Haskell and other ML-ish languages, the Maybe type represents
either Nothing or Just <some_value>. This becomes super useful when
you need to check if a thing exists and get the value of that thing at
the same time.
For example, doing this explicitly in Clojure is cumbersome:

    (def data {:a 1 :b 2})

    (if (= nil (:a data))
      0 ; default return value
      (:a data))
Checking for nil comes at a cost, of course. You're accessing the map
twice, and that's a lot of boilerplate code if you'll be doing this
often. In Haskell it's much cleaner:
    -- let's pretend this is a function that maybe returns an Int
    getSomeData :: Maybe Int

    -- call the function and handle the return value
    case getSomeData of
      Nothing -> 0   -- default return value
      Just a  -> a
We can handle both the getting of the value and the returning of the value in one fell swoop.
Clojure has a function that does basically the same thing. The get
function will return nil if a key in a map isn't present:
    (get {:a 1 :b 2} :a) ;=> 1
    (get {:a 1 :b 2} :c) ;=> nil
This is similar to our imaginary getSomeData function in the Haskell
snippet, except the Just a is implicit, so we don't have to extract
the value 1 every time.
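As a minimal sketch of that single-lookup idea, the explicit check from the start of this section can bind the result of get once and test the binding instead of hitting the map twice. The helper name lookup-or-zero is made up for illustration.

```clojure
(def data {:a 1 :b 2})

;; Hypothetical helper: look the key up once, then decide what to return.
(defn lookup-or-zero [m k]
  (let [v (get m k)]   ; one map access; v is either the value or nil
    (if (nil? v)
      0                ; default return value
      v)))

(lookup-or-zero data :a) ;=> 1
(lookup-or-zero data :c) ;=> 0
```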
2. Maybe nil?
Practically speaking, nil is a value in Clojure because you can do
anything with it that you can do with any other value. This idea of
"everything is a value" (commonly expressed as "everything is data")
runs deep in the Lisp tradition and gives Lisp languages a lot of power.
But it also causes problems. Consider:
(+ 1 (get {:a 1 :b 2} :c))
This will evaluate to (+ 1 nil), which is nonsensical and will raise an
error. You can't increase nothing by 1; if you try to, you just end
up with more of nothing!
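To see the failure without crashing a REPL session, here is a small sketch that wraps the addition in try. It assumes Clojure on the JVM, where adding to nil throws a NullPointerException. The helper name add-one-at and the :failed marker are invented for this example.

```clojure
;; Hypothetical helper: returns the sum, or :failed if the addition throws.
(defn add-one-at [m k]
  (try
    (+ 1 (get m k))
    (catch Exception _   ; on the JVM this is a NullPointerException
      :failed)))

(add-one-at {:a 1 :b 2} :a) ;=> 2
(add-one-at {:a 1 :b 2} :c) ;=> :failed
```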
3. The Right Way
The simple fix is to check for nil just like you would check for
Nothing. Clojure provides the if-some macro to make this more concise:
(if-some [it (get {:a 1 :b 2} :c)] (+ 1 it) nil)
Which is more-or-less the same in Haskell:
    case getSomeData of
      Nothing -> Nothing
      Just it -> Just (1 + it)
If you remember to write all of your Clojure code like this, your
codebase will become much more robust to nil-related errors!
To sum up: Always treat nil as if it means Nothing.
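A short sketch of the pattern wrapped in a reusable function, plus one detail worth knowing: if-some tests only for nil, whereas its relative if-let also rejects false. The helper name inc-at is hypothetical.

```clojure
;; Hypothetical helper wrapping the if-some pattern from above.
(defn inc-at [m k]
  (if-some [it (get m k)]
    (+ 1 it)
    nil))

(inc-at {:a 1 :b 2} :a) ;=> 2
(inc-at {:a 1 :b 2} :c) ;=> nil

;; if-some only treats nil as "Nothing"; if-let also rejects false.
(if-some [x false] :present :absent) ;=> :present
(if-let  [x false] :present :absent) ;=> :absent
```

So when false is a legitimate value in your data, if-some is the closer match to checking for Nothing.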
If you're an intermediate Clojure programmer, then you're probably
already familiar with if-let/if-some and perhaps not impressed. The
big idea, however, is the treatment of nil as a type, and not as a
value, which is a subtle but important point.
To avoid these errors once and for all, you need to stop thinking about
nil as a value. Yes, that is how Clojure treats nil, but that
doesn't mean that you, the programmer, must treat it as a value too. If
you come from Java or C, which represent the absence of a value as the
null value, then you'll have to update your mental model.
Realize: the concept of absence refers to a type of thing, not a value.
While you are writing code, you should be thinking, "Is the type of this
thing always an int, or could it be nil?" When doing numerical or
statistical programming, you can probably guarantee that your algorithms
return a number type. However, when you start
working with networking, or databases, or certain Java libraries, you
often lose the guarantee that functions will return concrete values
(network is down, database exploded, etc.), and then you must think
about nil.
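One way to picture this, as a sketch only: answer the "could it be nil?" question once, at the boundary where the uncertain value enters, so the numeric code behind it can assume a number. The names fake-db, find-age and age-next-year are invented, and a plain map stands in for a real database.

```clojure
;; Stand-in for a database: a plain map. A real lookup could also fail
;; for network reasons; this sketch only models "row not found".
(def fake-db {1 {:name "Ada" :age 36}})

;; Hypothetical lookup: returns the age, or nil when the user is unknown.
(defn find-age [db id]
  (get-in db [id :age]))

;; The nil question is answered here, once, before any arithmetic runs.
(defn age-next-year [db id]
  (if-some [age (find-age db id)]
    (inc age)
    nil))

(age-next-year fake-db 1) ;=> 37
(age-next-year fake-db 2) ;=> nil
```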
4. Clojure Idioms and Category Theory
In both Haskell and Clojure, manually checking for nil/Nothing
becomes tedious very fast, especially when you are chaining lots of
functions together. However, both languages have solutions for this:
Haskell has category theory, Clojure has idioms.
In Haskell, the "bind" operator is defined basically like this:
    (>>=) m g = case m of
      Nothing -> Nothing   -- if m is Nothing, just return Nothing
      Just x  -> g x       -- otherwise, call the function g on the extracted value
Extending the above example, we can call getSomeData and increase the
result by 1, but only when it is even, with the following:

    incIfEven :: Int -> Maybe Int
    incIfEven n = if even n then Just (n + 1) else Nothing

    getSomeData >>= incIfEven
Clojure has a similar idiom. We use some->> to thread the map through
the rest of the functions. First extract a key if it exists, then lift
the value into a vector, so we can use all of the collection-related
functions on it. This allows us to filter and map over it to
transform the data as we see fit:

    (some->> {:a 2 :b 3} :a vector (filter even?) (map inc) first) ;;=> 3
    (some->> {:a 2 :b 3} :b vector (filter even?) (map inc) first) ;;=> nil
    (some->> {:a 2 :b 3} :c vector (filter even?) (map inc) first) ;;=> nil
Voilà! You get the compactness of Haskell, without the overhead of category theory :)
I kid! Category theory is great. The "bind" operator (>>=) is very
similar to some->> because they both take a value from one monad and
"shove" it into the next monad. (If you have no idea what a "monad" is,
replace "monad" with "thing" and re-read that sentence.) In Haskell, the
monad is the Maybe type; in Clojure, the monad is implicit in the
collection interface, which is the unifying abstraction in the language.
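For a closer side-by-side with the Haskell bind example, here is a sketch of incIfEven in Clojure where nil plays the role of Nothing, threaded with some-> (the thread-first sibling of some->>). The names inc-if-even and inc-if-even-at are made up for this illustration.

```clojure
;; Clojure counterpart of incIfEven: nil plays the role of Nothing.
(defn inc-if-even [n]
  (when (even? n)
    (inc n)))

;; some-> stops at the first nil, much like >>= stops at Nothing.
(some-> {:a 2 :b 3} :a inc-if-even) ;=> 3
(some-> {:a 2 :b 3} :b inc-if-even) ;=> nil (3 is odd)
(some-> {:a 2 :b 3} :c inc-if-even) ;=> nil (key missing, inc-if-even never runs)

;; Roughly what the some-> form does, written out by hand.
(defn inc-if-even-at [m k]
  (when-some [v (get m k)]
    (inc-if-even v)))
```

The hand-written version makes the short-circuit visible: inc-if-even is only called when the lookup produced something other than nil.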
5. Clojure's Most Under-Appreciated Function
On IRC, technomancy mentioned he was surprised fnil wasn't in this
article. I admit that I completely forgot about fnil, but it's
extremely useful.
fnil can be used in our example above like so:
    (def safe-inc (fnil inc 0))

    (safe-inc (get {:a 1 :b 2} :b)) ;=> 3
    (safe-inc (get {:a 1 :b 2} :c)) ;=> 1
In the above snippet, safe-inc is a function just like (+ 1 x) in
the earlier example, except if x is nil, then safe-inc will use
0 as a default value instead. More (better) examples are available at
ClojureDocs.
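A common place where fnil pays off is together with update, since update passes nil to the function when the key is missing. The following is a small sketch. The helper name count-word is hypothetical.

```clojure
;; Count words: a missing key yields nil, which fnil replaces with 0.
(defn count-word [counts word]
  (update counts word (fnil inc 0)))

(reduce count-word {} ["a" "b" "a"]) ;=> {"a" 2, "b" 1}

;; Same idea with collections: start from [] when nothing is there yet.
(update {} :tags (fnil conj []) :clojure) ;=> {:tags [:clojure]}
```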
fnil isn't talked about much in the Clojure community, but it is a
handy function. Use it whenever you aren't sure if a variable is nil
but you do know what the default value should be. In fact, the entire
problem of nil isn't discussed much at all, but it is a very important
issue, one that the Clojure community should be aware of. Hopefully this
article will at least make you aware of the problems with nil, and
start you down the path of thinking critically about nil on your own.
6. We Still Have Problems
The biggest problem is that this practice of explicit nil handling is a
convention: the only thing enforcing it is your habits, and we all
know that we mere humans are fallible. Haskell's approach to Nothing
is thus superior because the compiler checks your work automatically,
which is nice.
A second problem is nil itself, which is a problem with any
dynamically-typed language. Unforeseen nils can bubble up the stack
and cause a lot of headaches. One solution is to use a monad library
(discussed below), but more often than not, in everyday Clojure code, a
monad library is unnecessary.
The core problem is one of language design. As I said above, Clojure
treats nil as a value, when in reality, the concept of absence refers
to a type: intuitively, we say "absence of a value" just like we say
"an integer of 5". Clojure, as a Lisp, made the choice to keep types an
evaluable construct, so they could be modified at runtime, instead of a
construct of compilation as in Haskell. By choosing Clojure over Haskell,
you are choosing the power of metaprogramming, but with that come the
drawbacks of dynamic typing.
The best solution to this that I've found (in dynamically-typed
languages) is to follow the single-responsibility principle: each
function should do just one thing. Then spec that function and catch the
possible nil-causing inputs with auto-generated tests (this is an
article for another time). If you have other solutions, please email me
and I will add your contribution here :)
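As a rough sketch of what "spec that function" could look like, the following uses clojure.spec.alpha (bundled with Clojure 1.9 and later) to state that an argument and return value may be nil. The function inc-or-nil and the spec name ::maybe-int are made up for illustration, and this is not necessarily the setup the future article would use.

```clojure
(require '[clojure.spec.alpha :as s])

;; "An int, or nothing": nilable makes the possible absence explicit.
(s/def ::maybe-int (s/nilable int?))

;; Hypothetical single-purpose function to be specced.
(defn inc-or-nil [x]
  (if-some [n x]
    (inc n)
    nil))

(s/fdef inc-or-nil
  :args (s/cat :x ::maybe-int)
  :ret  ::maybe-int)

(s/valid? ::maybe-int 1)   ;=> true
(s/valid? ::maybe-int nil) ;=> true
(s/valid? ::maybe-int "1") ;=> false

;; With org.clojure/test.check on the classpath, generated inputs
;; (including nil) can be thrown at the function:
;; (clojure.spec.test.alpha/check `inc-or-nil)
```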
7. Other Solutions and nil-Punning
As described by Skyliner, chaining if-lets together like this is annoying:
    (if-let [x (foo)]
      (if-let [y (bar x)]
        (if-let [z (goo x y)]
          (do (qux x y z)
              (log "it worked")
              true)
          (do (log "goo failed")
              false))
        (do (log "bar failed")
            false))
      (do (log "foo failed")
          false))
It's only mildly less annoying when using cats:
    (require '[cats.core :as m])
    (require '[cats.monad.either :as either])

    @(m/mlet [x (if-let [v (foo)]
                  (either/right v)
                  (either/left))
              y (if-let [v (bar x)]
                  (either/right v)
                  (either/left))
              z (if-let [v (goo x y)]
                  (either/right v)
                  (either/left))]
       (m/return (qux x y z)))
The benefit with cats is you get fine-grained error handling for each
left. Read more about cats error handling and the Either type.
If some-> is out of the question, then personally I prefer the pattern
matching approach:
    (require '[clojure.core.match :refer [match]])

    (match (foo) ;; pretend `foo` is a function that returns a map
      nil (log "foo failed")
      {:ms t :user u :data data} (do (log "User: " u " took " t " seconds.")
                                     data))
The benefit is mostly the same as with if-let, but you can pattern
match on the return value and then jump right into the next function,
which I find myself doing quite a lot.
Of course you can always tighten this up by defining your own version of
"bind" or some-> in Clojure:
    (defn >>= [m g]
      (if-let [x (m)]
        (g x)
        (do (log (str m " failed")) ; m is a function, so `name` would throw on it
            nil)))
This is a (very) naïve implementation, but you get the idea. Modify to fit your use-case.
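One possible variation, sketched here under my own assumptions rather than taken from any library: a bind that takes a plain value instead of a zero-argument function, plus a helper that chains several steps. The names bind and chain are hypothetical, and logging is left out to keep the sketch self-contained.

```clojure
;; Hypothetical value-based bind: apply f only when v is not nil.
(defn bind [v f]
  (when-some [x v]
    (f x)))

;; Hypothetical helper: feed v through each function in turn.
;; Once a step yields nil, every later step is skipped by bind.
(defn chain [v & fs]
  (reduce bind v fs))

(chain {:a 1 :b 2} :a inc) ;=> 2
(chain {:a 1 :b 2} :c inc) ;=> nil
```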
On Reddit, tolitius suggested the use of get's optional third argument (which
I had forgotten about!) and or:

    get has a default value built in:

        user=> (get {:a 1 :b 2} :b)
        2
        user=> (get {:a 1 :b 2} :c 0)
        0

    hence

        user=> (-> (get {:a 1 :b 2} :c 0) inc)
        1

    In case this is a single op, such as inc, this would work as well:

        user=> (-> (or nil 41) inc)
        42
        user=> (-> :c {:a 1 :b 2} (or 41) inc)
        42

    i.e. or is really handy for default values
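Two edge cases are worth keeping in mind with these defaults, shown in the sketch below: the default passed to get applies only when the key is missing, not when the key is present with a nil value, and the or macro replaces false as well as nil. The helper get-some is a made-up name for a lookup that defaults only on nil.

```clojure
;; get's default only applies when the key is missing...
(get {:a nil} :a 0) ;=> nil (key present, value is nil)
(get {}       :a 0) ;=> 0

;; ...while or replaces any falsey value, including false.
(or (get {:a false} :a) 41) ;=> 41
(or (get {:a nil}   :a) 41) ;=> 41

;; Hypothetical helper: default only when the looked-up value is nil.
(defn get-some [m k default]
  (if-some [v (get m k)]
    v
    default))

(get-some {:a false} :a 41) ;=> false
(get-some {:a nil}   :a 41) ;=> 41
```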
Over at Lispcast, Eric Normand argues for the "nil-punning" approach, which is
fine. But I think this approach requires a confused notion of what
nil/Nothing actually means. According to Eric, nil is a type, a value, a
key in a map, a boolean, an empty seq. It seems to me that "nil-punning" is
really just "nil-confusion". It is much simpler to understand nil as
Nothing, i.e. the absence of a value (which is a type). That said,
nil-punning in practice ends up mostly the same as what I describe above,
so either technique will work.