The Python Tourist #4: None, empty, and nothing.
In learning Python, I had read in several places that you should always use a test like "if obj is None" if you wanted to check for the None value. For some reason, I tend to ignore blanket statements that are presented without supporting rationale. If the underlying rationale isn't stated, I generally assume it's some sort of esoteric thing that doesn't really matter. Only after it bites me and I can understand the logic will I pay attention.
Here are a couple of cases where not explicitly testing for None has gotten me into trouble. Maybe these can help someone else avoid the same headaches.
Look at this sample:
Simple parsing function
    # check_format() and parse_line() are assumed to be defined elsewhere
    def parse_file(filename):
        """
        Parse a file, returning a list of tags.
        Returns None on error.
        """
        f = open(filename, 'r')
        if not check_format(f):
            return None  # file is wrong format
        tags = []
        for line in f:
            tags.append(parse_line(line))
        return tags

    if parse_file(filename):
        print("Parsed OK!")
    else:
        print("** ERROR **")
Looks correct enough. The test "if parse_file(...)" should be True if I get a list, and the else branch should run if I get None. There is one little problem though. Look at the following snippet:
The boolean value of 'empty'
    if []: print("True")
    if not []: print("False")
This will always print "False". Coming from a C background, I want to think of None as "the absence of something", like a NULL pointer. Unfortunately, Python treats empty containers (lists, tuples, dicts, strings) as false values too. So when parse_file() returns an empty list for an empty file, my test above reports an error. To me, an empty object is still something, as opposed to None, which (I think) should be nothing, so this is confusing.
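A quick way to see which values count as false is to pass them to bool(). The sketch below uses only built-ins and should run on Python 2 or 3; the sample values are arbitrary picks for illustration.

```python
# Values that evaluate to False in a boolean context, alongside a few that don't.
# The sample values are arbitrary illustrations.
samples = [None, [], (), {}, "", 0, [None], [0], "0"]

for value in samples:
    print("%-8r -> %s" % (value, bool(value)))

# An empty list is falsy, but it is still a real object and is not None.
empty = []
print(empty is None)   # False
print(not empty)       # True
```

Note that a list holding a single None or 0 is true: only the container's length matters, not what is inside it.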
I think the thing to do is recognize that this function has three exit states:
- None, indicating an error.
- An empty list, indicating an empty file.
- A non-empty list, holding tags.
The correct test is then:
Explicitly test for None
    tags = parse_file(filename)
    if tags is None:
        print("** ERROR **")
    elif len(tags) == 0:
        print("Empty file")
    else:
        print("OK!")
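To make the difference concrete, here is a minimal sketch that runs both styles of test over the three possible results. The helper names naive_status and explicit_status are made up for this illustration, and the sample tag list is arbitrary.

```python
def naive_status(tags):
    # Truthiness test: cannot tell None from an empty list.
    return "OK" if tags else "ERROR"

def explicit_status(tags):
    # Identity test for None first, then a length check.
    if tags is None:
        return "ERROR"
    elif len(tags) == 0:
        return "EMPTY"
    else:
        return "OK"

for result in (None, [], ["title", "author"]):
    print("%-20r naive=%-5s explicit=%s"
          % (result, naive_status(result), explicit_status(result)))
```

Only the empty list is reported differently: the naive test calls it an error, the explicit test calls it an empty file.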
We can make this worse and give it four exit states, with the same functionality:
Now with four exit states ...
    def parse_file(filename):
        """
        Parse a file, returning a list of tags.
        Returns None on error.
        """
        f = open(filename, 'r')
        if not check_format(f):
            return None  # file is wrong format
        tags = []
        for line in f:
            # look for special end-of-file tag
            if end_of_file(line):
                return tags
            else:
                tags.append(parse_line(line))
Although it looks like the same logic, I've introduced a (sort of) hidden fourth state: if the "end-of-file" tag isn't found, the for loop will exit without returning a value. When a function doesn't return a value, None is returned. For example:
Not returning a value == None
    def foo():
        pass

    print("The value is %s" % foo())
Prints "The value is None".
Of course, the code sample above is buggy: I shouldn't let it fall out of the loop. Once again, my C background gave me a false sense of security. A C compiler will warn you when a routine exits in different ways (with and without a return value), so things like this won't happen if you pay attention to the compiler warnings. Python does no such checking. Falling off the end of a function is legal and simply returns None, and since a function may return values of any type, there is no declared return type to check each branch against.
Anyway, disregarding the buggy code for the moment, recognize that the above function has four distinct exit states:
- None, indicating an error.
- An empty list, indicating an empty file.
- A non-empty list, holding tags.
- None, indicating no return value.
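The four outcomes listed above can be reproduced with a small self-contained sketch. It works on lists of lines instead of real files, and the "#tags" header and "#end" marker are hypothetical stand-ins for whatever check_format() and end_of_file() would actually look for.

```python
FORMAT_HEADER = "#tags"   # hypothetical first line marking a valid file
END_MARKER = "#end"       # hypothetical end-of-file tag

def parse_lines(lines):
    """Parse a sequence of lines, returning a list of tags (None on error)."""
    lines = list(lines)
    if not lines or lines[0] != FORMAT_HEADER:
        return None                  # explicit None: wrong format
    tags = []
    for line in lines[1:]:
        if line == END_MARKER:
            return tags
        tags.append(line.strip())
    # No return here, so falling out of the loop implicitly returns None.

cases = [
    ("wrong format", ["garbage"]),
    ("empty file", ["#tags", "#end"]),
    ("two tags", ["#tags", "a", "b", "#end"]),
    ("missing end tag", ["#tags", "a", "b"]),
]
for label, lines in cases:
    print("%-16s %r" % (label, parse_lines(lines)))
```

In this sketch, a "return tags" after the loop would be enough to close the hidden exit.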
The first and last cases bother me a little bit. I don't like that None can have two meanings:
- The value None.
- The absence of a value.
In my "C thinking" of the first example, I was assuming None meant "the absence of a value", so I was surprised to find that an empty list was (apparently) the same as nothing. Of course, that isn't the case; it's just that an empty list evaluates to the same boolean value as None.
I appreciate that Python is a practical language. An impractical language could "fix" this by forcing you to use only (exactly) True or False in boolean expressions. Python tends to loosen the rules as much as is practical, without going overboard. (Some languages, like Perl, go overboard in their coercion rules, which I think leads to code that is even harder to understand.) I wish that empty lists didn't evaluate to False, but that's the way it is, so you just have to keep it in mind.
NOTE
Normally, if you don't like the way an object behaves, you can subclass it and override the behavior you don't like. For truth testing that is possible, but not very practical. If L is a list, the expression "if L: ..." first looks for a __nonzero__() method (renamed __bool__() in Python 3). The list type doesn't define one, so Python falls back to L.__len__(). An empty list therefore returns 0, which is False. A subclass could define __nonzero__() to change this, but then it would no longer behave like a list in code that relies on "if L: ..." to mean "not empty". There was also a proposal, PEP 335: Overloadable Boolean Operators (since rejected), but even that only covered tests like "if not L: ...", not the plain "if L: ..." case.
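For what it's worth, the data model lets a class supply its own truth value: truth testing looks for a __bool__ method (spelled __nonzero__ in Python 2) before it falls back to __len__. The sketch below uses a hypothetical subclass, AlwaysTrueList, purely to show that lookup order. Whether doing this is wise is another matter, since code that uses "if L:" to mean "not empty" would be surprised.

```python
class AlwaysTrueList(list):
    """Hypothetical list subclass whose truth value ignores its length."""

    def __bool__(self):          # Python 3 hook for truth testing
        return True

    __nonzero__ = __bool__       # Python 2 spells the same hook __nonzero__

plain = []
custom = AlwaysTrueList()

print(bool(plain))     # False: list has no __bool__, so len() == 0 decides
print(bool(custom))    # True: __bool__ is consulted before __len__
print(len(custom))     # 0: the list is still empty
```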
One final note: the correct test for None is "if obj is None", not "if obj == None". The reason not to use == is that an object can define its own __eq__ method, and might implement __eq__ in a way that would cause it to be equal to (even if not the same as) None. The "is" operator means "the same object", so it is the more correct test here.