I've started learning Python (Python 3.3) and I was trying out the is operator. I tried this:
>>> b = 'is it the space?'
>>> a = 'is it the space?'
>>> a is b
False
>>> c = 'isitthespace'
>>> d = 'isitthespace'
>>> c is d
True
>>> e = 'isitthespace?'
>>> f = 'isitthespace?'
>>> e is f
False
It seems like the space and the question mark make is behave differently. What's going on?
EDIT: I know I should be using ==, I just wanted to know why is behaves like this.
The is operator relies on the id function, whose result is guaranteed to be unique among simultaneously existing objects. Specifically, in CPython id returns the object's memory address. It seems that CPython reuses the same object (and therefore the same memory address) for string literals containing only the characters a-z and A-Z.
However, this seems to only be the case when the string has been assigned to a variable:
Here, the id of "foo" and the id of a are the same. a has been set to "foo" prior to checking the id.
>>> a = "foo"
>>> id(a)
4322269384
>>> id("foo")
4322269384
However, the id of "bar" and the id of a are different when checking the id of "bar" prior to setting a equal to "bar".
>>> id("bar")
4322269224
>>> a = "bar"
>>> id(a)
4322268984
Checking the id of "bar" again after setting a equal to "bar" returns the same id.
>>> id("bar")
4322268984
So it seems that CPython keeps consistent memory addresses for strings containing only a-zA-Z when those strings are assigned to a variable. It's also entirely possible that this is version dependent: I'm running Python 2.7.3 on a MacBook. Others might get entirely different results.
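To see what your own interpreter does, something like the following rough sketch may help. It prints the version and repeats the comparison for the three strings from the question. The use of eval() is my own assumption for mimicking the interactive prompt, where each line is compiled separately; the results of the is checks are deliberately not shown, since they depend on the implementation and version.

```python
import sys

# Report which interpreter produced the results below.
print(sys.version)

for literal in ("'isitthespace'", "'isitthespace?'", "'is it the space?'"):
    # Each eval() call compiles the literal on its own, much like typing it
    # on a separate line at the interactive prompt.
    first = eval(literal)
    second = eval(literal)
    # == compares contents; is compares object identity.
    print(literal, "==:", first == second, "is:", first is second)
```

The == column should always be True; only the is column is expected to vary between interpreters.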
In fact, your code amounts to comparing the objects' ids (in CPython, their memory addresses). So instead of your is comparison:
>>> b = 'is it the space?'
>>> a = 'is it the space?'
>>> a is b
False
You can do:
>>> id(a) == id(b)
False
But note that if the two string literals appear directly in the comparison, instead of going through a and b, the result is True.
>>> id('is it the space?') == id('is it the space?')
True
In fact, within a single expression, identical string literals share one object. But at program scale, sharing only happens for word-like strings (so no spaces or punctuation).
You should not rely on this behavior: it is an implementation detail and is not documented anywhere.
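As a minimal sketch of the point above (the strings are built with str.join purely for illustration, so that they are created at runtime rather than written as literals): == compares the characters and gives a dependable answer, while is always agrees with comparing the ids of the two live objects, whatever that answer happens to be.

```python
# Build two equal strings at runtime instead of writing them as literals.
a = "".join(["is it ", "the space?"])
b = "".join(["is it ", "the space?"])

# == compares the contents, so this does not depend on object sharing.
print(a == b)  # True

# `is` gives the same answer as comparing the ids of the two live objects.
print((a is b) == (id(a) == id(b)))  # True
```

Whether a is b itself comes out True or False here is exactly the implementation detail that should not be relied on.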