Integer and String identity

suggest change

Python uses internal caching for a range of integers to reduce unnecessary overhead from their repeated creation.

In effect, this can lead to confusing behavior when comparing integer identities:

>>> -8 is (-7 - 1)
False
>>> -3 is (-2 - 1)
True

and, using another example:

>>> (255 + 1) is (255 + 1)
True
>>> (256 + 1) is (256 + 1)
False

Wait what?

We can see that the identity operation is yields True for some integers (-3, 256) but no for others (-8, 257).

To be more specific, integers in the range [-5, 256] are internally cached during interpreter startup and are only created once. As such, they are identical and comparing their identities with is yields True; integers outside this range are (usually) created on-the-fly and their identities compare to False.

This is a common pitfall since this is a common range for testing, but often enough, the code fails in the later staging process (or worse - production) with no apparent reason after working perfectly in development.

The solution is to always compare values using the equality (==) operator and not the identity (is) operator.


Python also keeps references to commonly used strings and can result in similarly confusing behavior when comparing identities (i.e. using is) of strings.

>>> 'python' is 'py' + 'thon'
True

The string 'python' is commonly used, so Python has one object that all references to the string 'python' use.

For uncommon strings, comparing identity fails even when the strings are equal.

>>> 'this is not a common string' is 'this is not' + ' a common string'
False
>>> 'this is not a common string' == 'this is not' + ' a common string'
True

So, just like the rule for Integers, always compare string values using the equality (==) operator and not the identity (is) operator.

Feedback about page:

Feedback:
Optional: your email if you want me to get back to you:


Common Pitfalls:
* Integer and String identity

Table Of Contents
2 Filter
3 List
7 Loops
22 Reduce
27 Classes
31 Set
42 Tuple
45 Enum
62 Sockets
89 urllib
92 Idioms
100 Common Pitfalls
104 Stack
105 Profiling
109 Logging
111 os module
118 Mixins
120 ArcPy
126 Arrays
132 2to3 tool
135 Unicode
138 Neo4j
140 Curses
141 Templates
145 heapq
146 tkinter
154 Audio
155 pyglet
157 ijson
160 Flask
161 Groupby
163 pygame
165 hashlib
166 Gzip
167 ctypes
185 pyaudio
186 shelve