Numerical error when converting array to list


I'm doing quite a bit of scientific numerical integration in Python, using NumPy and ode. I use several arrays, and I wanted to turn a 1d array into a list for exporting and easier manipulation. I've since found easier and more Pythonic methods that don't resort to lists, but before that I stumbled across this odd, unwanted behaviour that I could not find an explanation for. I've snipped the arrays in the results, but this is the entirety of the code.

If I create a range array using arange

import numpy
a = numpy.arange(1, 20, 0.2)

the result is

array([ 1. , 1.2, 1.4, 1.6, ... , 19.4, 19.6, 19.8])

But if one converts it with the list() built-in, as in

list(a)

suddenly this turns into

[1.0, 1.2, 1.3999999999999999, 1.5999999999999999, ..., 19.399999999999995, 19.599999999999994, 19.799999999999997]

This is a huge error. I'm not a specialist, so I can't tell exactly how it happens. I've shown this to a person who is very knowledgeable about how computers handle numbers, and they told me it's quite curious but could not immediately identify the specific problem, beyond saying:

Well, it's not really an error, because 0.2 can't be represented exactly, but is almost certainly not intended! All that is the case is that those two methods are not last-bit compatible

I've searched the Internet without much success, both for this exact error and for keywords extracted from the above explanation, so I've decided to ask here:

Has anyone stumbled on this sort of error before? Does anyone have an in-depth explanation of what causes it?

I'm using Python 2.7.9, inside an Anaconda environment on a CentOS 6.6 machine.

Edit: Obviously Stack Exchange is more clever than I am and suggested a relevant question only after I'd written down the whole thing: "Python - Converting an array to a list causes values to change".


There are 2 best solutions below

1
On BEST ANSWER

No numerical errors are introduced when you convert the array to a list; the only difference is in how the floating-point values are displayed for lists and for arrays.

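Calling list(a) gives you a list of NumPy float64 scalars (not Python float objects). When the shell displays that list it shows the repr of each element, which here prints up to 17 significant digits. NumPy arrays, by default, display floats with only 8 digits of precision and drop trailing zeros, so a value such as 1.5999999999999999 is shown as 1.6.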

If you set the precision of NumPy arrays higher, you'll see the same values as you get in your list:

>>> import numpy as np
>>> np.set_printoptions(precision=16)
>>> a = np.arange(1, 20, 0.2)
>>> a[:10]
array([ 1.                ,  1.2               ,  1.3999999999999999,
        1.5999999999999999,  1.7999999999999998,  1.9999999999999998,
        2.1999999999999997,  2.3999999999999995,  2.5999999999999996,
        2.7999999999999998])

>>> list(a[:10])
[1.0,
 1.2,
 1.3999999999999999,
 1.5999999999999999,
 1.7999999999999998,
 1.9999999999999998,
 2.1999999999999997,
 2.3999999999999995,
 2.5999999999999996,
 2.7999999999999998]

It's a similar story with a.tolist(). Here, each NumPy float64 is converted to a Python float object for the list (float objects are C doubles internally, just like NumPy's float64). Both types store the values to the same precision and have the same representation quirks.
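As a quick check of the point above, here is a minimal sketch (the variable names are only illustrative) that converts the array both ways and compares the stored values rather than their printed form. It assumes nothing beyond NumPy itself:

```python
import numpy as np

a = np.arange(1, 20, 0.2)

as_scalars = list(a)    # elements are NumPy float64 scalars
as_floats = a.tolist()  # elements are built-in Python floats

print(type(as_scalars[3]))
print(type(as_floats[3]))

# Compare the stored values exactly, element by element.
# If either conversion had rounded anything, this would be False.
unchanged = all(x == y == z for x, y, z in zip(a, as_scalars, as_floats))
print(unchanged)
```

If the comparison comes out True, the only thing that differs between the array and the two lists is how many digits get printed.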


Incidentally, it's worth mentioning linspace for generating these types of ranges. You'll still see "inaccuracies" in the float representations, but unlike arange you can be sure that this function returns the endpoint exactly:

>>> np.linspace(1, 20, 96)
array([  1.                ,   1.2               ,   1.3999999999999999,
         1.6000000000000001,   1.8               ,   2.                ,
         ...
         19.                ,  19.1999999999999993,  19.4000000000000021,
         19.6000000000000014,  19.8000000000000007,  20.                ])
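To make the endpoint difference concrete, here is a small sketch comparing the two functions on the same range. The `num` calculation is my own illustrative way of deriving the point count from the step, not something taken from the question:

```python
import numpy as np

start, stop, step = 1, 20, 0.2

# Illustrative helper value: points needed to cover [start, stop]
# with this step, counting both endpoints.
num = int(round((stop - start) / step)) + 1

stepped = np.arange(start, stop, step)  # half-open interval: stop is excluded
spaced = np.linspace(start, stop, num)  # closed interval: stop is included

print(len(stepped), stepped[-1])
print(len(spaced), spaced[-1])

# linspace places the last sample on stop itself.
print(spaced[-1] == stop)
```

With linspace you choose the number of samples and the step is derived from it, whereas with arange you choose the step and the number of samples depends on a floating-point division.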
0
On

If you print the individual contents of the array using repr you will see the exact same output:

import numpy

a = numpy.arange(1, 20, 0.2)

print(repr(a[2]))
1.3999999999999999

Or use an ipython shell:

In [2]: a = numpy.arange(1, 20, 0.2)

In [3]: a[3]
Out[3]: 1.5999999999999999

In [4]: a[2]
Out[4]: 1.3999999999999999

In [5]: a[4]
Out[5]: 1.7999999999999998

A Python list displays the repr of each object stored in it, whereas NumPy, like print in Python, formats its output to a limited precision.
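The effect of display precision can be reproduced without relying on any particular repr implementation. The sketch below formats one element twice with plain printf-style formatting: 8 significant digits (an approximation of NumPy's default array display, not its actual code path) and 17 significant digits (enough to round-trip any double):

```python
import numpy as np

a = np.arange(1, 20, 0.2)
x = float(a[3])  # the element shown as 1.5999999999999999 above

short = "%.8g" % x   # 8 significant digits, trailing zeros dropped
full = "%.17g" % x   # 17 significant digits

print(short)
print(full)

# Parsing the 17-digit string gives back exactly the same double,
# so the long form is the stored value, not an error added on conversion.
print(float(full) == x)
```

Both strings describe the same number in memory; only the number of digits requested differs.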