Understanding difference in behavior of square brackets in NumPy 2D array sub-setting


I am new to Python and learning it from the basics. I have a 2D array (npb):

import numpy as np

npb = np.array([[1,2],
                [3,4],
                [5,6],
                [7,8]])

When I subset it normally (without a colon), I get:

Input:       nph=np.array(npb[0][1])    
Output:      2

Input:       nph=np.array(npb[0 ,1])   
Output:      2

but when I do it with a colon, I get:

Input:       nph=np.array(npb[:][1])
Output:      3 ,4

Input:       nph=np.array(npb[: ,1])          
Output:      2 ,4, 6 ,8

i.e., [0][1] and [0,1] give the same result, whereas [:][1] and [:,1] do not. Why?

Leo K On BEST ANSWER

The two ways of indexing look similar but are fundamentally different, even though they produce the same result when addressing a single element of the array.

npb[x][y] is interpreted by Python as (npb[x])[y], that is: get element x from npb, then get element y from the result. So, with npb[0][1]: npb[0] is the row [1,2], and element 1 of that row is 2. Here, you're treating npb much like a list of lists. With npb[:][1], Python sees (npb[:])[1], so: npb[:] selects the whole array (for a NumPy array this is a view of npb, whereas for a plain list it would be a shallow copy), and [1] of that is the 2nd row, [3,4].
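To make the two-step reading concrete, here is a small sketch that splits each chained index into its separate steps. The intermediate names (row0, whole, second_row) are only illustrative and are not part of the question.

```python
import numpy as np

npb = np.array([[1, 2],
                [3, 4],
                [5, 6],
                [7, 8]])

# npb[0][1] is two separate indexing operations
row0 = npb[0]          # step 1: row 0, i.e. array([1, 2])
elem = row0[1]         # step 2: element 1 of that row, i.e. 2

# npb[:][1] is also two operations, but step 1 selects every row
whole = npb[:]         # step 1: all rows (a view onto npb's data)
second_row = whole[1]  # step 2: row 1 of the whole array, i.e. array([3, 4])

print(elem)
print(second_row)
print(np.shares_memory(whole, npb))  # whether the slice shares data with npb
```

Since the [:] step changes nothing about which rows are present, npb[:][1] ends up being just another way of writing npb[1].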

npb[x,y] is a special selector supported by NumPy arrays (and by similar objects, e.g. pandas DataFrames via .iloc). Python reads it as: get the tuple (x,y) from npb, where x says which row(s) to get and y which column(s). Such a tuple index isn't valid for built-in Python sequences such as lists; it works only on objects that are specially made to handle it, like numpy.ndarray. Now (0,1) means row 0, column 1, which happens to be the same as npb[0][1], that is 'element 1 from npb[0]', because indexing a 2-D array with a single integer returns that row. However, (:,1) means all rows, column 1, which is not the same as 'element 1 from npb[:]' that you get with npb[:][1].
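As a rough illustration of the tuple reading, the sketch below writes the index tuples out explicitly; slice(None) is what a bare colon stands for inside the brackets. The variable names and the plain-list comparison are additions for illustration only.

```python
import numpy as np

npb = np.array([[1, 2],
                [3, 4],
                [5, 6],
                [7, 8]])

# npb[0, 1] hands the single tuple (0, 1) to the array
a = npb[0, 1]
b = npb[(0, 1)]                       # same thing, tuple written explicitly

# npb[:, 1] hands over (slice(None), 1): all rows, column 1
col = npb[:, 1]
col_explicit = npb[(slice(None), 1)]

# A plain list of lists does not accept a tuple index
plain = [[1, 2], [3, 4]]
try:
    plain[0, 1]
    list_accepts_tuple = True
except TypeError:
    list_accepts_tuple = False

print(a, b)
print(col)
print(list_accepts_tuple)
```

In both NumPy cases a single indexing operation sees the row and column selectors together, which is why the colon in [:,1] can mean "every row" while still picking column 1.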