NumPy

NumPy provides multi-dimensional array objects, which are useful for scientific computing. In NumPy:

  • rank refers to the number of dimensions of the array
  • shape is a tuple of integers providing the size of the array in each dimension

np.array([1,2,3]) is a NumPy array of rank 1 and shape (3,).

np.array([[1,2], [4,5], [7,8]]) is a NumPy array of rank 2 and shape (3, 2): three rows and two columns.
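
As a quick check of the two definitions, the sketch below builds both arrays and reads their `ndim` (rank) and `shape` attributes. The variable names `v` and `m` are only illustrative.

```python
import numpy as np

v = np.array([1, 2, 3])                  # one-dimensional array
m = np.array([[1, 2], [4, 5], [7, 8]])   # three rows, two columns

print(v.ndim, v.shape)   # 1 (3,)
print(m.ndim, m.shape)   # 2 (3, 2)
```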

Creating a NumPy array

The following cells show different ways of creating NumPy arrays.

In [1]:
import numpy as np
a = np.array([1,2,3])
print(a)
print(a.shape)
[1 2 3]
(3,)
In [2]:
b = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print(b)
print(b.shape)
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
(3, 4)
In [3]:
c = np.array([[1, 2, 3]]) # Note the extra [] 
print(c.shape)
(1, 3)

NumPy supports far more numerical types (15) than standard Python. Integer types include int8, int16, int32, and int64, and floating-point types include float16, float32, and float64. The dtype attribute gives the data type of an array. The default integer type depends on the platform and NumPy version, so the first output below may be int32 or int64.

In [4]:
a = np.array([8, 10, 12])
print(a.dtype)  
a1 = np.array([8.0, 10, 12])
print(a1.dtype)  
int32
float64
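
When the default type is not what you want, the type can be requested explicitly. The following is a minimal sketch using the `dtype` argument of `np.array` and the `astype` method; the array names are illustrative.

```python
import numpy as np

a = np.array([8, 10, 12], dtype=np.int64)   # request 64-bit integers explicitly
b = a.astype(np.float32)                    # astype returns a converted copy

print(a.dtype)      # int64
print(b.dtype)      # float32
print(b.itemsize)   # bytes per element: 4
```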

Like Python lists, NumPy arrays are indexed from zero. For multi-dimensional arrays, give one index per dimension, separated by commas, as in b[0,1]. A single index such as b[0] selects a whole row.

In [5]:
a = np.array([8, 10, 12])
a[1] = 11
print(a)
[ 8 11 12]
In [6]:
b = np.array([[8, 10, 12], [20, 22, 24]]) 
b[0,1] = 11 
print(b)
[[ 8 11 12]
 [20 22 24]]
In [7]:
print(b[0,1])
print(b[0])
11
[ 8 11 12]
In [8]:
c = np.array([[8, 10, 12]])
print(c[0,1]) # c has shape (1, 3), so two indices are needed
10

Creating Standard Arrays

In numpy, there are several functions available to create standard arrays.

In [9]:
c = np.zeros((2,3)) #Creates a (2,3) array of all zeros
print(c)
[[ 0.  0.  0.]
 [ 0.  0.  0.]]
In [10]:
d = np.ones((2,3)) #Creates a (2,3) array of all ones
print(d)
[[ 1.  1.  1.]
 [ 1.  1.  1.]]
In [11]:
e = np.eye(3) # Creates a (3,3) identity matrix
print(e)
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
In [12]:
a = np.random.random((3,3)) # Creates a (3,3) matrix of random numbers
print(a)
[[ 0.30828292  0.26595404  0.23381036]
 [ 0.04757049  0.96139397  0.27113196]
 [ 0.29115244  0.85000192  0.34198764]]

The arange() function is similar to Python's built-in range() function, but it returns an array. reshape() and flatten() are other useful functions for adjusting array dimensions.

In [13]:
a = np.arange(10) 
print(a)
[0 1 2 3 4 5 6 7 8 9]

Many operations are available both as a NumPy function and as an array method. The reshape function below demonstrates both forms.

In [14]:
b = np.reshape(a, (2,5))
print(b)
[[0 1 2 3 4]
 [5 6 7 8 9]]
In [15]:
c = a.reshape(2,5)
print(c)
[[0 1 2 3 4]
 [5 6 7 8 9]]
In [16]:
d = b.flatten() 
print(d)
[0 1 2 3 4 5 6 7 8 9]
In [17]:
f = np.arange(20).reshape(4,5)
print(f)
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
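
One convenience worth knowing when reshaping: passing -1 for one dimension asks NumPy to infer its size from the total number of elements. The sketch below also confirms that flatten() returns a copy rather than a view. Variable names are illustrative.

```python
import numpy as np

a = np.arange(12)
b = a.reshape(3, -1)   # -1 lets NumPy infer this dimension (4 here)
c = b.reshape(-1)      # back to one dimension
d = b.flatten()        # flatten always returns a copy

print(b.shape)                  # (3, 4)
print(c.shape)                  # (12,)
print(np.shares_memory(d, b))   # False
```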

Transpose

Transpose reverses the order of the array's axes, so a (2,5) array becomes a (5,2) array. For arrays with more than two dimensions, you can pass a permutation of the axes to choose a different order. There are multiple ways to apply the transpose operation.

In [18]:
a = np.arange(10).reshape((2,5))
print(a)
b = np.transpose(a) 
print(b)
[[0 1 2 3 4]
 [5 6 7 8 9]]
[[0 5]
 [1 6]
 [2 7]
 [3 8]
 [4 9]]
In [19]:
c = a.transpose()
print(c)
[[0 5]
 [1 6]
 [2 7]
 [3 8]
 [4 9]]
In [20]:
d = a.T
print(d)
[[0 5]
 [1 6]
 [2 7]
 [3 8]
 [4 9]]
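
For arrays with more than two dimensions, the axis permutation can be passed to transpose. The following is a small sketch on an illustrative (2, 3, 4) array.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
b = a.transpose(1, 0, 2)   # new axis order: old axis 1, old axis 0, old axis 2
c = np.transpose(a)        # no axes given: the axis order is reversed

print(b.shape)                    # (3, 2, 4)
print(c.shape)                    # (4, 3, 2)
print(a[1, 2, 3] == b[2, 1, 3])   # True: same element, indices swapped
```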

Slicing

Subarrays can be created using the array[start:stop:step] format. The element at index stop is not included in the result. Any of start, stop, or step can be left out.

In [21]:
a = np.arange(10) 
print(a[3:])
[3 4 5 6 7 8 9]
In [22]:
print(a[:2])
[0 1]
In [23]:
print(a[2:8:2])
[2 4 6]
In [24]:
print(a[3::-1])
[3 2 1 0]

Multi-dimensional slicing can be done using array[row slice, column slice] format.

In [25]:
a = np.arange(12) 
b = a.reshape(3,4)
print(b[1,:]) # prints second row
[4 5 6 7]
In [26]:
print(b[:,1:3]) # prints second and third columns
[[ 1  2]
 [ 5  6]
 [ 9 10]]
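
Steps and negative indices work in each dimension of a multi-dimensional slice as well. A brief sketch on the same (3,4) array:

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

print(b[::2, :])    # every other row: rows 0 and 2
print(b[:, ::-1])   # all rows, columns in reverse order
print(b[-1, 1:3])   # last row, second and third columns: [ 9 10]
```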

View vs Copy

Slicing returns views, not copies, and plain assignment only binds a second name to the same array. Use the copy() method to create an independent copy. Observe carefully the output of the following code.

In [27]:
a = np.arange(12) 
b = a.reshape(3,4)
print("b=", b)
c = b # using assignment operator
d = b[1,:] # using slicing 
e = b.copy() # using the copy command
b[1,0] = 14
print("b=", b)
print("c=", c)
print("d=", d)
print("e=", e)
b= [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
b= [[ 0  1  2  3]
 [14  5  6  7]
 [ 8  9 10 11]]
c= [[ 0  1  2  3]
 [14  5  6  7]
 [ 8  9 10 11]]
d= [14  5  6  7]
e= [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

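The next cell repeats the experiment but modifies the array through c instead of b. Because c and b are the same object, the change again shows up in b, c, and d, but not in e.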

In [28]:
a = np.arange(12) 
b = a.reshape(3,4)
print("b=", b)
c = b # using assignment operator
d = b[1,:] # using slicing 
e = b.copy() # using the copy command
c[1,0] = 15
print("b=", b)
print("c=", c)
print("d=", d)
print("e=", e)
b= [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
b= [[ 0  1  2  3]
 [15  5  6  7]
 [ 8  9 10 11]]
c= [[ 0  1  2  3]
 [15  5  6  7]
 [ 8  9 10 11]]
d= [15  5  6  7]
e= [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
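
Rather than inferring the relationship from printed output, it can also be checked directly. The sketch below uses the `is` operator and `np.shares_memory`; the array names mirror the cells above.

```python
import numpy as np

b = np.arange(12).reshape(3, 4)
c = b          # same object under a second name
d = b[1, :]    # view into the second row of b
e = b.copy()   # independent copy

print(c is b)                   # True
print(np.shares_memory(d, b))   # True
print(np.shares_memory(e, b))   # False

d[0] = 99        # writing through the view changes b as well
print(b[1, 0])   # 99
print(e[1, 0])   # 4
```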

Array Operation: Concatenate

The concatenate function joins arrays along a chosen dimension. The axis parameter controls the direction of concatenation, with default axis = 0. Other useful functions are append, vstack, hstack, dstack, and tile. Observe carefully the output of the following code.

In [29]:
a = np.arange(6).reshape((2,3))
print("a = " , a)
print("a.shape = ", a.shape)
a =  [[0 1 2]
 [3 4 5]]
a.shape =  (2, 3)
In [30]:
b = np.array([[4,5]])
print("b.shape = ", b.shape)
b.shape =  (1, 2)
In [31]:
c = np.array([[6,7,8]])
print("c.shape = ", c.shape)
c.shape =  (1, 3)
In [32]:
d = np.concatenate((a,b.T), axis=1)
print("d = ", d)
d =  [[0 1 2 4]
 [3 4 5 5]]
In [33]:
e = np.concatenate((a,c))
print(e)
[[0 1 2]
 [3 4 5]
 [6 7 8]]
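
The other functions mentioned above follow the same pattern. The following is a minimal sketch of vstack, hstack, tile, and append on small illustrative arrays.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))
c = np.array([[6, 7, 8]])

v = np.vstack((a, c))         # stack rows, like concatenate with axis=0
h = np.hstack((a, a))         # stack columns, like concatenate with axis=1
t = np.tile(c, (2, 1))        # repeat c twice along the rows
p = np.append(a, c, axis=0)   # append with an explicit axis

print(v.shape)   # (3, 3)
print(h.shape)   # (2, 6)
print(t.shape)   # (2, 3)
```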

Mathematical Operations

Note that in NumPy, the standard arithmetic operators work element-wise and are not matrix operations as in MATLAB.

In [34]:
g = np.array([[1,2,3],[4,5,6],[7,8,9]])
In [35]:
h = np.array([[21,22,23],[24,25,26],[27,28,29]])
In [36]:
h+g
Out[36]:
array([[22, 24, 26],
       [28, 30, 32],
       [34, 36, 38]])
In [37]:
h-g
Out[37]:
array([[20, 20, 20],
       [20, 20, 20],
       [20, 20, 20]])
In [38]:
h*g
Out[38]:
array([[ 21,  44,  69],
       [ 96, 125, 156],
       [189, 224, 261]])
In [39]:
h/g
Out[39]:
array([[ 21.        ,  11.        ,   7.66666667],
       [  6.        ,   5.        ,   4.33333333],
       [  3.85714286,   3.5       ,   3.22222222]])

Use the dot function for matrix multiplication of two-dimensional arrays. For one-dimensional arrays, dot computes the inner product. Note the difference between the dot product and element-wise multiplication.

In [40]:
i = np.ones((3,3))
j = np.array([1,2,3])
print(i*j)
[[ 1.  2.  3.]
 [ 1.  2.  3.]
 [ 1.  2.  3.]]
In [41]:
print(i.dot(j))
[ 6.  6.  6.]
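
In Python 3.5 and later, the `@` operator is another way to write a matrix product. The sketch below contrasts it with `*` on two small illustrative matrices.

```python
import numpy as np

g = np.array([[1, 2], [3, 4]])
h = np.array([[0, 1], [1, 0]])

print(g * h)   # element-wise product
print(g @ h)   # matrix product, same result as g.dot(h)
```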

When array shapes do not match, NumPy often expands the smaller array to match the larger one. This is called broadcasting. Check the NumPy broadcasting rules if this feature is useful. Observe carefully the output of the following code.

In [42]:
a = np.array([1,2,3])
b = np.array([5])
print(a+b) 
[6 7 8]
In [43]:
a = np.array([[1,2,3],[11, 12, 13], [21, 22, 23]])
b = np.array([1, 0 , 0])
print(a + b) # 1 is added to first column
[[ 2  2  3]
 [12 12 13]
 [22 22 23]]
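
Broadcasting depends on the shapes involved. As a sketch, a column of shape (3, 1) is spread across the columns instead of the rows, and shapes that cannot be matched raise a ValueError. The name `col` is illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [11, 12, 13], [21, 22, 23]])
col = np.array([[1], [0], [0]])   # shape (3, 1): one value per row

print(a + col)   # 1 is added to the first row

try:
    a + np.array([1, 2])   # shapes (3, 3) and (2,) are incompatible
except ValueError as err:
    print("broadcast failed:", err)
```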

Numpy offers several useful mathematical functions.

In [44]:
a = np.array([[1,2],[3,4]])
a.sum()     
Out[44]:
10
In [45]:
a.mean()
Out[45]:
2.5
In [46]:
a.sum(axis=1)  
Out[46]:
array([3, 7])
In [47]:
print(np.sqrt(a))
[[ 1.          1.41421356]
 [ 1.73205081  2.        ]]
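
The axis argument works the same way for other reductions. A short sketch with a few more functions on the same array:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

print(a.sum(axis=0))     # column sums: [4 6]
print(a.max(axis=1))     # largest entry in each row: [2 4]
print(a.argmax())        # index of the largest entry in the flattened array: 3
print(np.exp(a).shape)   # element-wise functions keep the shape: (2, 2)
```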

Vectorization

The biggest advantage of NumPy is vectorization, which lets you write compact code that is often also more computationally efficient. Suppose your goal is to compute the following sum:

$$ t = \sum_{x=0}^{10} 0.1x e^{0.1x} $$

In [48]:
import math as mt 
t = 0.0
for x in range(11):
    t = t + 0.1*x*mt.exp(0.1*x)  
print(t)
11.396101346876886

The above code can be written without a for loop as follows.

In [49]:
x = np.arange(0,1.1,0.1)
t = np.sum(x*np.exp(x))
print(t)
11.3961013469
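
To confirm that the loop and the vectorized form agree, both can be wrapped in functions and compared. This is a sketch under the assumption that the grid is built as 0.1 times an integer arange, which avoids rounding surprises from a floating-point step; the helper names `loop_sum` and `vector_sum` are hypothetical.

```python
import math
import numpy as np

def loop_sum(n):
    """Sum 0.1*x*exp(0.1*x) for x = 0..n with a Python loop."""
    t = 0.0
    for x in range(n + 1):
        t += 0.1 * x * math.exp(0.1 * x)
    return t

def vector_sum(n):
    """Same sum with array operations on the grid 0, 0.1, ..., 0.1*n."""
    x = 0.1 * np.arange(n + 1)
    return np.sum(x * np.exp(x))

print(np.isclose(loop_sum(10), vector_sum(10)))   # True
```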

For more on NumPy, refer to the official NumPy tutorial.