# **Deep learning for image analysis with PyTorch**

#### Fernando Cervantes, Systems Analyst I, Imaging Solutions, Research IT
#### fernando.cervantes@jax.org    (slack) @fernando.cervantes

Use ssh, or create a tunnel with MobaXterm or PuTTY, to connect to GCP:<br>
**ssh -nNfg -L8888:computenodename:8080 student-##@###.###.###.### <br>**
_To be used only during the workshop. To log in to the JAX HPC, use ssh as usual._<br>

Run the singularity container using the following command:<br>
**singularity run --nv --env CUDA_VISIBLE_DEVICES=0 --bind /fastscratch/data/:/mnt/data/:ro,/fastscratch/models/:/mnt/models/:ro /fastscratch/pytorch_jupyter.sif -m jupyterlab --no-browser --ip=$(hostname -i)**<br>
- **--nv** tells Singularity to use the NVIDIA drivers and allows us to use the GPUs inside the container
- **--env CUDA_VISIBLE_DEVICES=0** sets an environment variable that specifies which GPU device PyTorch will use
- **--bind /fastscratch/data/:/mnt/data/:ro** binds the location of the datasets so that it is visible inside the container under the path _/mnt/data/_ (the **:ro** suffix mounts it read-only)

Copy the URL printed by Jupyter and paste it into the address bar of your browser.<br>
If Jupyter asks you to set up a password, enter the token from the URL that you copied and set the password to **student-#**.<br>
The token is the last part of the URL, which looks something like: http://some-ip:8888/lab?token= **A-long-alphanumeric-string**
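
Once the notebook is open, a quick check can confirm whether the GPU selected with **--nv** and **CUDA_VISIBLE_DEVICES** is visible to PyTorch. This is only a minimal sketch; the variable names are illustrative, and it falls back to the CPU when no GPU is found.

```python
import torch

# True only if PyTorch can see at least one CUDA device
gpu_available = torch.cuda.is_available()

# Number of devices visible to this process (limited by CUDA_VISIBLE_DEVICES)
n_gpus = torch.cuda.device_count()

# Pick the first visible GPU, or fall back to the CPU
device = torch.device("cuda:0" if gpu_available else "cpu")

print(gpu_available, n_gpus, device)
```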

## **2 Getting started with PyTorch**

### 2.1 _Tensors_

The PyTorch library is imported in Python under the name _torch_

In [3]:
import torch


The basic PyTorch object is the tensor (a multidimensional array), whose default datatype is 32-bit floating point (torch.float32).

In [10]:
x = torch.tensor(
    [[1., 0.], 
     [0., 1.]]
)

In [11]:
x

tensor([[1., 0.],
        [0., 1.]])

In [12]:
x.dtype

torch.float32
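
As a small illustration of how the datatype is chosen, the sketch below creates tensors from float and integer literals and then requests other datatypes explicitly. It assumes the default datatype has not been changed with torch.set_default_dtype.

```python
import torch

# Floating-point literals produce the default datatype, float32
x = torch.tensor([[1., 0.], [0., 1.]])
print(x.dtype, x.element_size())  # element_size() is the number of bytes per element

# Integer literals produce int64
i = torch.tensor([[1, 0], [0, 1]])
print(i.dtype)

# The datatype can be requested at creation time ...
d = torch.tensor([[1, 0], [0, 1]], dtype=torch.float64)

# ... or changed afterwards with .to(), which returns a converted copy
h = x.to(torch.float16)
print(d.dtype, h.dtype)
```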

***
PyTorch provides the from_numpy function to convert NumPy arrays to tensors.<br>
The datatype and shape of the source NumPy array are preserved in the resulting PyTorch tensor

In [15]:
import numpy as np

In [16]:
a = np.array([
        [[0., 1.],
         [1., 0.]],
        [[0., 2.],
         [2., 0.]]
    ])

In [17]:
a

array([[[0., 1.],
        [1., 0.]],

       [[0., 2.],
        [2., 0.]]])

In [18]:
type(a), a.dtype, a.shape

(numpy.ndarray, dtype('float64'), (2, 2, 2))

In [20]:
b = torch.from_numpy(a)

In [21]:
b

tensor([[[0., 1.],
         [1., 0.]],

        [[0., 2.],
         [2., 0.]]], dtype=torch.float64)

In [22]:
type(b), b.dtype, b.shape

(torch.Tensor, torch.float64, torch.Size([2, 2, 2]))
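
One detail worth keeping in mind is that from_numpy does not copy the data: the tensor and the array share the same memory. The sketch below illustrates this and contrasts it with torch.tensor, which makes a copy. The variable names are illustrative only.

```python
import numpy as np
import torch

a = np.zeros((2, 2))

# from_numpy shares memory with the source array
shared = torch.from_numpy(a)

# torch.tensor copies the data at the moment it is called
copied = torch.tensor(a)

# A change made through the NumPy array ...
a[0, 0] = 5.0

print(shared[0, 0])  # ... is visible through the shared tensor: tensor(5., dtype=torch.float64)
print(copied[0, 0])  # ... but not through the copy: tensor(0., dtype=torch.float64)

# Converting to float32 (the PyTorch default) creates a new tensor
shared32 = shared.float()
print(shared32.dtype)  # torch.float32
```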

***
Tensors have a built-in method, _numpy_, to convert them to NumPy arrays

In [23]:
c = b.numpy()

In [24]:
c

array([[[0., 1.],
        [1., 0.]],

       [[0., 2.],
        [2., 0.]]])

In [25]:
type(c), c.dtype, c.shape

(numpy.ndarray, dtype('float64'), (2, 2, 2))

***
In PyTorch, the shape of a tensor is retrieved with the built-in method _size_ or, equivalently, with the _shape_ attribute

In [26]:
print(b.size())
print(b.shape)

torch.Size([2, 2, 2])
torch.Size([2, 2, 2])
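
A few related ways of querying the shape are sketched below on an illustrative tensor. torch.Size behaves like a Python tuple, so it can be indexed and unpacked.

```python
import torch

t = torch.zeros((2, 3, 4))

print(t.size(1))    # size of a single dimension: 3
print(t.shape[-1])  # the shape can be indexed like a tuple: 4
print(t.ndim)       # number of dimensions: 3
print(t.numel())    # total number of elements: 24

# Unpacking works as with any tuple
n0, n1, n2 = t.shape
```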


### 2.2 _Initializing tensors_

As in NumPy, tensors can be initialized by giving the desired shape and datatype

In [18]:
x = torch.zeros((2, 3, 1, 5))

In [19]:
x.size(), x.dtype

(torch.Size([2, 3, 1, 5]), torch.float32)

In [20]:
x

tensor([[[[0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.]]],


        [[[0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.]],

         [[0., 0., 0., 0., 0.]]]])

In [19]:
x = torch.ones((4, 1, 1, 5), dtype=torch.float64)

In [20]:
x.size(), x.dtype

(torch.Size([4, 1, 1, 5]), torch.float64)

In [21]:
x

tensor([[[[1., 1., 1., 1., 1.]]],


        [[[1., 1., 1., 1., 1.]]],


        [[[1., 1., 1., 1., 1.]]],


        [[[1., 1., 1., 1., 1.]]]], dtype=torch.float64)
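
Other initializers follow the same pattern. The sketch below shows three that are commonly useful; the shapes and values are arbitrary choices for illustration.

```python
import torch

ref = torch.ones((4, 1, 1, 5), dtype=torch.float64)

# Same shape and datatype as an existing tensor, filled with zeros
z = torch.zeros_like(ref)
print(z.size(), z.dtype)  # torch.Size([4, 1, 1, 5]) torch.float64

# A given shape filled with a constant value
f = torch.full((2, 3), 0.5)
print(f)

# Evenly spaced integers: start, stop (exclusive), step
r = torch.arange(0, 10, 2)
print(r)  # tensor([0, 2, 4, 6, 8])
```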

### 2.3 _Operations on tensors_

A wide variety of arithmetic, linear algebra, and matrix manipulation operations are already implemented for tensors.

In [27]:
x = torch.tensor([7.])

In [28]:
2 * x + 3

tensor([17.])

Mathematical operations can be applied either as built-in methods of tensor objects or as functions of the torch library

In [30]:
x.cos()

tensor([0.7539])

In [31]:
torch.cos(x)

tensor([0.7539])

Most operations are applied _element-wise_ to each entry of the tensor

In [32]:
x = torch.zeros((2, 2))
x.cos()

tensor([[1., 1.],
        [1., 1.]])
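
Not every operation is element-wise. As a minimal sketch, the example below contrasts the element-wise product with the matrix product and with a reduction, using two small illustrative matrices.

```python
import torch

a = torch.tensor([[1., 2.],
                  [3., 4.]])
b = torch.tensor([[0., 1.],
                  [1., 0.]])

# Element-wise product: each entry of a is multiplied by the matching entry of b
elementwise = a * b
print(elementwise)  # tensor([[0., 2.], [3., 0.]])

# Matrix product (same as torch.matmul(a, b))
matrix = a @ b
print(matrix)       # tensor([[2., 1.], [4., 3.]])

# Reductions combine entries instead of mapping them one by one
total = a.sum()          # sum of all entries: tensor(10.)
col_sum = a.sum(dim=0)   # sum over rows, one value per column: tensor([4., 6.])
```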

### 2.4 _Random tensors_

PyTorch has a random number generator to create random tensors

In [27]:
x = torch.rand((2, 1, 3)) # Random numbers drawn from a uniform distribution on [0, 1)
x

tensor([[[0.0983, 0.8942, 0.6058]],

        [[0.2269, 0.7339, 0.2170]]])

In [28]:
x = torch.randn((5, 4, 3)) # Random numbers drawn from a normal distribution N(0, 1)
x

tensor([[[ 2.3141, -0.4175, -1.1024],
         [-2.0188,  0.3515,  1.2695],
         [ 0.4859, -0.1608, -1.1287],
         [-0.2285,  0.0831, -2.0106]],

        [[ 1.0596,  0.4435, -1.5947],
         [-0.9315, -1.8303,  0.5007],
         [ 0.6654, -0.1618,  0.4161],
         [-1.3791,  0.6069, -0.0822]],

        [[ 0.6490,  1.4369, -0.1554],
         [-1.2000, -0.4331,  1.2428],
         [ 0.5684,  1.1465,  2.1853],
         [-0.3433,  0.2023, -0.7520]],

        [[-1.4547,  0.5643,  0.9124],
         [ 0.8602,  0.9191, -1.3361],
         [-0.8196, -0.0540, -0.0645],
         [-0.1562, -1.1308,  0.5527]],

        [[ 0.0237,  0.6260,  0.8046],
         [-1.8154,  0.0738, -1.2893],
         [ 0.5735, -0.2795, -1.8737],
         [-0.1437,  0.7811,  0.4386]]])

For reproducibility, the seed for random number generation is set using torch.random.manual_seed

In [29]:
torch.random.manual_seed(777)
x = torch.rand(1)
x

tensor([0.0819])

In [30]:
x = torch.rand(1)
x

tensor([0.4911])

In [31]:
torch.random.manual_seed(777)
x = torch.rand(1)
x

tensor([0.0819])

In [32]:
x = torch.rand(1)
x

tensor([0.4911])
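
Setting the global seed affects every later random call. When only one part of the code needs to be reproducible, a separate torch.Generator can be passed to the random functions instead. The following is a sketch of that alternative, not something the workshop code depends on.

```python
import torch

# Two independent generators initialized with the same seed
g1 = torch.Generator().manual_seed(777)
g2 = torch.Generator().manual_seed(777)

# Each draw uses its own generator and leaves the global state untouched
a = torch.rand(3, generator=g1)
b = torch.rand(3, generator=g2)

print(torch.equal(a, b))  # True: same seed, same sequence
```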

### 2.5 _Automatic differentiation (autograd)_

The autograd module of PyTorch can compute the gradient of _almost_ any operation on tensors, as long as the operation is implemented and applied using torch

In [39]:
x = torch.tensor(4.)
y = torch.tensor(5.)

In [40]:
z = 2*x + y + x * y
z

tensor(33.)

The autograd functionality of PyTorch is enabled as soon as at least one of the tensors involved requires its gradient (**requires_grad=True**).<br>
Internally, PyTorch then records a graph of the operations, which is later used to compute the gradients with respect to those tensors.

In [4]:
x = torch.tensor(4., requires_grad=True)
y = torch.tensor(5., requires_grad=True)

In [5]:
z = 2*x + y + x * y

In [6]:
z

tensor(33., grad_fn=<AddBackward0>)

***

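The gradients are computed when the built-in method _backward_ is called.<br>
This computes the gradient with respect to every involved tensor that requires it and stores it in that tensor's _grad_ attribute. By default the graph is then freed to save memory, so calling _backward_ a second time on the same graph generally fails unless **retain_graph=True** is passed.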

In [7]:
z.backward()

We expect the following result for function $f$.<br>
$z = f(x, y) = 2 x + y + x y$<br>
$\frac{\partial f}{\partial x} = 2 + y$<br>
$\frac{\partial f}{\partial y} = 1 + x$<br>

$\frac{\partial f}{\partial x}(x=4, y=5) = 2 + 5 = 7$<br>
$\frac{\partial f}{\partial y}(x=4, y=5) = 1 + 4 = 5$

In [8]:
x.grad

tensor(7.)

In [52]:
y.grad

tensor(5.)
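
Two behaviours of _backward_ are easy to miss: the graph can be kept alive with **retain_graph=True**, and gradients accumulate in _grad_ across calls rather than being overwritten. The sketch below reuses the same function to show both; the names first_grads and second_grads are illustrative only.

```python
import torch

x = torch.tensor(4., requires_grad=True)
y = torch.tensor(5., requires_grad=True)
z = 2*x + y + x * y

# Keep the graph so that backward can be called again
z.backward(retain_graph=True)
first_grads = (x.grad.item(), y.grad.item())
print(first_grads)   # (7.0, 5.0)

# A second call adds to the stored gradients instead of replacing them
z.backward()
second_grads = (x.grad.item(), y.grad.item())
print(second_grads)  # (14.0, 10.0)

# Reset the accumulated gradients in place before the next computation
x.grad.zero_()
y.grad.zero_()
```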

***
Example of a linear transform on $x$

In [51]:
x = torch.tensor([1., 2., 1., 2., 3.])  # input tensor
w = torch.randn(3, 5, requires_grad=True)
b = torch.randn(3, requires_grad=True)

In [52]:
z = torch.matmul(w, x)+b

In [53]:
z

tensor([-4.6712,  6.6213,  3.7979], grad_fn=<AddBackward0>)

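z is not a scalar (it has 3 elements), so _backward_ needs a gradient argument with the same shape as z. Here a vector of ones is used, which is equivalent to differentiating the sum of the elements of z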

In [54]:
z.backward(torch.tensor([1., 1., 1.]))

In [55]:
w.grad

tensor([[1., 2., 1., 2., 3.],
        [1., 2., 1., 2., 3.],
        [1., 2., 1., 2., 3.]])

In [56]:
b.grad

tensor([1., 1., 1.])
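
The tensor passed to _backward_ acts as a weight on each element of z. For this linear transform, the gradient on w is the outer product of that weight vector and x, and the gradient on b is the weight vector itself. The sketch below checks this with an arbitrary, illustrative weight vector v.

```python
import torch

x = torch.tensor([1., 2., 1., 2., 3.])  # input tensor
w = torch.randn(3, 5, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(w, x) + b

# Arbitrary weights for the three elements of z (illustrative values)
v = torch.tensor([1., 0.5, -2.])
z.backward(v)

# Row i of w.grad is v[i] * x, and b.grad equals v
print(torch.allclose(w.grad, torch.outer(v, x)))  # True
print(torch.allclose(b.grad, v))                  # True
```

With v set to a vector of ones, every row of the gradient on w is simply x, which matches the output shown above.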

***
Compute the gradient for a loss function in an optimization problem

In [58]:
w = torch.randn(3, 5, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(w, x)+b

For this example, let's use the Mean Squared Error (MSE) as the loss function

In [59]:
y = torch.zeros(3)  # target output, ground-truth

In [60]:
loss = torch.mean((y - z) ** 2)

In [61]:
loss

tensor(8.5013, grad_fn=<MeanBackward0>)

In [62]:
loss.backward()

In [63]:
w.grad

tensor([[ 3.2374,  6.4749,  3.2374,  6.4749,  9.7123],
        [-0.7039, -1.4077, -0.7039, -1.4077, -2.1116],
        [ 0.5989,  1.1977,  0.5989,  1.1977,  1.7966]])

In [64]:
b.grad

tensor([ 3.2374, -0.7039,  0.5989])
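
These values can be checked by hand. A short derivation, using the same symbols as the code and indexing the three outputs by $i$ and the five inputs by $j$:<br>
$L = \frac{1}{3} \sum_{i=1}^{3} (y_i - z_i)^2$, with $z_i = \sum_{j=1}^{5} w_{ij} x_j + b_i$<br>
$\frac{\partial L}{\partial b_i} = \frac{2}{3} (z_i - y_i)$<br>
$\frac{\partial L}{\partial w_{ij}} = \frac{2}{3} (z_i - y_i) \, x_j$<br>

Each row of the gradient on w is therefore the matching entry of the gradient on b multiplied by $x = (1, 2, 1, 2, 3)$, which agrees with the printed values up to rounding.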

In an optimization step (e.g. gradient descent), **w** and **b** are updated by moving them a small step in the direction opposite to their computed gradients
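
A single gradient descent step can be sketched by hand as follows. The learning rate lr, the seed, and the helper mse are assumptions made for this illustration; in practice the optimizers in torch.optim perform this update.

```python
import torch

torch.random.manual_seed(0)  # illustrative seed, only to make the run repeatable

x = torch.tensor([1., 2., 1., 2., 3.])  # input tensor
y = torch.zeros(3)                      # target output, ground-truth
w = torch.randn(3, 5, requires_grad=True)
b = torch.randn(3, requires_grad=True)
lr = 0.01                               # learning rate (assumed value)


def mse(w, b):
    """Mean squared error between the target y and the linear transform of x."""
    z = torch.matmul(w, x) + b
    return torch.mean((y - z) ** 2)


loss_before = mse(w, b)
loss_before.backward()

# The update itself must not be recorded in the autograd graph
with torch.no_grad():
    w -= lr * w.grad
    b -= lr * b.grad

# Clear the gradients so that the next backward call starts from zero
w.grad.zero_()
b.grad.zero_()

loss_after = mse(w, b)
print(loss_before.item(), loss_after.item())  # the loss decreases after the step
```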

### 2.6 _Loss functions_

PyTorch has implemented several loss functions that are ready for use.<br>
These can be found in the *nn* (neural networks) module.<br>
In most cases, these built-in implementations are optimized, so they are usually preferable to hand-written equivalents.

In [90]:
import torch.nn as nn

In [91]:
nn.MSELoss

torch.nn.modules.loss.MSELoss

In [92]:
criterion = nn.MSELoss()
loss = criterion(z, y)  # arguments are (prediction, target); MSE is symmetric, so the value is unchanged
loss

tensor(4.5112, grad_fn=<MseLossBackward0>)

In [93]:
nn.CrossEntropyLoss

torch.nn.modules.loss.CrossEntropyLoss

In [94]:
nn.BCEWithLogitsLoss

torch.nn.modules.loss.BCEWithLogitsLoss
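
Both of these losses expect raw scores (logits) rather than probabilities. The sketch below, with made-up illustrative values, compares each loss against an equivalent computation written out by hand.

```python
import torch
import torch.nn as nn

# Multi-class case: one row of logits per sample, targets are class indices
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])
ce = nn.CrossEntropyLoss()(logits, targets)

# Same value by hand: mean negative log-softmax of the correct class
log_probs = torch.log_softmax(logits, dim=1)
manual_ce = -log_probs[torch.arange(2), targets].mean()

# Binary case: one logit per sample, targets are floats in {0, 1}
bin_logits = torch.tensor([1.5, -0.5])
bin_targets = torch.tensor([1.0, 0.0])
bce = nn.BCEWithLogitsLoss()(bin_logits, bin_targets)

# Same value by hand: sigmoid followed by binary cross entropy
manual_bce = nn.functional.binary_cross_entropy(torch.sigmoid(bin_logits), bin_targets)

print(torch.allclose(ce, manual_ce), torch.allclose(bce, manual_bce))  # True True
```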

The complete list of loss functions can be found in the [PyTorch documentation](https://pytorch.org/docs/stable/nn.html#loss-functions)