Gradient Descent using Autograd

We use gradient descent to minimize the function
{% f(a, b) = (a-2)^2 + b^4 %}
which has its minimum at {% a=2 %} and {% b=0 %}.
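
For reference, the gradient that autograd computes can also be worked out by hand: {% \partial f / \partial a = 2(a-2) %} and {% \partial f / \partial b = 4b^3 %}. The following is a minimal sketch of gradient descent in plain Python using these derivatives. The names `grad_f` and `descend` are made up for illustration, and the start point (1, 1), the rate 0.01, and the 5000 steps are assumptions.

```python
def grad_f(a, b):
    # Analytic partial derivatives of f(a, b) = (a-2)^2 + b^4
    return 2 * (a - 2), 4 * b ** 3

def descend(a, b, rate=0.01, steps=5000):
    # Plain gradient descent: step against the gradient at a fixed rate
    for _ in range(steps):
        da, db = grad_f(a, b)
        a -= rate * da
        b -= rate * db
    return a, b

a, b = descend(1.0, 1.0)
print(a, b)
```

With these settings a ends up essentially at 2, while b is still roughly 0.05. The gradient {% 4b^3 %} shrinks quickly near 0, so the b coordinate converges much more slowly than a.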

Code

import torch

def model(a, b):
    return (a - 2)**2 + b**4

def iterate(args, rate=0.01):
    # torch accumulates values into the gradient, so it must be zeroed on each run
    if args.grad is not None:
        args.grad.zero_()
    loss1 = model(*args)
    loss1.backward()
    # turn off gradient tracking in this block so the update itself is not recorded
    with torch.no_grad():
        args -= rate * args.grad
    return args

args = torch.tensor([1.0, 1.0], requires_grad=True)
for i in range(5000):
    args = iterate(args)
print(args)
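
The reason the gradient has to be zeroed can be seen in a small experiment. This is a sketch that assumes the same function and the start point (1, 1). The variable names `first`, `second`, and `third` are only for illustration.

```python
import torch

def model(a, b):
    return (a - 2) ** 2 + b ** 4

x = torch.tensor([1.0, 1.0], requires_grad=True)

model(*x).backward()
first = x.grad.clone()    # gradient after one backward pass

model(*x).backward()
second = x.grad.clone()   # second pass is added on top of the first

x.grad.zero_()
model(*x).backward()
third = x.grad.clone()    # after zeroing, the gradient is fresh again

print(first, second, third)
```

At (1, 1) the gradient is (-2, 4). Without zeroing, the second call to `backward()` adds to the stored value and gives (-4, 8), which would double the step taken by the update.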

Code using an Optimizer

Instead of writing our own iterator, we can rely on any of the optimizers provided by the torch library. In the following code, we instantiate the stochastic gradient descent (SGD) optimizer and use it to run the optimization:
import torch.optim as op  # torch and model() come from the code above

args = torch.tensor([1.0, 1.0], requires_grad=True)
opt = op.SGD([args], lr=0.001)  # lr is the learning rate
for i in range(5000):
    opt.zero_grad()  # clear the gradient left over from the previous step
    loss1 = model(*args)
    loss1.backward()
    opt.step()
print(args)
Adam is another optimizer that is widely used in machine learning. To use it, replace the line that constructs the optimizer:

opt = op.Adam([args], lr=0.01)
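
Put together, a complete run with Adam might look like the sketch below. It assumes the same function, start point, and number of steps as the SGD example, and it uses `opt.zero_grad()` to clear the gradient before each backward pass.

```python
import torch
import torch.optim as op

def model(a, b):
    return (a - 2) ** 2 + b ** 4

args = torch.tensor([1.0, 1.0], requires_grad=True)
opt = op.Adam([args], lr=0.01)  # lr is the learning rate

for i in range(5000):
    opt.zero_grad()        # clear the gradient from the previous step
    loss = model(*args)
    loss.backward()        # compute the gradient of the loss w.r.t. args
    opt.step()             # let Adam update args in place

a, b = args.detach().tolist()
print(a, b)
```

Only the optimizer constructor differs from the SGD version; the training loop itself is unchanged.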