Gradient Descent using Autograd
We minimize the function
{% f(a, b) = (a-2)^2 + b^4 %}
which has its minimum at {% a=2 %} and {% b=0 %}.
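
For reference, the partial derivatives of {% f %} are {% \partial f/\partial a = 2(a-2) %} and {% \partial f/\partial b = 4b^3 %}. The sketch below applies the gradient descent update in plain Python, without autograd, so the hand-written gradient can be compared with what torch computes automatically. The names `grad_f` and `descend` are illustrative and do not appear in the original code; the starting point, rate, and step count are assumed to match the torch example.

```python
def f(a, b):
    return (a - 2) ** 2 + b ** 4

def grad_f(a, b):
    # Analytic partial derivatives: df/da = 2(a - 2), df/db = 4 b^3
    return 2 * (a - 2), 4 * b ** 3

def descend(a, b, rate=0.01, steps=5000):
    # Repeatedly step against the gradient, scaled by the learning rate
    for _ in range(steps):
        da, db = grad_f(a, b)
        a -= rate * da
        b -= rate * db
    return a, b

a, b = descend(1.0, 1.0)
print(a, b)
```

Because {% 4b^3 %} becomes very small near {% b=0 %}, `b` approaches its minimum far more slowly than `a` does, so after the same number of steps `a` is essentially at 2 while `b` is still visibly above 0.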
Code
import torch

def model(a, b):
    return (a - 2)**2 + b**4

def iterate(args, rate=0.01):
    # torch accumulates values into the gradient, so it needs to be zeroed out on each run
    if args.grad is not None:
        args.grad.zero_()

    loss1 = model(*args)
    loss1.backward()

    # turn off the grad computation in this block so the update itself is not tracked
    with torch.no_grad():
        args -= rate * args.grad
    return args

args = torch.tensor([1.0, 1.0], requires_grad=True)
for i in range(5000):
    args = iterate(args)
print(args)
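
The comment about zeroing the gradient is easy to check in isolation. The following minimal sketch uses a single scalar `x` (a hypothetical variable chosen for illustration) and calls `backward()` twice without zeroing, then once more after zeroing.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)

# First backward pass: d(x^2)/dx = 2x = 6
(x ** 2).backward()
first = x.grad.item()

# Second backward pass without zeroing: the new gradient is added to the old one
(x ** 2).backward()
accumulated = x.grad.item()

# After zeroing, a fresh backward pass gives the plain gradient again
x.grad.zero_()
(x ** 2).backward()
fresh = x.grad.item()

print(first, accumulated, fresh)  # 6.0 12.0 6.0
```

Without the zeroing step, each iteration of the descent loop would step along the sum of all previous gradients rather than the current one.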
Code using an Optimizer
Instead of writing our own iteration function, we can rely on any of the optimizers provided by the torch library. In the following code, we instantiate the stochastic gradient descent (SGD) optimizer and use it to run the optimization.
import torch.optim as op

args = torch.tensor([1.0, 1.0], requires_grad=True)
# lr is the learning rate
opt = op.SGD([args], lr=0.001)
for i in range(5000):
    # clear the gradient accumulated on the previous iteration
    opt.zero_grad()
    loss1 = model(*args)
    loss1.backward()
    # apply the update args -= lr * args.grad
    opt.step()
print(args)
Adam is a powerful optimizer that is widely used in machine learning. To use it, replace the SGD line with:
opt = op.Adam([args], lr=0.01)
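
Put together, a complete run with Adam might look like the sketch below. It redefines `model` so the block is self-contained, and it carries over the starting point and iteration count from the examples above; these are assumptions for illustration, not tuned values.

```python
import torch
import torch.optim as op

def model(a, b):
    return (a - 2) ** 2 + b ** 4

args = torch.tensor([1.0, 1.0], requires_grad=True)
opt = op.Adam([args], lr=0.01)

for i in range(5000):
    opt.zero_grad()        # clear gradients from the previous iteration
    loss = model(*args)
    loss.backward()        # populate args.grad
    opt.step()             # Adam update using its running gradient statistics

print(args)
```

Adam rescales each parameter's step by a running estimate of its gradient magnitude, so the flat region of {% b^4 %} near {% b=0 %} tends to slow it down less than it slows plain SGD.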