Computational Graphs in PyTorch (with Examples)

A computational graph in PyTorch is a directed acyclic graph that records how the values in a computation depend on one another.

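PyTorch builds the computational graph through its automatic differentiation engine, Autograd. Autograd records every operation performed on tensors that require gradients and builds the graph as the forward pass runs. When backward() is called, it traverses this graph to compute the gradients, which are then used to update the parameters.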

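Each PyTorch tensor represents a node in the computational graph. If x is a tensor with x.requires_grad=True, then after backpropagation x.grad is another tensor holding the gradient of some scalar value with respect to x.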
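As a minimal sketch of what x.grad holds, consider the scalar function y = x ** 2, which is chosen here purely for illustration. Before backward() is called, x.grad is None; afterwards it holds dy/dx evaluated at x.

```python
import torch

# A scalar tensor that autograd should track
x = torch.tensor(3.0, requires_grad=True)

# No gradient has been computed yet
grad_before = x.grad
print(grad_before)  # None

# y = x^2, so dy/dx = 2x = 6 at x = 3
y = x ** 2
y.backward()

print(x.grad)  # tensor(6.)
```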

The following example builds a computational graph in PyTorch:

import torch

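# Define the input tensor
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)

# Define the weight tensor
w = torch.tensor([[5.0, 6.0], [7.0, 8.0]], requires_grad=True)

# Define the bias tensor (broadcast across the rows of the product)
b = torch.tensor([[9.0, 10.0]], requires_grad=True)

# Forward pass: these operations build the computational graph
y = torch.matmul(x, w) + b
z = torch.sum(y)

# Backward pass: compute the gradients of z
z.backward()

# Print the gradients of z with respect to x, w and b
print(x.grad)  # tensor([[11., 15.], [11., 15.]])
print(w.grad)  # tensor([[4., 4.], [6., 6.]])
print(b.grad)  # tensor([[2., 2.]])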
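The gradients printed above can be checked by hand. Since z is the sum of every entry of y = xw + b, the gradient of z with respect to y is a matrix of ones, so by the chain rule dz/dx = 1·wᵀ, dz/dw = xᵀ·1, and dz/db is the column sum of that matrix of ones (b is broadcast over both rows). The sketch below compares these hand-derived expressions with what autograd returns; the expected_* names are introduced here for illustration only.

```python
import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
w = torch.tensor([[5.0, 6.0], [7.0, 8.0]], requires_grad=True)
b = torch.tensor([[9.0, 10.0]], requires_grad=True)

z = (torch.matmul(x, w) + b).sum()
z.backward()

# dz/dy: every entry of y contributes 1 to the sum
ones = torch.ones(2, 2)

# Hand-derived gradients (detach() keeps these out of the graph)
expected_x_grad = ones @ w.detach().T
expected_w_grad = x.detach().T @ ones
expected_b_grad = ones.sum(dim=0, keepdim=True)

print(torch.allclose(x.grad, expected_x_grad))  # True
print(torch.allclose(w.grad, expected_w_grad))  # True
print(torch.allclose(b.grad, expected_b_grad))  # True
```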

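In PyTorch, every tensor has a grad_fn attribute that references the operation that created it, also called the gradient function. The gradient function computes the gradient for that operation and, following the chain rule, propagates it backward to the preceding nodes in the graph. Tensors created directly by the user rather than by an operation (leaf tensors) have a grad_fn of None, for example: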

import torch

# Create a tensor of the given shape with every element set to 1
x = torch.ones(3, 2, requires_grad=True)
print(x)
print(x.grad_fn)  # Prints None: x is a leaf tensor, not the result of an operation
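For contrast, a tensor produced by an operation on x does carry a grad_fn. The short sketch below uses a multiplication and a mean, both picked arbitrarily for illustration, and prints the class name of each gradient function together with the is_leaf flag.

```python
import torch

x = torch.ones(3, 2, requires_grad=True)  # leaf: created directly by the user
y = x * 2                                  # result of a multiplication
z = y.mean()                               # result of a mean reduction

# Leaf tensors have no grad_fn; results of operations do
print(x.is_leaf, x.grad_fn)                  # True None
print(y.is_leaf, type(y.grad_fn).__name__)   # False, a name such as MulBackward0
print(z.is_leaf, type(z.grad_fn).__name__)   # False, a name such as MeanBackward0
```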

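When backpropagation runs, PyTorch starts from the output tensor and follows the grad_fn references to traverse the graph that was recorded during the forward pass. This is how each gradient is passed back to the operations that came before it.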
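To make that traversal visible, the sketch below walks the backward graph by hand. It relies on the next_functions attribute of grad_fn nodes, which lists the nodes feeding into each one; the walk helper is hypothetical and written only for this illustration, not something PyTorch provides.

```python
import torch

def walk(node, depth=0, lines=None):
    """Collect the names of backward nodes reachable from `node`, depth-first."""
    if lines is None:
        lines = []
    if node is None:  # inputs that need no gradient (e.g. the constant 2)
        return lines
    lines.append("  " * depth + type(node).__name__)
    for parent, _input_index in node.next_functions:
        walk(parent, depth + 1, lines)
    return lines

x = torch.ones(3, 2, requires_grad=True)
z = (x * 2).sum()

# Start at the output and follow the chain back toward the leaf tensor x
lines = walk(z.grad_fn)
print("\n".join(lines))
```

The last node printed should be an AccumulateGrad node, which is where the gradient for the leaf tensor x is accumulated into x.grad.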

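torch.no_grad() is a context manager that disables gradient tracking inside its block: operations run there are not recorded in the computational graph. For example: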

import torch
import math

dtype = torch.float
device = torch.device("cpu")

# Create tensors to hold input and outputs
# As we don't need to compute gradients with respect to these Tensors, we can set
# requires_grad = False. This is also the default setting.
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# Create random tensors for weights. For these Tensors, we require gradients,
# therefore, we can set requires_grad = True
a = torch.randn((), device=device, dtype=dtype, requires_grad=True)
b = torch.randn((), device=device, dtype=dtype, requires_grad=True)
c = torch.randn((), device=device, dtype=dtype, requires_grad=True)
d = torch.randn((), device=device, dtype=dtype, requires_grad=True)

learning_rate = 1e-6

# Forward pass: we compute predicted y using operations on Tensors.
y_pred = a + b * x + c * x ** 2 + d * x ** 3

# Compute and print loss using operations on Tensors.
# Now loss is a 0-dimensional Tensor (shape ()).
# loss.item() gets the scalar value held in the loss.
loss = (y_pred - y).pow(2).sum()

# Use autograd to compute the backward pass. This call will compute the
# gradient of loss with respect to all Tensors with requires_grad=True.
# After this call a.grad, b.grad, c.grad and d.grad will be Tensors holding
# the gradient of the loss with respect to a, b, c, d respectively.
loss.backward()

# Manually update weights using gradient descent. Wrap in torch.no_grad()
# because weights have requires_grad=True, but we don't need to track this
# in autograd.
with torch.no_grad():
    a -= learning_rate * a.grad
    b -= learning_rate * b.grad
    c -= learning_rate * c.grad
    d -= learning_rate * d.grad

    # Manually zero the gradients after updating weights
    a.grad = None
    b.grad = None
    c.grad = None
    d.grad = None
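The code above performs a single gradient-descent step. As a rough sketch of how it extends to a training loop, the version below repeats the same step and also checks directly that a result computed inside torch.no_grad() is not tracked. The fixed seed, the 500-step count, and the names losses, untracked and tracked are assumptions added for this illustration.

```python
import math
import torch

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable

x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# Four scalar coefficients of the cubic polynomial
a, b, c, d = (torch.randn((), requires_grad=True) for _ in range(4))

learning_rate = 1e-6
losses = []
for step in range(500):
    # Forward pass: builds a fresh graph on every iteration
    y_pred = a + b * x + c * x ** 2 + d * x ** 3
    loss = (y_pred - y).pow(2).sum()
    losses.append(loss.item())

    # Backward pass: fills a.grad, b.grad, c.grad and d.grad
    loss.backward()

    # Update in place without recording the update in the graph
    with torch.no_grad():
        for p in (a, b, c, d):
            p -= learning_rate * p.grad
            p.grad = None  # reset so gradients do not accumulate

# Results computed inside no_grad() are detached from the graph
with torch.no_grad():
    untracked = a * 2
tracked = a * 2

print(untracked.requires_grad, tracked.requires_grad)  # False True
print(losses[0], losses[-1])
```

The weights keep requires_grad=True after the in-place updates, so the next forward pass is tracked as usual; only the update step itself is excluded from the graph.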
