Hands-on PyTorch: Multilayer Perceptron

Number of weight parameters for an MLP with a 256×256 input, 1000 hidden units, and 10 outputs:

(256 × 256) × 1000 + 1000 × 10 = 65,546,000

(inputs_numbers × hidden_numbers) + hidden_numbers × output_numbers
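The arithmetic can be checked directly. A minimal sketch, counting weights only (bias terms are ignored, as in the formula above):

```python
inputs_numbers = 256 * 256   # flattened 256x256 input
hidden_numbers = 1000        # hidden units
output_numbers = 10          # output classes

# weight count: input->hidden plus hidden->output
params = inputs_numbers * hidden_numbers + hidden_numbers * output_numbers
print(params)  # 65546000
```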

The ReLU function
The ReLU (rectified linear unit) function provides a simple nonlinear transformation. Given an element x, the function is defined as

ReLU(x) = max(x, 0).

```python
%matplotlib inline
import torch
import numpy as np
import matplotlib.pyplot as plt
import sys
sys.path.append("/home/kesci/input")
import d2lzh1981 as d2l

print(torch.__version__)

def xyplot(x_vals, y_vals, name):
    # d2l.set_figsize(figsize=(5, 2.5))
    plt.plot(x_vals.detach().numpy(), y_vals.detach().numpy())
    plt.xlabel('x')
    plt.ylabel(name)

# x must be defined before calling relu(); enable gradients for later use
x = torch.arange(-8.0, 8.0, 0.1, requires_grad=True)
y = x.relu()
xyplot(x, y, 'relu')
```

The sigmoid function
The sigmoid function maps an element's value into the interval (0, 1):

sigmoid(x) = 1 / (1 + exp(−x)).

Its derivative is

sigmoid′(x) = sigmoid(x)(1 − sigmoid(x)).
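As a quick sketch (not part of the original notebook), the derivative identity can be verified numerically with PyTorch's autograd:

```python
import torch

x = torch.arange(-8.0, 8.0, 0.1, requires_grad=True)
y = x.sigmoid()
y.sum().backward()  # x.grad now holds sigmoid'(x) elementwise

# sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))
expected = (y * (1 - y)).detach()
print(torch.allclose(x.grad, expected))  # True
```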

The tanh function
The tanh (hyperbolic tangent) function maps an element's value into the interval (−1, 1):

tanh(x) = (1 − exp(−2x)) / (1 + exp(−2x)).

Its derivative is

tanh′(x) = 1 − tanh²(x).

ReLU is a general-purpose activation function and is used in most cases today. Note, however, that ReLU is only used in hidden layers, not at the output.

```python
import torch
from torch import nn
from torch.nn import init
import numpy as np
import sys
sys.path.append("/home/kesci/input")
import d2lzh1981 as d2l

print(torch.__version__)

num_inputs, num_outputs, num_hiddens = 784, 10, 256

net = nn.Sequential(
    d2l.FlattenLayer(),
    nn.Linear(num_inputs, num_hiddens),
    nn.ReLU(),
    nn.Linear(num_hiddens, num_outputs),
)

# initialize every parameter from N(0, 0.01^2)
for params in net.parameters():
    init.normal_(params, mean=0, std=0.01)

batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(
    batch_size, root='/home/kesci/input/FashionMNIST2065')

loss = torch.nn.CrossEntropyLoss()

optimizer = torch.optim.SGD(net.parameters(), lr=0.5)

num_epochs = 5
d2l.train_ch3(
    net,
    train_iter,
    test_iter,
    loss,
    num_epochs,
    batch_size,
    None,
    None,
    optimizer)
```
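Without the `d2lzh1981` helpers, the same architecture can be sanity-checked on its own. A minimal sketch, assuming a PyTorch version that provides `nn.Flatten` (≥ 1.2) in place of `d2l.FlattenLayer`:

```python
import torch
from torch import nn

num_inputs, num_outputs, num_hiddens = 784, 10, 256

# same layer stack as above, with the built-in Flatten layer
net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(num_inputs, num_hiddens),
    nn.ReLU(),
    nn.Linear(num_hiddens, num_outputs),
)

# a fake batch of two 1x28x28 Fashion-MNIST-sized images
X = torch.rand(2, 1, 28, 28)
print(net(X).shape)  # torch.Size([2, 10])
```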