TensorFlow advanced notes --- #"4"# deeplearning.ai practice

tensorflow in practice

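The examples mostly come from the assignments of Andrew Ng's deeplearning.ai courses. Each part below corresponds to one of his assignment Jupyter notebooks. This is only my own very brief summary; if you want to go through everything from scratch and understand it more deeply, I recommend doing the course assignments directly. Each part carries a short subtitle, which is the name of the course notebook.

Course links are at the bottom.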

PART 1

Convolution model - Step by Step - v1

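Honestly, there is only so much to learn from Part 1. I had previously read the book TensorFlow实战 by 黄文坚 and 唐源, which contains a well-structured MNIST example. That example is what got me started with TensorFlow, and it is very good basic code. Today's part is more of a review.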

# Example of a picture
index = 6
plt.imshow(X_train_orig[index])
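print ("y = " + str(np.squeeze(Y_train_orig[:, index]))) # squeeze drops the size-1 dimension so we get a scalar

Note the use of np.squeeze here: Y_train_orig[:, index] on its own has shape (1,), and squeeze removes that dimension. This is not TensorFlow, but it makes a decent opener.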
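A minimal sketch of what np.squeeze does in that line. The label array below is a made-up stand-in for Y_train_orig (the values are hypothetical, not the course data):

```python
import numpy as np

# Hypothetical stand-in for Y_train_orig: a single row of labels, shape (1, m).
Y = np.array([[5, 0, 2, 5, 2, 4, 2]])
index = 6

col = Y[:, index]        # still an array, shape (1,)
label = np.squeeze(col)  # size-1 axis removed, shape ()

print(col.shape, label.shape)
print("y = " + str(label))
```

Without the squeeze the printed value would keep its brackets, because a shape (1,) array prints as a one-element list.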

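tf.placeholder()
The input of the whole network. The available dtypes include tf.float16, tf.float32 and tf.float64.

With the input in place, we build the structured network layers with tf.get_variable(). This function is very useful: it keeps the whole network structure clear and easy to modify.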

# GRADED FUNCTION: initialize_parameters

def initialize_parameters():
    """
    Initializes weight parameters to build a neural network with tensorflow. The shapes are:
                    W1 : [4, 4, 3, 8]
                    W2 : [2, 2, 8, 16]
    Returns:
    parameters -- a dictionary of tensors containing W1, W2
    """
    
    tf.set_random_seed(1)                              # so that your "random" numbers match ours
    
    ### START CODE HERE ### (approx. 2 lines of code)
    W1 = tf.get_variable("W1", [4, 4, 3, 8], initializer = tf.contrib.layers.xavier_initializer(seed = 0))
    W2 = tf.get_variable("W2", [2, 2, 8, 16], initializer = tf.contrib.layers.xavier_initializer(seed = 0))
    ### END CODE HERE ###

    parameters = {"W1": W1,
              "W2": W2}
    
    return parameters

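The initializer is one of the knobs you can tune. Xavier is the usual choice because it has worked well in most experiments, but other initializers are possible and worth looking into.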
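To make the Xavier idea concrete, here is a rough NumPy sketch of the Glorot/Xavier uniform rule applied to the two kernel shapes above. It is an illustration under my own assumptions (uniform variant, fan sizes computed from the kernel shape), not TensorFlow's implementation, so the numbers will not match the seeded tf output. The helper name xavier_uniform is hypothetical.

```python
import numpy as np

def xavier_uniform(shape, seed=0):
    """Glorot/Xavier uniform init for a conv kernel shaped [f_h, f_w, c_in, c_out].

    Assumption: values are drawn from U(-limit, limit) with
    limit = sqrt(6 / (fan_in + fan_out)).
    """
    rng = np.random.default_rng(seed)
    receptive_field = int(np.prod(shape[:-2]))  # f_h * f_w
    fan_in = receptive_field * shape[-2]        # inputs feeding one output unit
    fan_out = receptive_field * shape[-1]       # outputs fed by one input unit
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=tuple(shape))

W1 = xavier_uniform([4, 4, 3, 8])
W2 = xavier_uniform([2, 2, 8, 16])
print(W1.shape, W2.shape)
```

The point of the scaling is that larger layers get smaller initial weights, which keeps the variance of activations roughly stable from layer to layer.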

Below are some commonly used tools:

tf.nn.conv2d(X,W1, strides = [1,s,s,1], padding = 'SAME'): given an input $X$ and a group of filters $W1$, this function convolves $W1$'s filters on X. The third input ([1,s,s,1]) represents the strides for each dimension of the input (m, n_H_prev, n_W_prev, n_C_prev). You can read the full documentation here
tf.nn.max_pool(A, ksize = [1,f,f,1], strides = [1,s,s,1], padding = 'SAME'): given an input A, this function uses a window of size (f, f) and strides of size (s, s) to carry out max pooling over each window. You can read the full documentation here
tf.nn.relu(Z1): computes the elementwise ReLU of Z1 (which can be any shape). You can read the full documentation here.
tf.contrib.layers.flatten(P): given an input P, this function flattens each example into a 1D vector while maintaining the batch-size. It returns a flattened tensor with shape [batch_size, k]. You can read the full documentation here.
tf.contrib.layers.fully_connected(F, num_outputs): given the flattened input F, it returns the output computed using a fully connected layer. You can read the full documentation here.
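When chaining these ops it helps to track the spatial size by hand. The sketch below uses the usual rules for 'SAME' and 'VALID' padding; the helper name out_size, the 64-pixel input and the pooling windows of 8 and 4 are assumptions chosen for illustration, not values stated in these notes.

```python
import math

def out_size(n, f, s, padding):
    """Spatial output size of a conv/pool op along one dimension.

    n: input size, f: window size, s: stride.
    """
    if padding == "SAME":
        return math.ceil(n / s)       # window size does not matter
    if padding == "VALID":
        return (n - f) // s + 1       # only windows that fit entirely
    raise ValueError("padding must be 'SAME' or 'VALID'")

n = 64                                # assumed input height/width
n = out_size(n, f=4, s=1, padding="SAME")   # conv with a 4x4 filter like W1
n = out_size(n, f=8, s=8, padding="SAME")   # assumed 8x8 max pool
n = out_size(n, f=2, s=1, padding="SAME")   # conv with a 2x2 filter like W2
n = out_size(n, f=4, s=4, padding="SAME")   # assumed 4x4 max pool
print(n, n * n * 16)                  # spatial size and flattened length k
```

With 'SAME' padding and stride 1 the size never changes, so in this setup only the pooling layers shrink the feature map.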

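One thing to note: activation_fn of tf.contrib.layers.fully_connected defaults to ReLU, so change it to whatever you need. If the layer should stay linear (for example when its output is used as the logits of a softmax loss), set it to None.

With the network built, the next step is to design the loss. The simplest loss is MSE (mean squared error), but cross-entropy is used far more often, and its formula is worth deriving yourself. In an image caption competition I took part in, we used sparse_softmax_cross_entropy_with_logits, which takes integer class indices directly, so you do not have to build the one-hot vectors yourself. Our y was a word, so it would otherwise have needed one-hot encoding.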

cost = tf.nn.softmax_cross_entropy_with_logits(logits = Z3, labels = Y)
cost = tf.reduce_mean(cost)
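As a sanity check on what these two lines compute, here is a small NumPy sketch of softmax cross-entropy, together with a "sparse" variant that takes class indices instead of one-hot labels. The function names and the toy logits are my own; this mirrors the math, not TensorFlow's internals.

```python
import numpy as np

def softmax_cross_entropy(logits, onehot_labels):
    """Per-example cross-entropy between softmax(logits) and one-hot labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(onehot_labels * log_probs).sum(axis=1)

def sparse_softmax_cross_entropy(logits, class_ids):
    """Same loss, but labels are integer class indices."""
    onehot = np.eye(logits.shape[1])[class_ids]
    return softmax_cross_entropy(logits, onehot)

# Toy data: 2 examples, 3 classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
class_ids = np.array([0, 1])
onehot = np.eye(3)[class_ids]

per_example = softmax_cross_entropy(logits, onehot)
cost = per_example.mean()  # plays the role of tf.reduce_mean
print(per_example, cost)
```

Both variants give the same numbers; the sparse form only saves you the one-hot step.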

Once all of the above is in place, add mini-batches and you can run the full training.
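The mini-batch step can be sketched in plain NumPy as follows. This is a simplified illustration, assuming examples are stacked along the first axis; the helper name and signature are hypothetical and not taken from the course utilities.

```python
import numpy as np

def random_mini_batches(X, Y, batch_size=64, seed=0):
    """Shuffle X and Y together and cut them into mini-batches.

    X, Y: arrays whose first axis indexes examples.
    The last batch may be smaller than batch_size.
    """
    m = X.shape[0]
    perm = np.random.default_rng(seed).permutation(m)
    X_shuf, Y_shuf = X[perm], Y[perm]   # same permutation keeps pairs aligned
    return [(X_shuf[i:i + batch_size], Y_shuf[i:i + batch_size])
            for i in range(0, m, batch_size)]

X = np.arange(10).reshape(10, 1)
Y = np.arange(10)
batches = random_mini_batches(X, Y, batch_size=4)
print([len(yb) for _, yb in batches])
```

In a training loop you would call this once per epoch (with a new seed) and run one optimizer step per batch.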

PART 2

Convolution model - Step by Step - v2

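This part is mainly about implementing conv in detail, so it has little to do with TensorFlow. Understanding the details matters, so study it carefully! What you mostly learn here is how to use numpy.

I will not recap convolution layers here and assume you know them. Padding is the first thing we need to understand.

This needs np.pad

It took me a while with the API docs to get it: the second argument says how many values to pad before and after along each axis, and it can of course be given per axis for multi-dimensional arrays. (The examples below spell the function np.lib.pad, which is the same function.)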

>>> a = [1, 2, 3, 4, 5]
>>> np.lib.pad(a, (2,3), 'constant', constant_values=(4, 6))
array([4, 4, 1, 2, 3, 4, 5, 6, 6, 6])

>>> np.lib.pad(a, (2, 3), 'edge')
array([1, 1, 1, 2, 3, 4, 5, 5, 5, 5])

>>> np.lib.pad(a, (2, 3), 'linear_ramp', end_values=(5, -4))
array([ 5,  3,  1,  2,  3,  4,  5,  2, -1, -4])

>>> np.lib.pad(a, (2,), 'maximum')
array([5, 5, 1, 2, 3, 4, 5, 5, 5])

>>> np.lib.pad(a, (2,), 'mean')
array([3, 3, 1, 2, 3, 4, 5, 3, 3])

>>> np.lib.pad(a, (2,), 'median')
array([3, 3, 1, 2, 3, 4, 5, 3, 3])

>>> a = [[1, 2], [3, 4]]
>>> np.lib.pad(a, ((3, 2), (2, 3)), 'minimum')
array([[1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1],
       [3, 3, 3, 4, 3, 3, 3],
       [1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1]])

>>> a = [1, 2, 3, 4, 5]
>>> np.lib.pad(a, (2, 3), 'reflect')
array([3, 2, 1, 2, 3, 4, 5, 4, 3, 2])

>>> np.lib.pad(a, (2, 3), 'reflect', reflect_type='odd')
array([-1,  0,  1,  2,  3,  4,  5,  6,  7,  8])

>>> np.lib.pad(a, (2, 3), 'symmetric')
array([2, 1, 1, 2, 3, 4, 5, 5, 4, 3])

>>> np.lib.pad(a, (2, 3), 'symmetric', reflect_type='odd')
array([0, 1, 1, 2, 3, 4, 5, 5, 6, 7])

>>> np.lib.pad(a, (2, 3), 'wrap')
array([4, 5, 1, 2, 3, 4, 5, 1, 2, 3])

>>> def padwithtens(vector, pad_width, iaxis, kwargs):
...     vector[:pad_width[0]] = 10
...     vector[-pad_width[1]:] = 10
...     return vector

>>> a = np.arange(6)
>>> a = a.reshape((2, 3))

>>> np.lib.pad(a, 2, padwithtens)
array([[10, 10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10, 10],
       [10, 10,  0,  1,  2, 10, 10],
       [10, 10,  3,  4,  5, 10, 10],
       [10, 10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10, 10]])
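For the convolution exercise, only the height and width axes of an image batch should be padded. A minimal sketch, assuming the batch layout (m, n_H, n_W, n_C); the function name zero_pad is my own label for this step:

```python
import numpy as np

def zero_pad(X, pad):
    """Zero-pad the height and width of a batch shaped (m, n_H, n_W, n_C)."""
    return np.pad(
        X,
        ((0, 0), (pad, pad), (pad, pad), (0, 0)),  # no padding on batch/channel axes
        mode="constant",
        constant_values=0,
    )

X = np.ones((4, 3, 3, 2))
X_pad = zero_pad(X, 2)
print(X.shape, "->", X_pad.shape)
```

Each (before, after) pair belongs to one axis, which is why the batch and channel axes get (0, 0).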

In the end you can generate a picture like this, cool.

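This part was really rewarding! When I first learned convolution I thought the internals were no big deal; only when you write it yourself do you realize how many interesting details there are to consider.

Here is one numpy detail I had not understood before:

b = np.random.randn(2,2,3,8)
b[:,:,:,0] selects the first cube of the conv2d weights, i.e. the first filter, with shape (2, 2, 3)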
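To show why that slice matters, here is a sketch of a single convolution step on one input window. I call the weight tensor W (the notes above call it b) so that the name b is free for the bias; all names and random values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 2, 3, 8))     # (f, f, n_C_prev, n_C): 8 filters
b = rng.standard_normal((1, 1, 1, 8))     # one bias per filter
a_slice = rng.standard_normal((2, 2, 3))  # one window of the input volume

def conv_single_step(a_slice, w, bias):
    """Multiply one window elementwise with one filter, sum, add the bias."""
    return float(np.sum(a_slice * w) + bias)

first_filter = W[:, :, :, 0]              # shape (2, 2, 3), same as the window
z = conv_single_step(a_slice, first_filter, b[0, 0, 0, 0])

# All 8 filters at once: contract the three shared axes.
all_z = np.tensordot(a_slice, W, axes=3) + b[0, 0, 0, :]
print(first_filter.shape, z, all_z.shape)
```

The last axis of the weight tensor indexes filters, so each filter yields one number per window, and therefore one output channel.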

PART 3

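Learning Keras. The Keras framework is actually very simple and nicely encapsulated. Its drawback is that data loading and the internal network structure are less flexible than in TensorFlow, where you can do anything. But for a small demo, or for combining a few pretrained networks, it is very fast.

Main steps: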

Create the model by calling the function above
Compile the model by calling model.compile(optimizer = "...", loss = "...", metrics = ["accuracy"])
Train the model on train data by calling model.fit(x = ..., y = ..., epochs = ..., batch_size = ...)
Test the model on test data by calling model.evaluate(x = ..., y = ...)

Building a custom network is simple:

def HappyModel(input_shape):
    """
    Implementation of the HappyModel.

    Arguments:
    input_shape -- shape of the images of the dataset

    Returns:
    model -- a Model() instance in Keras
    """

    ### START CODE HERE ###
    # Feel free to use the suggested outline in the text above to get started, and run through the whole
    # exercise (including the later portions of this notebook) once. Then come back and try out other
    # network architectures as well.

    X_input = Input(input_shape)

    # Pad the border of X_input with zeros
    X = ZeroPadding2D((3, 3))(X_input)

    # CONV -> BN -> RELU, twice
    X = Conv2D(32, (7, 7), strides = (1, 1), name = 'conv0')(X)
    X = BatchNormalization(axis = 3, name = 'bn0')(X)
    X = Activation('relu')(X)

    X = Conv2D(32, (3, 3), strides = (1, 1), name = 'conv1')(X)
    X = BatchNormalization(axis = 3, name = 'bn1')(X)
    X = Activation('relu')(X)

    X = MaxPooling2D((2, 2), name='max_pool')(X)

    # Flatten, then a single sigmoid unit for binary classification
    X = Flatten()(X)
    X = Dense(1, activation='sigmoid', name='fc')(X)

    model = Model(inputs = X_input, outputs = X, name='HappyModel')

    ### END CODE HERE ###

    return model

PART 4

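ResNet.

This is a very interesting idea; technically it should be called a highway or shortcut connection. It passes the current layer's data forward across several layers, which produces a residual effect.
It was only from Andrew Ng's assignment that I learned there are two kinds of shortcut. In the first, the input is passed through directly, so the output shape of the layers in between must not change; the network then looks like a snake that has swallowed something, or a bottle that narrows inward. In the second, the skip path goes through a convolution layer that changes the shape to match what comes later.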
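The two shortcut styles can be sketched in a few lines of NumPy. To keep it short I use dense (matrix-multiply) layers as a stand-in for the convolution layers, which is a simplification of my own; the function names and sizes are hypothetical too.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def main_path(x, W_a, W_b):
    """Two stacked layers; dense layers stand in for the conv layers."""
    return relu(x @ W_a) @ W_b

def identity_block(x, W_a, W_b):
    """Shortcut adds x unchanged, so the main path must keep x's shape."""
    return relu(main_path(x, W_a, W_b) + x)

def projection_block(x, W_a, W_b, W_s):
    """Shortcut goes through its own layer W_s to match the new shape."""
    return relu(main_path(x, W_a, W_b) + x @ W_s)

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 16))

# Widens to 32 in the middle, then returns to 16 so that x can be added.
same = identity_block(x, rng.standard_normal((16, 32)),
                      rng.standard_normal((32, 16)))

# Output width changes to 64, so the shortcut needs W_s: 16 -> 64.
wider = projection_block(x, rng.standard_normal((16, 32)),
                         rng.standard_normal((32, 64)),
                         rng.standard_normal((16, 64)))
print(same.shape, wider.shape)
```

If the main path's weights are all zero, the identity block reduces to relu(x), which is one way to see why such blocks are easy to train: doing nothing is always within reach.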

residual

Course assignment list

jasperyang
