weekly-report-20201030

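Weekly Study Report

  1. This week I mainly studied Andrew Ng's deep learning course, which covers the basic principles and concepts of deep learning. The outline below lists the material I went through this week.
  2. I also did an MNIST handwritten digit recognition exercise in TensorFlow; the code and network parameters are given later in this report.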

DeepLearning

Andrew Ng's Deep Learning Course

Course 1: Neural Networks and Deep Learning

Week 1: Introduction to Deep Learning

1.1 Welcome

1.2 What is a Neural Network?

1.3 Supervised Learning with Neural Networks

1.4 Why is Deep Learning taking off?

Week 2: Basics of Neural Network Programming

2.1 Binary Classification

2.2 Logistic Regression

2.3 Logistic Regression Cost Function

2.4 Gradient Descent

2.5 Derivatives

2.6 More Derivative Examples

2.7 Computation Graph

2.8 Derivatives with a Computation Graph

2.9 Logistic Regression Gradient Descent

2.10 Gradient Descent on m Examples

2.11 Vectorization

2.12 More Examples of Vectorization

2.13 Vectorizing Logistic Regression

2.14 Vectorizing Logistic Regression's Gradient

2.15 Broadcasting in Python

2.16 A note on Python/NumPy vectors

2.17 Quick tour of Jupyter/iPython Notebooks

2.18 Explanation of logistic regression cost function

Week 3: Shallow Neural Networks

3.1 Neural Network Overview

3.2 Neural Network Representation

3.3 Computing a Neural Network's Output

3.4 Vectorizing across multiple examples

3.5 Justification for vectorized implementation

3.6 Activation functions

3.7 Why do you need a nonlinear activation function?

3.8 Derivatives of activation functions

3.9 Gradient descent for neural networks

3.10 (Optional) Backpropagation intuition

3.11 Random Initialization

Week 4: Deep Neural Networks

4.1 Deep L-layer neural network

4.2 Forward and backward propagation

4.3 Forward propagation in a Deep Network

4.4 Getting your matrix dimensions right

4.5 Why deep representations?

4.6 Building blocks of deep neural networks

4.7 Parameters vs Hyperparameters

4.8 What does this have to do with the brain?

Course 2: Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization

Week 1: Practical aspects of Deep Learning

1.1 Train / Dev / Test sets

1.2 Bias / Variance

1.3 Basic Recipe for Machine Learning

1.4 Regularization

1.5 Why regularization reduces overfitting?

1.6 Dropout Regularization

1.7 Understanding Dropout

1.8 Other regularization methods

1.9 Normalizing inputs

1.10 Vanishing / Exploding gradients

1.11 Weight Initialization for Deep Networks

1.12 Numerical approximation of gradients

1.13 Gradient checking

1.14 Gradient Checking Implementation Notes

Week 2: Optimization algorithms

2.1 Mini-batch gradient descent

2.2 Understanding Mini-batch gradient descent

2.3 Exponentially weighted averages

2.4 Understanding Exponentially weighted averages

2.5 Bias correction in exponentially weighted averages

2.6 Gradient descent with momentum

2.7 RMSprop (root mean square prop)

2.8 Adam optimization algorithm

2.9 Learning rate decay

2.10 The problem of local optima

Week 3: Hyperparameter tuning, Batch Normalization and Programming Frameworks

3.1 Tuning process

3.2 Using an appropriate scale to pick hyperparameters

3.3 Hyperparameters tuning in practice: Pandas vs. Caviar

3.4 Normalizing activations in a network

3.5 Fitting Batch Norm into a neural network

3.6 Why does Batch Norm work?

3.7 Batch Norm at test time

3.8 Softmax Regression

3.9 Training a softmax classifier

3.10 Deep learning frameworks

3.11 TensorFlow

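TensorFlow in Practice

Nonlinear regression model

MNIST handwritten digit recognition

Network architecture

Number of hidden layers: 3

Hidden layer sizes: 200 200 200 (each of the 3 hidden layers has 200 neurons)

Dropout: 0.7 0.7 0.7 (the dropout parameter keep_prob is 0.7 for all 3 hidden layers)

Hidden layer activations: tanh tanh tanh (all hidden layers use tanh)

Output layer size: 10

Output layer activation: softmax

Cost function: cross-entropy

Gradient descent method: mini-batch gradient descent

batch_size: 64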
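
As a rough sanity check on the architecture listed above, the number of trainable parameters can be worked out from the layer sizes alone. The sketch below is illustrative only: the helper name `count_parameters` is hypothetical, and it assumes plain fully connected layers with one bias per output unit, which matches the sizes given in this report.

```python
# Layer widths from the report: 784 inputs, three hidden layers of 200, 10 outputs.
layer_sizes = [784, 200, 200, 200, 10]


def count_parameters(sizes):
    """Return per-layer (weights + biases) counts for a fully connected network."""
    counts = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        # fan_in * fan_out weights plus one bias per output unit
        counts.append(fan_in * fan_out + fan_out)
    return counts


per_layer = count_parameters(layer_sizes)
total = sum(per_layer)
print(per_layer)  # [157000, 40200, 40200, 2010]
print(total)      # 239410
```

Most of the parameters sit in the first layer, because the 784-pixel input is much wider than any hidden layer.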

Result: after 99 training epochs, the recognition accuracy is 0.9885 on the training set and 0.9775 on the test set.

Code

# Note: this script targets the TensorFlow 1.x API (tf.placeholder, tf.Session).
import tensorflow as tf
import numpy as np

from tensorflow.examples.tutorials.mnist import input_data

# Load the dataset
mnist = input_data.read_data_sets("MNIST_data", one_hot=True)

# Size of each mini-batch
batch_size = 64
# Number of mini-batches per epoch
n_batch = mnist.train.num_examples // batch_size


# Placeholders: x is the input, y is the label
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])

# Placeholder for dropout: the fraction of neurons kept active
keep_prob = tf.placeholder(tf.float32)

# Network parameters: number of neurons in each hidden layer
L1_size = 200
L2_size = 200
L3_size = 200

# Define the network
# Hidden layer 1: tanh activation followed by dropout
W1 = tf.Variable(tf.truncated_normal([784, L1_size], stddev=0.1))
b1 = tf.Variable(tf.zeros([1, L1_size]) + 0.01)
z1 = tf.matmul(x, W1) + b1
a1 = tf.nn.tanh(z1)
L1_drop = tf.nn.dropout(a1, keep_prob)

# Hidden layer 2
W2 = tf.Variable(tf.truncated_normal([L1_size, L2_size], stddev=0.1))
b2 = tf.Variable(tf.zeros([1, L2_size]) + 0.01)
z2 = tf.matmul(L1_drop, W2) + b2
a2 = tf.nn.tanh(z2)
L2_drop = tf.nn.dropout(a2, keep_prob)

# Hidden layer 3
W3 = tf.Variable(tf.truncated_normal([L2_size, L3_size], stddev=0.1))
b3 = tf.Variable(tf.zeros([1, L3_size]) + 0.01)
z3 = tf.matmul(L2_drop, W3) + b3
a3 = tf.nn.tanh(z3)
L3_drop = tf.nn.dropout(a3, keep_prob)

# Output layer: 10 classes with softmax
Wout = tf.Variable(tf.truncated_normal([L3_size, 10], stddev=0.1))
bout = tf.Variable(tf.zeros([1, 10]) + 0.01)
zout = tf.matmul(L3_drop, Wout) + bout
prediction = tf.nn.softmax(zout)

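# Cost function: softmax_cross_entropy_with_logits applies softmax internally,
# so it takes the raw scores zout rather than the softmax output
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=zout))

# Gradient descent optimizer
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)

# Variable initialization
init = tf.global_variables_initializer()

# Accuracy: fraction of samples whose predicted class matches the label
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(prediction, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))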

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(100):
        # One pass over the training set with dropout enabled
        for batch in range(n_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys, keep_prob: 0.7})

        # Evaluate with dropout disabled (keep_prob = 1.0)
        test_acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels, keep_prob: 1.0})
        train_acc = sess.run(accuracy, feed_dict={x: mnist.train.images, y: mnist.train.labels, keep_prob: 1.0})
        print("Iter" + str(epoch) + ",Testing Accuracy:" + str(test_acc) + " | Training Accuracy:" + str(train_acc))
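
A note on the cost function used in the script: `tf.nn.softmax_cross_entropy_with_logits` applies softmax internally, so its `logits` argument is meant to receive pre-softmax scores such as `zout`, not probabilities such as `prediction`. The pure-Python sketch below illustrates what goes wrong when softmax is applied twice. The helper names and the three-class example values are hypothetical and chosen only for illustration; this is not the TensorFlow implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    shift = max(scores)
    exps = [math.exp(v - shift) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_logits(logits, one_hot):
    """Cross-entropy that applies softmax internally, like the *_with_logits ops."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


logits = [4.0, 1.0, 0.0]  # hypothetical pre-softmax scores for 3 classes
label = [1.0, 0.0, 0.0]   # one-hot label: class 0 is the correct class

# Intended usage: pass the raw scores.
loss_correct = cross_entropy_from_logits(logits, label)
# Passing probabilities instead applies softmax a second time.
loss_double = cross_entropy_from_logits(softmax(logits), label)

print(loss_correct)
print(loss_double)
```

With the doubled softmax the inputs are squeezed into the range 0 to 1, so the loss stays well above zero even for a confident, correct prediction, and the gradients are distorted accordingly.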

Paper

This week's paper-reading task was not completed.

Verilog

The task of simulating common Verilog examples was not completed.