Different optimizers produce different results; the main difference lies in how fast they converge.
# Gradient descent: on each training step the optimizer nudges the
# parameters a little further in the direction that minimizes the loss.
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)

# Alternative optimizer: Adam converges noticeably faster here.
# 1e-3 (i.e. 0.001) is the learning rate; minimize(loss) must still be
# called to obtain a training op.
# train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)
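For context, here is a minimal self-contained sketch of the kind of script these lines belong to. The model, the quadratic loss, and the batch settings are not shown in the original and are assumptions here: a single-layer softmax MNIST classifier, which is typical for this tutorial series.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load MNIST with one-hot labels to match the softmax output below.
mnist = input_data.read_data_sets("MNIST_data", one_hot=True)
batch_size = 100
n_batch = mnist.train.num_examples // batch_size

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])

# Single-layer softmax model (an assumption, not from the original).
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
prediction = tf.nn.softmax(tf.matmul(x, W) + b)

# Quadratic (MSE) loss, assumed here.
loss = tf.reduce_mean(tf.square(y - prediction))

# Swap this one line to compare the two optimizers:
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)
# train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(prediction, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(10):
        for _ in range(n_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys})
        acc = sess.run(accuracy, feed_dict={x: mnist.test.images,
                                            y: mnist.test.labels})
        print("Iter " + str(epoch) + ", Testing Accuracy " + str(acc))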
Training with each of the two optimizers in turn gives the following convergence results:
GradientDescentOptimizer (learning rate 0.2):
Iter 0, Testing Accuracy 0.6166
Iter 1, Testing Accuracy 0.7754
Iter 2, Testing Accuracy 0.7965
Iter 3, Testing Accuracy 0.8053
Iter 4, Testing Accuracy 0.8119
Iter 5, Testing Accuracy 0.816
Iter 6, Testing Accuracy 0.8193
Iter 7, Testing Accuracy 0.8216
Iter 8, Testing Accuracy 0.8235
Iter 9, Testing Accuracy 0.825
AdamOptimizer (learning rate 1e-3):
Iter 0, Testing Accuracy 0.7521
Iter 1, Testing Accuracy 0.8384
Iter 2, Testing Accuracy 0.8709
Iter 3, Testing Accuracy 0.8852
Iter 4, Testing Accuracy 0.8927
Iter 5, Testing Accuracy 0.8982
Iter 6, Testing Accuracy 0.9015
Iter 7, Testing Accuracy 0.9035
Iter 8, Testing Accuracy 0.9069
Iter 9, Testing Accuracy 0.9088
As these numbers show, for this model AdamOptimizer converges faster than plain gradient descent: it reaches about 91% test accuracy within ten epochs, while gradient descent only reaches about 82.5%.
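One intuition for the speed difference: Adam keeps per-parameter moving averages of the gradient and of its square, which gives each weight its own adaptive step size instead of the single global rate used by plain gradient descent. Below is a minimal numpy sketch of the standard Adam update rule (Kingma & Ba, 2014) for intuition only; TensorFlow's AdamOptimizer implements a mathematically equivalent update internally.

import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3,
              beta1=0.9, beta2=0.999, eps=1e-8):
    # One Adam update. t is the 1-based step counter; m and v are
    # carried over between calls (initialized to zeros).
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    # Parameters with consistently large gradients get smaller steps,
    # rarely-updated ones get relatively larger steps.
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v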