DNC

Posted 2019-05-21 | Updated 2019-05-22 | Category: Machine Learning

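Abstract

Introduction

Modern computers separate memory from computation: a processor performs the computation and reads and writes data in an addressable memory. This has two benefits. New information can be written to extensible storage, and the contents of memory can be treated as variables. Variables matter for the generality of an algorithm: to handle different data, the algorithm does not need to change the addresses it operates on, only the values of its variables. In a neural network, by contrast, computation and memory are coupled through the network's weights and neuron activity. When more memory is needed, the network cannot dynamically add new storage, and what it stores cannot be learned independently of the network's parameters.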

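In this paper the authors propose the differentiable neural computer (DNC), a network with a readable and writable external memory, to address the inability of neural networks to represent variables and data structures. The whole system is differentiable; the DNC's memory can be thought of as RAM and the network as the CPU.
The DNC has a memory matrix $M$ of size $N\times W$. It uses differentiable attention mechanisms to define distributions over the $N$ memory locations. These distributions are called weightings, and each one gives the degree to which a location takes part in the corresponding operation. The DNC provides three operations (lookup, read, and write), each with its own form of attention, and carries them out on the memory through three heads: a read head, a write head, and a lookup head.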
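To make the idea of a weighting over memory locations concrete, here is a minimal pure-Python sketch of a content-based read. It assumes the cosine-similarity addressing described in the DNC paper; the function names, the sharpness parameter `beta`, and the tiny memory matrix are illustrative choices, not taken from this post.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)

def content_weighting(memory, key, beta=1.0):
    """Softmax over the N rows of an N x W memory, scored by similarity to `key`."""
    scores = [beta * cosine(row, key) for row in memory]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read(memory, weighting):
    """Read vector: the sum of memory rows, each scaled by its weight."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weighting, memory)) for j in range(width)]

# Hypothetical 4 x 3 memory (N = 4 locations, W = 3 values per location).
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
weighting = content_weighting(memory, key=[1.0, 0.0, 0.0], beta=10.0)
read_vector = read(memory, weighting)
```

Because every step is a smooth function of the key and the memory, gradients can flow through the read, which is what makes this kind of addressing trainable.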

References

1.https://www.gwern.net/docs/rl/2016-graves.pdf
2.http://people.idsia.ch/~rupesh/rnnsymposium2016/slides/graves.pdf
3.https://deepmind.com/blog/differentiable-neural-computers/

tensorflow contrib vs layers vs nn

Posted 2019-05-18 | Updated 2019-07-25 | Category: tensorflow

tf.contrib

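According to the official TensorFlow documentation, the tf.contrib module contains volatile or experimental code:

contrib module containing volatile or experimental code.

Once a component there matures, it is moved out of contrib. To stay compatible with earlier versions, these modules may therefore contain different implementations of the same function.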

tf.nn, tf.layers, and tf.contrib

tf.nn contains low-level ops.
tf.layers contains high-level ops.
tf.contrib contains unofficial implementations that may be deprecated in later versions.

tf.nn.conv2d vs tf.layers.conv2d

API

tf.layers.conv2d

tf.layers.conv2d(
    inputs, 
    filters, 
    kernel_size, 
    strides=(1, 1), 
    padding='valid', 
    data_format='channels_last', 
    dilation_rate=(1, 1), 
    activation=None, 
    use_bias=True, 
    kernel_initializer=None, 
    bias_initializer=tf.zeros_initializer(), 
    kernel_regularizer=None, 
    bias_regularizer=None, 
    activity_regularizer=None, 
    trainable=True, 
    name=None, 
    reuse=None
)

tf.nn.conv2d

tf.nn.conv2d(
    input, 
    filter, 
    strides, 
    padding, 
    use_cudnn_on_gpu=None, 
    data_format=None, 
    name=None
)

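nn.conv2d vs layers.conv2d

tf.nn.conv2d requires you to create the filter tensor yourself and pass it in, with shape [kernel_height, kernel_width, in_channels, num_filters].
tf.layers.conv2d only needs the number of filters and the kernel size; it creates the filter variable internally.

For tf.nn.conv2d,
filter: a 4-D tensor of the same type as input, with shape [filter_height, filter_width, in_channels, out_channels].
For tf.layers.conv2d,
filters: an integer, the number of filters to use.

A common suggestion is to use tf.nn.conv2d when loading a pretrained model (the weights already exist) and tf.layers.conv2d when training a model from scratch.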
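As a small illustration of the difference, the helper below derives the 4-D filter shape that tf.nn.conv2d expects from the arguments one would pass to tf.layers.conv2d. It is a plain-Python sketch that assumes an NHWC input; `conv2d_filter_shape` is a hypothetical name, not a TensorFlow function.

```python
def conv2d_filter_shape(input_shape, kernel_size, num_filters):
    """Return [kernel_height, kernel_width, in_channels, num_filters] for an NHWC input."""
    if isinstance(kernel_size, int):
        kernel_height, kernel_width = kernel_size, kernel_size
    else:
        kernel_height, kernel_width = kernel_size
    in_channels = input_shape[-1]  # channels are the last axis in NHWC
    return [kernel_height, kernel_width, in_channels, num_filters]

# 32 filters of size 5x5 over a 3-channel image batch.
shape = conv2d_filter_shape([None, 28, 28, 3], kernel_size=5, num_filters=32)
```

With tf.nn.conv2d you would create a variable of this shape yourself; tf.layers.conv2d does the equivalent bookkeeping internally.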

Usage

tf.layers.conv2d

# Convolution Layer with 32 filters and a kernel size of 5
conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu) 
# Max Pooling (down-sampling) with strides of 2 and kernel size of 2
conv1 = tf.layers.max_pooling2d(conv1, 2, 2)

tf.nn.conv2d

strides = 1
# Weights matrix looks like: [kernel_size(=5), kernel_size(=5), input_channels (=3), filters (= 32)]
# Similarly bias = looks like [filters (=32)]
out = tf.nn.conv2d(input, weights, padding="SAME", strides = [1, strides, strides, 1])
out = tf.nn.bias_add(out, bias)
out = tf.nn.relu(out)

References

1.https://www.tensorflow.org/api_docs/python/tf/contrib
2.https://stackoverflow.com/questions/48001759/what-is-right-batch-normalization-function-in-tensorflow
3.https://stackoverflow.com/a/48003210
4.https://stackoverflow.com/questions/42785026/tf-nn-conv2d-vs-tf-layers-conv2d
5.https://stackoverflow.com/a/53683545
6.https://stackoverflow.com/a/45308609

tensorflow rnn

Posted 2019-05-18 | Updated 2019-05-19 | Category: tensorflow

Common cells and functions

tf.nn.rnn_cell.BasicRNNCell: the most basic RNN cell.
tf.nn.rnn_cell.LSTMCell: LSTM cell.
tf.nn.rnn_cell.LSTMStateTuple: the (c, h) tuple that LSTM cells use for their state.
tf.nn.rnn_cell.MultiRNNCell: a multi-layer cell.
tf.nn.rnn_cell.DropoutWrapper: adds dropout to a cell.
tf.nn.dynamic_rnn: dynamic RNN.
tf.nn.static_rnn: static RNN.

BasicRNNCell

API

__init__(
    num_units,
    activation=None,
    reuse=None,
    name=None,
    dtype=None,
    **kwargs
)

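Example

# Excerpt from a function body. Assumes `from tensorflow.contrib import rnn` (TF 1.x);
# x is a list of [batch_size, input_size] tensors, one per time step.
myrnn = rnn.BasicRNNCell(rnn_size, activation=tf.nn.relu)
zero_state = myrnn.zero_state(batch_size, dtype=tf.float32)
outputs, states = rnn.static_rnn(myrnn, x, initial_state=zero_state, dtype=tf.float32)
return outputs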

Notes

Deprecated in TF 2.0; the equivalent is tf.keras.layers.SimpleRNNCell.

LSTMCell

API

__init__(
    num_units, # size of the hidden layer
    use_peepholes=False,
    cell_clip=None,
    initializer=None, # initializer for the weights
    num_proj=None,
    proj_clip=None,
    num_unit_shards=None,
    num_proj_shards=None,
    forget_bias=1.0,
    state_is_tuple=True, # state is a tuple of (c_state, m_state)
    activation=None,
    reuse=None,
    name=None,
    dtype=None,
    **kwargs
)

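Example

# Excerpt from a function body. Assumes `from tensorflow.contrib import rnn` (TF 1.x).
# Note that this excerpt builds a BasicLSTMCell, the simpler predecessor of LSTMCell.
lstm = rnn.BasicLSTMCell(lstm_size, forget_bias=1, state_is_tuple=True)
zero_state = lstm.zero_state(batch_size, dtype=tf.float32)
outputs, states = rnn.static_rnn(lstm, x, initial_state=zero_state, dtype=tf.float32)
return outputs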

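Notes

Deprecated in TF 2.0; the equivalent is tf.keras.layers.LSTMCell.

LSTMStateTuple

This is not a cell. It is the named tuple (c, h) in which an LSTM cell stores its state when state_is_tuple=True.

Notes

Deprecated in TF 2.0 together with the tf.nn.rnn_cell LSTM cells; their replacement is tf.keras.layers.LSTMCell.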

MultiRNNCell

This class implements a multi-layer RNN.

API

__init__(
    cells,
    state_is_tuple=True
)

Example

Code 1

num_units = [128, 64]
cells = [BasicLSTMCell(num_units=n) for n in num_units]
stacked_rnn_cell = MultiRNNCell(cells)
outputs, state = tf.nn.dynamic_rnn(cell=stacked_rnn_cell,
                                   inputs=data,
                                   dtype=tf.float32)

Code 2

# Excerpt from a function body. Assumes `from tensorflow.contrib import rnn` (TF 1.x).
# Build one cell object per layer; repeating a single object ([cell] * layers)
# would make every layer reuse the same cell.
cells = [rnn.BasicLSTMCell(lstm_size, forget_bias=1, state_is_tuple=True)
         for _ in range(layers)]
cell = rnn.MultiRNNCell(cells, state_is_tuple=True)
state = cell.zero_state(batch_size, dtype=tf.float32)
outputs = []
with tf.variable_scope("Multi_Layer_RNN", reuse=reuse):
    for time_step in range(time_steps):
        if time_step > 0:
            tf.get_variable_scope().reuse_variables()
        cell_outputs, state = cell(x[time_step], state)
        outputs.append(cell_outputs)
return outputs

Notes

Deprecated in TF 2.0; the equivalent is tf.keras.layers.StackedRNNCells.

DropoutWrapper

API

__init__(
    cell, # the RNNCell to wrap
    input_keep_prob=1.0,
    output_keep_prob=1.0,
    state_keep_prob=1.0,
    variational_recurrent=False,
    input_size=None,
    dtype=None,
    seed=None,
    dropout_state_filter_visitor=None
)

Example

# Excerpt from a function body. Assumes `from tensorflow.contrib import rnn` (TF 1.x).
# Each layer gets its own cell object, wrapped with dropout on its outputs.
cells = [rnn.DropoutWrapper(rnn.BasicLSTMCell(lstm_size, forget_bias=1, state_is_tuple=True),
                            output_keep_prob=0.9)
         for _ in range(layers)]
cell = rnn.MultiRNNCell(cells, state_is_tuple=True)
state = cell.zero_state(batch_size, dtype=tf.float32)
outputs = []
with tf.variable_scope("Multi_Layer_RNN"):
    for time_step in range(time_steps):
        if time_step > 0:
            tf.get_variable_scope().reuse_variables()
        cell_outputs, state = cell(x[time_step], state)
        outputs.append(cell_outputs)
return outputs

Notes

static_rnn

API

tf.nn.static_rnn(
    cell, # an RNNCell instance
    inputs, # a length-T list of input tensors, each of shape [batch_size, input_size]
    initial_state=None, # initial state of the RNN; if cell.state_size is an integer, its shape must be [batch_size, cell.state_size]; if cell.state_size is a tuple, it must be a tuple of tensors with shapes [batch_size, s] for s in cell.state_size
    dtype=None,
    sequence_length=None,
    scope=None
)
# In its simplest form (cell and inputs given, every other argument left at its default), this API does the equivalent of:
#    state = cell.zero_state(...)
#    outputs = []
#    for input_ in inputs:
#      output, state = cell(input_, state)
#      outputs.append(output)
#    return (outputs, state)
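The unrolled loop above can be tried without TensorFlow. The following is a toy sketch, assuming a scalar cell with made-up weights; `make_basic_cell` and this `static_rnn` are stand-ins written for illustration, not the TensorFlow implementations.

```python
import math

def make_basic_cell(w_x, w_h):
    """A toy scalar RNN cell: new_state = tanh(w_x * x + w_h * state)."""
    def cell(x, state):
        new_state = math.tanh(w_x * x + w_h * state)
        return new_state, new_state  # (output, state); identical for a basic RNN cell
    return cell

def static_rnn(cell, inputs, initial_state=0.0):
    """Unroll `cell` over a Python list of inputs, one step per list element."""
    state = initial_state
    outputs = []
    for x in inputs:
        output, state = cell(x, state)
        outputs.append(output)
    return outputs, state

outputs, final_state = static_rnn(make_basic_cell(0.5, 0.9), [1.0, 0.0, -1.0])
```

The length of the Python list fixes the number of steps, which is why the static variant needs the sequence length to be known when the graph is built.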

Example

myrnn = tf.nn.rnn_cell.BasicRNNCell(rnn_size, activation=tf.nn.relu)
zero_state = myrnn.zero_state(batch_size, dtype=tf.float32)
outputs, states = tf.nn.static_rnn(myrnn, x, initial_state=zero_state, dtype=tf.float32)

dynamic_rnn

API

tf.nn.dynamic_rnn(
    cell, # an RNNCell instance
    inputs, # the RNN inputs: shape [batch_size, max_time, ...] if time_major=False, or [max_time, batch_size, ...] if time_major=True
    sequence_length=None,
    initial_state=None, # initial state of the RNN; if cell.state_size is an integer, its shape must be [batch_size, cell.state_size]; if cell.state_size is a tuple, it must be a tuple of tensors with shapes [batch_size, s] for s in cell.state_size
    dtype=None,
    parallel_iterations=None,
    swap_memory=False,
    time_major=False, # selects the layout of inputs and outputs: batch-major when False, time-major when True
    scope=None
)

Example

# Example 1. Create a BasicRNNCell.
rnn_cell = tf.nn.rnn_cell.BasicRNNCell(hidden_size)

# Define the initial state.
initial_state = rnn_cell.zero_state(batch_size, dtype=tf.float32)

# 'outputs' shape [batch_size, max_time, cell_state_size]
# 'state' shape [batch_size, cell_state_size]
outputs, state = tf.nn.dynamic_rnn(rnn_cell, input_data,
                                   initial_state=initial_state,
                                   dtype=tf.float32)

# Example 2. Create two LSTMCells.
rnn_layers = [tf.nn.rnn_cell.LSTMCell(size) for size in [128, 256]]

# Create a multi-layer RNN cell.
multi_rnn_cell = tf.nn.rnn_cell.MultiRNNCell(rnn_layers)

# 'outputs' is a tensor of shape [batch_size, max_time, 256]
# 'state' is a N-tuple where N is the number of LSTMCells containing a
# tf.contrib.rnn.LSTMStateTuple for each cell
outputs, state = tf.nn.dynamic_rnn(cell=multi_rnn_cell,
                                   inputs=data,
                                   dtype=tf.float32)

static_rnn vs dynamic_rnn

tf.keras.layers.RNN(cell)

In TensorFlow 2.0 both of the APIs above are deprecated in favor of tf.keras.layers.RNN(cell).

tf.nn.rnn_cell

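This module provides a number of RNN cell classes and RNN functions.

Classes

class BasicRNNCell: the most basic RNN cell.
class BasicLSTMCell: deprecated; use tf.nn.rnn_cell.LSTMCell (listed next) instead.
class LSTMCell: LSTM cell.
class LSTMStateTuple: the (c, h) tuple that LSTM cells use for their state.
class GRUCell: GRU cell (see http://arxiv.org/abs/1406.1078).
class RNNCell: abstract object representing an RNN cell.
class MultiRNNCell: an RNN cell composed sequentially of multiple simple cells.
class DeviceWrapper: operator that ensures an RNNCell runs on a particular device.
class DropoutWrapper: operator that adds dropout to the inputs and outputs of the given cell.
class ResidualWrapper: RNNCell wrapper that ensures cell inputs are added to the outputs.

Functions

static_rnn(…) # to be deprecated; the same as tf.contrib.rnn.static_rnn.
dynamic_rnn(…) # to be deprecated
static_bidirectional_rnn(…) # to be deprecated
bidirectional_dynamic_rnn(…) # to be deprecated
raw_rnn(…)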

tf.contrib.rnn

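This module provides classes and function ops for RNNs and attention RNNs.

Classes

class RNNCell: # abstract class that every cell inherits from; every wrapper inherits from it directly.
class LayerRNNCell: # subclass of RNNCell that the cells listed below inherit from, so they all inherit from RNNCell indirectly.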
class BasicRNNCell:
class BasicLSTMCell: # to be deprecated; use LSTMCell below.
class LSTMCell:
class LSTMStateTuple:
class GRUCell:
class MultiRNNCell:
class ConvLSTMCell:
class GLSTMCell:
class Conv1DLSTMCell:
class Conv2DLSTMCell:
class Conv3DLSTMCell:
class BidirectionalGridLSTMCell:
class AttentionCellWrapper:
class CompiledWrapper:
class CoupledInputForgetGateLSTMCell:
class DeviceWrapper:
class DropoutWrapper:
class EmbeddingWrapper:
class FusedRNNCell:
class FusedRNNCellAdaptor:
class GRUBlockCell:
class GRUBlockCellV2:
class GridLSTMCell:
class HighwayWrapper:
class IndRNNCell:
class IndyGRUCell:
class IndyLSTMCell:
class InputProjectionWrapper:
class IntersectionRNNCell:
class LSTMBlockCell:
class LSTMBlockFusedCell:
class LSTMBlockWrapper:
class LayerNormBasicLSTMCell:
class NASCell:
class OutputProjectionWrapper:
class PhasedLSTMCell:
class ResidualWrapper:
class SRUCell:
class TimeFreqLSTMCell:
class TimeReversedFusedRNN:
class UGRNNCell:

Functions

static_rnn(…) # to be deprecated; the same as tf.nn.static_rnn
static_bidirectional_rnn(…) # to be deprecated
best_effort_input_batch_size(…)
stack_bidirectional_dynamic_rnn(…)
stack_bidirectional_rnn(…)
static_state_saving_rnn(…)
transpose_batch_time(…)

tf.contrib.rnn vs tf.nn.rnn_cell

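Both modules define many RNN cells. contrib holds the experimental code, while nn.rnn_cell holds code from contrib that has been tested.
Code in contrib changes often, whereas code in nn is comparatively stable.
contrib offers more cell types than nn.
The two overlap: almost every cell in nn also exists in contrib.

References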

tensorflow layers

Posted 2019-05-18 | Updated 2019-05-19 | Category: tensorflow

tf.layers

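This module is closely related to tf.contrib.layers. It mainly provides ops for building neural networks, regularization, summaries, and so on. It contains 1 module, 19 classes, and a set of functions.

Modules

experimental module

Public API for tf.layers.experimental.

Classes

class Conv2D

2-D convolution layer class.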

API

__init__(
    filters, # number of convolution kernels
    kernel_size, # size of the convolution kernel
    strides=(1, 1),
    padding='valid',
    data_format='channels_last', # string, "channels_last" or "channels_first"
    dilation_rate=(1, 1),
    activation=None, # activation function
    use_bias=True,
    kernel_initializer=None, # initializer for the kernel
    bias_initializer=tf.zeros_initializer(), # initializer for the bias
    kernel_regularizer=None, # regularizer for the kernel
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    trainable=True, # if True, the variables are added to the TRAINABLE_VARIABLES collection
    name=None,
    **kwargs
)

Example

Notes

All classes

class AveragePooling1D
class AveragePooling2D
class AveragePooling3D
class BatchNormalization
class Conv1D
class Conv2D
class Conv2DTranspose
class Conv3D
class Conv3DTranspose
class Dense
class Dropout
class Flatten
class InputSpec
class Layer
class MaxPooling1D
class MaxPooling2D
class MaxPooling3D
class SeparableConv1D
class SeparableConv2D

Functions

conv2d

API

tf.layers.conv2d(
    inputs, # input tensor
    filters, # an integer: the dimensionality of the output, i.e. the number of convolution kernels
    kernel_size,
    strides=(1, 1),
    padding='valid',
    data_format='channels_last',
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=None,
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    trainable=True,
    name=None,
    reuse=None
)

Example

Notes

All functions

Note that all of the functions below will be deprecated in later versions.

average_pooling1d(…)
average_pooling2d(…)
average_pooling3d(…)
batch_normalization(…)
conv1d(…)
conv2d(…)
conv2d_transpose(…)
conv3d(…)
conv3d_transpose(…)
dense(…)
dropout(…)
flatten(…)
max_pooling1d(…)
max_pooling2d(…)
max_pooling3d(…)
separable_conv1d(…)
separable_conv2d(…)

tf.layers.conv2d vs tf.layers.Conv2D

tf.layers.Conv2D.__init__(
    filters,
    kernel_size,
    strides=(1, 1),
    padding='valid',
    data_format='channels_last',
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=None,
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    trainable=True,
    name=None,
    **kwargs
)
tf.layers.conv2d(
    inputs,
    filters,
    kernel_size,
    strides=(1, 1),
    padding='valid',
    data_format='channels_last',
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=None,
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    trainable=True,
    name=None,
    reuse=None
)

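conv2d is a function; Conv2D is a class.
conv2d takes both the kernel parameters and the input when it is called. Conv2D takes the kernel parameters when it is constructed; once instantiated, the same object can be applied to different inputs to produce different outputs.
Calling conv2d is equivalent to constructing a Conv2D object and calling its apply(inputs) method.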
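The relationship between the functional form and the class form can be sketched with a toy layer. `Scale` and `scale` below are hypothetical stand-ins for Conv2D and conv2d, chosen so the example runs in plain Python; they are not TensorFlow code.

```python
class Scale:
    """Toy layer: stores its parameter at construction and can be applied repeatedly."""

    def __init__(self, factor):
        self.factor = factor

    def apply(self, inputs):
        return [self.factor * x for x in inputs]

def scale(inputs, factor):
    """Functional form: build a layer object, then apply it once."""
    return Scale(factor).apply(inputs)

layer = Scale(2.0)               # parameters are fixed here
a = layer.apply([1.0, 2.0])      # the same layer object ...
b = layer.apply([3.0])           # ... applied to a different input
c = scale([1.0, 2.0], 2.0)       # one-shot functional call
```

The class form is convenient when the same parameters must be reused on several inputs, while the functional form is shorter for a single use.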

References

1.https://www.tensorflow.org/api_docs/python/tf/layers
2.https://www.tensorflow.org/api_docs/python/tf/layers/Conv2D
3.https://www.tensorflow.org/api_docs/python/tf/layers/conv2d
4.https://stackoverflow.com/questions/52011509/what-is-difference-between-tf-layers-conv2d-and-tf-layers-conv2d/52035621

tensorflow nn module

Posted 2019-05-18 | Updated 2019-05-19 | Category: tensorflow

tf.nn

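Provides neural network ops. It contains the rnn_cell module for building RNN cells, plus a number of functions.

tf.nn.rnn_cell

rnn_cell is used to build RNN cells.
It includes the following classes:

class BasicLSTMCell: deprecated; use tf.nn.rnn_cell.LSTMCell instead.
class BasicRNNCell: the most basic RNN cell.
class DeviceWrapper: operator that ensures an RNNCell runs on a particular device.
class DropoutWrapper: operator that adds dropout to the inputs and outputs of the given cell.
class GRUCell: GRU cell (see http://arxiv.org/abs/1406.1078).
class LSTMCell: LSTM cell.
class LSTMStateTuple: the (c, h) tuple that LSTM cells use for their state.
class MultiRNNCell: an RNN cell composed sequentially of multiple simple cells.
class RNNCell: abstract object representing an RNN cell.
class ResidualWrapper: RNNCell wrapper that ensures cell inputs are added to the outputs.

Functions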

conv2d(…)

Computes a 2-D convolution given a 4-D input and a 4-D filter.

API

tf.nn.conv2d(
    input, # input, [batch, in_height, in_width, in_channels]
    filter, # 4-D tensor, [filter_height, filter_width, in_channels, out_channels]
    strides, # 1-D tensor of length 4
    padding, # string, either "SAME" or "VALID"
    use_cudnn_on_gpu=True,
    data_format='NHWC',
    dilations=[1, 1, 1, 1],
    name=None
)

Example

def conv2d(inputs, output_dim, kernel_size, stride, initializer, activation_fn,
           padding='VALID', data_format='NHWC', name="conv2d", reuse=False):
    kernel_shape = None
    with tf.variable_scope(name, reuse=reuse):
        if data_format == 'NCHW':
            stride = [1, 1, stride[0], stride[1]]
            kernel_shape = [kernel_size[0], kernel_size[1], inputs.get_shape()[1], output_dim]
        elif data_format == 'NHWC':
            stride = [1, stride[0], stride[1], 1]
            kernel_shape = [kernel_size[0], kernel_size[1], inputs.get_shape()[-1], output_dim ]

        w = tf.get_variable('w', kernel_shape, tf.float32, initializer=initializer)
        conv = tf.nn.conv2d(inputs, w, stride, padding, data_format=data_format)

        b = tf.get_variable('b', [output_dim], tf.float32, initializer=tf.constant_initializer(0.0))
        out = tf.nn.bias_add(conv, b, data_format=data_format)

    if activation_fn is not None:
        out = activation_fn(out)
    return out, w, b

convolution

API

tf.nn.convolution(
    input, # input tensor
    filter, # convolution kernel
    padding, # string, either "SAME" or "VALID"
    strides=None, # stride
    dilation_rate=None,
    name=None,
    data_format=None
)

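Comparison with tf.nn.conv2d

tf.nn.conv2d performs 2-D convolution.
tf.nn.convolution performs N-D convolution.

conv2d_transpose

Transposed convolution (often called deconvolution).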

API

tf.nn.conv2d_transpose(
    value, # input, a 4-D tensor: [batch, in_channels, height, width] for NCHW, or [batch, height, width, in_channels] for NHWC
    filter, # 4-D kernel of shape [height, width, output_channels, in_channels]
    output_shape, # 1-D tensor giving the shape of the transposed convolution's output
    strides, # stride
    padding='SAME',
    data_format='NHWC',
    name=None
)

Example

max_pool

Performs max pooling.

API

tf.nn.max_pool(
    value, # input, a 4-D tensor
    ksize, # list or tuple of 4 ints: the size of the pooling window for each dimension
    strides, # list or tuple of 4 ints
    padding, # string, either "SAME" or "VALID"
    data_format='NHWC', # string, one of "NHWC", "NCHW", "NCHW_VECT_C"
    name=None
)
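The padding argument decides the spatial size of the result for both pooling and convolution. The sketch below applies the output-size rules given in the TensorFlow documentation for "SAME" and "VALID" to a single spatial dimension; `output_size` is a hypothetical helper name.

```python
import math

def output_size(in_size, window, stride, padding):
    """Output length along one spatial dimension for 'SAME' or 'VALID' padding."""
    if padding == "SAME":
        # Input is padded as needed, so only the stride matters.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: the window must fit entirely inside the input.
        return math.ceil((in_size - window + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

same = output_size(28, window=2, stride=2, padding="SAME")
valid = output_size(28, window=5, stride=1, padding="VALID")
odd = output_size(7, window=2, stride=2, padding="VALID")
```

For a 7-wide input with a 2-wide window and stride 2, "VALID" drops the last column, while "SAME" would pad and keep it.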

Some commonly used functions

bias_add(…)
raw_rnn(…)
static_rnn(…) # to be deprecated
dynamic_rnn(…) # to be deprecated
static_bidirectional_rnn(…) # to be deprecated
bidirectional_dynamic_rnn(…) # to be deprecated
dropout(…)
leaky_relu(…)
l2_loss(…)
log_softmax(…) # some arguments deprecated
softmax(…) # some arguments deprecated
softmax_cross_entropy_with_logits(…) # to be deprecated
softmax_cross_entropy_with_logits_v2(…) # some arguments deprecated
sparse_softmax_cross_entropy_with_logits(…)

All functions

all_candidate_sampler(…)
atrous_conv2d(…)
atrous_conv2d_transpose(…)
avg_pool(…)
avg_pool3d(…)
batch_norm_with_global_normalization(…)
batch_normalization(…)
bias_add(…)
bidirectional_dynamic_rnn(…)
collapse_repeated(…)
compute_accidental_hits(…)
conv1d(…)
conv2d(…)
conv2d_backprop_filter(…)
conv2d_backprop_input(…)
conv2d_transpose(…)
conv3d(…)
conv3d_backprop_filter(…)
conv3d_backprop_filter_v2(…)
conv3d_transpose(…)
convolution(…)
crelu(…)
ctc_beam_search_decoder(…)
ctc_beam_search_decoder_v2(…)
ctc_greedy_decoder(…)
ctc_loss(…)
ctc_loss_v2(…)
ctc_unique_labels(…)
depth_to_space(…)
depthwise_conv2d(…)
depthwise_conv2d_backprop_filter(…)
depthwise_conv2d_backprop_input(…)
depthwise_conv2d_native(…)
depthwise_conv2d_native_backprop_filter(…)
depthwise_conv2d_native_backprop_input(…)
dilation2d(…)
dropout(…)
dynamic_rnn(…)
elu(…)
embedding_lookup(…)
embedding_lookup_sparse(…)
erosion2d(…)
fixed_unigram_candidate_sampler(…)
fractional_avg_pool(…)
fractional_max_pool(…)
fused_batch_norm(…)
in_top_k(…)
l2_loss(…)
l2_normalize(…)
leaky_relu(…)
learned_unigram_candidate_sampler(…)
local_response_normalization(…)
log_poisson_loss(…)
log_softmax(…)
log_uniform_candidate_sampler(…)
lrn(…)
max_pool(…)
max_pool3d(…)
max_pool_with_argmax(…)
moments(…)
nce_loss(…)
normalize_moments(…)
pool(…)
quantized_avg_pool(…)
quantized_conv2d(…)
quantized_max_pool(…)
quantized_relu_x(…)
raw_rnn(…)
relu(…)
relu6(…)
relu_layer(…)
safe_embedding_lookup_sparse(…)
sampled_softmax_loss(…)
selu(…)
separable_conv2d(…)
sigmoid(…)
sigmoid_cross_entropy_with_logits(…)
softmax(…)
softmax_cross_entropy_with_logits(…)
softmax_cross_entropy_with_logits_v2(…)
softplus(…)
softsign(…)
space_to_batch(…)
space_to_depth(…)
sparse_softmax_cross_entropy_with_logits(…)
static_bidirectional_rnn(…)
static_rnn(…)
static_state_saving_rnn(…)
sufficient_statistics(…)
tanh(…)
top_k(…)
uniform_candidate_sampler(…)
weighted_cross_entropy_with_logits(…)
weighted_moments(…)
with_space_to_batch(…)
xw_plus_b(…)
zero_fraction(…)

References

1.https://www.tensorflow.org/api_docs/python/tf/nn
2.https://www.tensorflow.org/api_docs/python/tf/nn/rnn_cell
3.https://www.tensorflow.org/api_docs/python/tf/nn/conv2d
4.https://stackoverflow.com/questions/38601452/what-is-tf-nn-max-pools-ksize-parameter-used-for
5.https://www.tensorflow.org/api_docs/python/tf/nn/convolution
6.https://stackoverflow.com/questions/47775244/difference-between-tf-nn-convolution-and-tf-nn-conv2d
7.https://www.tensorflow.org/api_docs/python/tf/nn/conv2d_transpose

tensorflow softmax

Posted 2019-05-16 | Updated 2019-09-28 | Category: tensorflow

The softmax family

tf.nn.softmax.
tf.nn.log_softmax.
tf.nn.softmax_cross_entropy_with_logits_v2: labels are dense, given as one-hot vectors (or, more generally, probability distributions).
tf.nn.sparse_softmax_cross_entropy_with_logits: labels are sparse, given as integer class indices.

Comparison

tf.nn.softmax()
tf.nn.log_softmax()
tf.nn.softmax_cross_entropy_with_logits_v2()
tf.nn.sparse_softmax_cross_entropy_with_logits()

logits

What are logits

In mathematics

If an event occurs with probability $p$, the logit of that event is $\text{logit}(p) = \log\frac{p}{1-p}$.
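A quick numerical check of the mathematical definition: the logit maps a probability to the log-odds, and the sigmoid function maps it back. This is a plain-Python sketch; the probability 0.8 is an arbitrary example value.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

z = logit(0.8)        # log(0.8 / 0.2) = log(4)
p_back = sigmoid(z)   # recovers the original probability
```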

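In machine learning

In deep learning, logits are only loosely related to the mathematical logit. They are the raw outputs of the forward pass: unnormalized scores that do not sum to $1$. Passing the logits through the softmax function yields normalized probabilities.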

tf.nn.softmax

API

tf.nn.softmax(
	logits,
	axis=None,
	name=None
)

What it does

The function computes the following:
softmax = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis)
That is, it normalizes the input logits with softmax.
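The same computation can be written in plain Python to see what the formula does. This sketch subtracts the maximum logit before exponentiating, a common numerical-stability step that leaves the result unchanged; it is an illustration, not TensorFlow's implementation.

```python
import math

def softmax(logits):
    """exp(logits) / sum(exp(logits)), computed in a numerically stable way."""
    m = max(logits)  # shifting by the max avoids overflow in exp
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 1.0, 2.0, 2.0, 2.0, 2.0])
```

Equal logits receive equal probabilities, and the probabilities sum to 1 up to floating-point rounding.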

Example

import tensorflow as tf

logits = [1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
res_op = tf.nn.softmax(logits)
sess = tf.Session()
result = sess.run(res_op)
print(result)
print(sum(result))
# output
# [0.07768121 0.07768121 0.21115941 0.21115941 0.21115941 0.21115941]
# 1.0000000447034836
# The sum is not exactly 1 because of floating-point rounding.

tf.nn.log_softmax

API

tf.nn.log_softmax(
	logits,
	axis=None,
	name=None
)

What it does

The function computes the following:
logsoftmax = logits - log(reduce_sum(exp(logits), axis))
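A plain-Python version of this formula, again with the maximum subtracted for numerical stability. It is a sketch for illustration rather than TensorFlow's implementation.

```python
import math

def log_softmax(logits):
    """logits - log(sum(exp(logits))), using the log-sum-exp trick."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]

log_probs = log_softmax([1.0, 1.0, 2.0, 2.0, 2.0, 2.0])
```

Exponentiating the result gives back the softmax probabilities, so the exponentials of `log_probs` sum to 1.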

Example

import tensorflow as tf

logits = [1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
res_op = tf.nn.log_softmax(logits)
sess = tf.Session()
result = sess.run(res_op)
print(result)
print(sum(result))

# output
# [-2.555142  -2.555142  -1.5551419 -1.5551419 -1.5551419 -1.5551419]
# -11.330851554870605

tf.nn.softmax_cross_entropy_with_logits_v2

API

tf.nn.softmax_cross_entropy_with_logits_v2(
    labels, # shape [batch_size, num_classes]; each row labels[i] must be a valid probability distribution
    logits, # unnormalized log probabilities
    axis=None,
    name=None,
    dim=-1,
)

What it does

Computes the cross entropy between the softmax of logits and labels.

tf.nn.sparse_softmax_cross_entropy_with_logits

API

tf.nn.sparse_softmax_cross_entropy_with_logits(
    _sentinel=None,  # pylint: disable=invalid-name
    labels=None,    # shape [d_0, d_1, ..., d_{r-1}], where r is the rank of labels; dtype int32 or int64; each entry must be in [0, num_classes)
    logits=None,    # shape [d_0, d_1, ..., d_{r-1}, num_classes]; float dtype; can be viewed as unnormalized log probabilities
    name=None
)

What it does

Computes the sparse softmax cross entropy between logits and labels.
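The two cross-entropy functions differ only in how labels are encoded. The plain-Python sketch below, with made-up logits for a single example, shows that a one-hot (dense) label and the matching integer (sparse) label give the same loss; the helper names are hypothetical.

```python
import math

def log_softmax(logits):
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]

def dense_cross_entropy(label_probs, logits):
    """Labels are a probability distribution over classes (e.g. one-hot)."""
    return -sum(p * lp for p, lp in zip(label_probs, log_softmax(logits)))

def sparse_cross_entropy(class_index, logits):
    """Labels are a single integer class index."""
    return -log_softmax(logits)[class_index]

logits = [2.0, 1.0, 0.1]
dense = dense_cross_entropy([0.0, 1.0, 0.0], logits)
sparse = sparse_cross_entropy(1, logits)
```

The sparse form skips the multiplication by zeros, which is why it is preferred when each example belongs to exactly one class.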

References

1.http://landcareweb.com/questions/789/shi-yao-shi-logits-softmaxhe-softmax-cross-entropy-with-logits
2.https://stackoverflow.com/questions/41455101/what-is-the-meaning-of-the-word-logits-in-tensorflow
3.https://stackoverflow.com/a/43577384
4.https://stackoverflow.com/a/47852892
5.https://www.tensorflow.org/tutorials/estimators/cnn
6.https://www.zhihu.com/question/60751553

tensorflow collection

Posted 2019-05-13 | Updated 2019-09-28 | Category: tensorflow

tf.collection

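TensorFlow uses graph collections to manage different kinds of objects. tf.GraphKeys defines the default collections, and TensorFlow operates on the variables in a graph through these collections. For example, tf.Optimizer optimizes only the variables in the tf.GraphKeys.TRAINABLE_VARIABLES collection. The common collections are listed below; their keys are in fact just strings:

GLOBAL_VARIABLES: every Variable object is added to this collection automatically when it is created, and these variables are shared across a distributed environment (the model variables are a subset). In general TRAINABLE_VARIABLES is contained in MODEL_VARIABLES, which is contained in GLOBAL_VARIABLES, that is, TRAINABLE_VARIABLES $\subseteq$ MODEL_VARIABLES $\subseteq$ GLOBAL_VARIABLES. By default tf.train.Saver() works on the variables in GLOBAL_VARIABLES.
LOCAL_VARIABLES: unlike GLOBAL_VARIABLES, this is the subset of Variable objects that are local to each machine. Use tf.contrib.framework.local_variable to add a variable to this collection.
MODEL_VARIABLES: model variables; every Variable used in the forward pass when building a model is added here. Use tf.contrib.framework.model_variable to add a variable to this collection.
TRAINABLE_VARIABLES: the Variables updated during backpropagation, i.e. those the optimizer trains. By default tf.Variable objects are also added to this collection automatically.
SUMMARIES: every summary Tensor created in the graph is recorded here.
QUEUE_RUNNERS:
MOVING_AVERAGE_VARIABLES: the subset of variables that maintain moving averages.
REGULARIZATION_LOSSES: the regularization losses collected while building the graph.

This post mainly covers three kinds of collection: GLOBAL_VARIABLES, SUMMARIES, and custom collections.

The following collections are also defined, but they are not populated automatically: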

The following standard keys are defined, but their collections are not automatically populated as many of the others are:

WEIGHTS
BIASES
ACTIVATIONS

GLOBAL_VARIABLES collection

When a tf.Variable() object is created, it is added by default to both the GLOBAL_VARIABLES and TRAINABLE_VARIABLES collections in tf.GraphKeys.

Code example

import tensorflow as tf

a = tf.Variable([1, 2, 3])
b = tf.get_variable("bbb", shape=[2,3])
tf.constant([3])
c = tf.ones([3])
d = tf.random_uniform([3, 4])
e = tf.log(c)

# Inspect the variables in the GLOBAL_VARIABLES collection
global_variables = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
for var in global_variables:
    print(var)

# Inspect the variables in the TRAINABLE_VARIABLES collection
trainable_variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
for var in trainable_variables:
    print(var)

Summary collection

The tensors produced by summary ops are added to the tf.GraphKeys.SUMMARIES collection.

Code example

import tensorflow as tf

# Create a graph
graph = tf.Graph()

with graph.as_default():
    # Model parameters
    w = tf.Variable([0.3], name="w", dtype=tf.float32)
    b = tf.Variable([0.2], name="b", dtype=tf.float32)

    # Placeholders for the input data
    x = tf.placeholder(tf.float32, name="inputs")
    y = tf.placeholder(tf.float32, name="outputs")

    # Forward pass
    with tf.name_scope('linear_model'):
        linear = w * x + b

    # Compute the loss
    with tf.name_scope('cal_loss'):
        loss = tf.reduce_mean(input_tensor=tf.square(y - linear), name='loss')

    # Define the scalar summary ops
    with tf.name_scope('add_summary'):
        summary_loss = tf.summary.scalar('MSE', loss)
        summary_b = tf.summary.scalar('b', b[0])

    # Define the optimizer
    with tf.name_scope('train_model'):
        optimizer = tf.train.GradientDescentOptimizer(0.01)
        train = optimizer.minimize(loss)

with tf.Session(graph=graph) as sess:
    inputs = [1, 2, 3, 4]
    outputs = [2, 3, 4, 5]
    # File writer for the summaries
    writer = tf.summary.FileWriter("./summary/", graph)
    # Merge all summary ops so they need not be run one by one
    merged = tf.summary.merge_all()

    # Initialize the variables
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    for i in range(5000):
        # Run the training op together with the merged summary op
        _, summ = sess.run([train, merged], feed_dict={x: inputs, y: outputs})
        # Write the serialized summary returned by the op to the event file
        writer.add_summary(summ, global_step=i)

    w_, b_, l_ = sess.run([w, b, loss], feed_dict={x: inputs, y: outputs})
    print("w: ", w_, "b: ", b_, "loss: ", l_)

    # Inspect the SUMMARIES collection
    for var in tf.get_collection(tf.GraphKeys.SUMMARIES):
        print(var)

Custom collections

Custom collections can be written and read with tf.add_to_collection() and tf.get_collection().

Example code

import tensorflow as tf

# Define the first loss
x1 = tf.constant(1.0)
l1 = tf.nn.l2_loss(x1)

# Define the second loss
x2 = tf.constant([2.5, -0.3])
l2 = tf.nn.l2_loss(x2)

# Add both losses to the "losses" collection
tf.add_to_collection("losses", l1)
tf.add_to_collection("losses", l2)

# Inspect the contents of the "losses" collection
losses = tf.get_collection('losses')
for var in losses:
    print(var)

# Create a session and run
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    losses_val = sess.run(losses)
    print(losses_val)

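Open question

Collections are bound to a graph. If several graphs have been defined, how do you get a collection defined in tf.GraphKeys from a graph that is not the default one?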
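One likely answer: tf.Graph objects expose their own add_to_collection and get_collection methods, so a collection can be read from a specific graph object, for example graph.get_collection(tf.GraphKeys.SUMMARIES), while the module-level tf.get_collection reads from the current default graph. The toy model below illustrates the idea in plain Python; `ToyGraph` is a hypothetical stand-in, not TensorFlow code.

```python
class ToyGraph:
    """Hypothetical stand-in for tf.Graph: each instance owns its collections."""

    def __init__(self):
        self._collections = {}

    def add_to_collection(self, name, value):
        self._collections.setdefault(name, []).append(value)

    def get_collection(self, name):
        # Return a copy so callers cannot mutate the graph's internal list.
        return list(self._collections.get(name, []))

default_graph = ToyGraph()
other_graph = ToyGraph()
default_graph.add_to_collection("losses", "l1")
other_graph.add_to_collection("losses", "l2")
other_graph.add_to_collection("losses", "l3")

from_default = default_graph.get_collection("losses")
from_other = other_graph.get_collection("losses")
```

Because each graph keeps its own dictionary, the same key returns different contents depending on which graph object is asked.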

References

1.https://blog.csdn.net/shenxiaolu1984/article/details/52815641
2.https://blog.csdn.net/hustqb/article/details/80398934
3.https://www.tensorflow.org/api_docs/python/tf/GraphKeys?hl=zh_cn

tensorflow graph and session

Posted 2019-05-12 | Updated 2019-07-18 | Category: tensorflow

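tf.Graph and tf.Session

The differences and connections between Graph and Session.

A Graph defines how a computation is to be carried out but does not carry it out. A graph holds no values; it only defines the operations specified in the code.
A Session executes a graph or part of one. It allocates resources (on one machine or several) and holds intermediate results and the values of variables. Executions in different sessions are independent of one another.

tf.Graph

A tf.Graph contains two kinds of information:

Nodes and edges, which describe how the individual ops are composed.
Collections. Use tf.add_to_collection and tf.get_collection to work with them. A common example: when a tf.Variable is created, it is added by default to the "global variables" and "trainable variables" collections.
When tf.train.Saver and tf.train.Optimizer are used, they take the variables in these collections as their default arguments.
Common collections defined on tf.GraphKeys:
GLOBAL_VARIABLES, TRAINABLE_VARIABLES, MOVING_AVERAGE_VARIABLES, LOCAL_VARIABLES, MODEL_VARIABLES, SUMMARIES.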

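Building a tf.Graph

Calling the TensorFlow API constructs new tf.Operation and tf.Tensor objects and adds them to a tf.Graph instance.

Calling tf.constant(42.0) creates a single tf.Operation that produces the value 42.0, adds it to the default graph, and returns a tf.Tensor that represents the value of the constant.
Calling tf.matmul(x, y) creates a single tf.Operation that multiplies the values of the tf.Tensor objects x and y, adds it to the default graph, and returns a tf.Tensor that represents the result of the multiplication.
Executing v = tf.Variable(0) adds to the graph a tf.Operation that stores a writeable tensor value that persists between tf.Session.run calls. The tf.Variable object wraps this operation and can be used like a tensor, which reads the current stored value. The tf.Variable object also has methods such as assign and assign_add that create tf.Operation objects which, when executed, update the stored value. (See the variables guide for more information.)
Calling tf.train.Optimizer.minimize adds operations and tensors that compute gradients to the default graph, and returns a tf.Operation that, when run, applies those gradients to a set of variables.

Getting the default graph

Use tf.get_default_graph, which returns a tf.Graph object:

# Print all of the operations in the default graph.
g = tf.get_default_graph()
print(g.get_operations())

Clearing the default graph

tf.reset_default_graph()

# Clear the default graph.
tf.reset_default_graph()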

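Namespaces

A tf.Graph object defines a namespace for the tf.Operation objects it contains. TensorFlow automatically chooses a unique name for each operation in the graph, but giving operations descriptive names makes a program easier to read and debug. The TensorFlow API provides two ways to specify the name of an op:

Every API function that creates a new op or returns a new tf.Tensor accepts an optional name argument. For example, tf.constant(42.0, name="answer") creates a new tf.Operation named "answer" and returns a tf.Tensor named "answer:0". If the default graph already contains an operation named "answer", TensorFlow appends "_1", "_2", and so on to the name to make it unique.
The tf.name_scope function adds a name scope prefix to all ops created within a particular context. The current name scope prefix is a "/"-delimited list of the names of all active tf.name_scope context managers. If a name scope has already been used in the current context, TensorFlow appends "_1", "_2", and so on. For example: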

c_0 = tf.constant(0, name="c")  # => operation named "c"

# Already-used names will be "uniquified".
c_1 = tf.constant(2, name="c")  # => operation named "c_1"

# Name scopes add a prefix to all operations created in the same context.
with tf.name_scope("outer"):
  c_2 = tf.constant(2, name="c")  # => operation named "outer/c"

  # Name scopes nest like paths in a hierarchical file system.
  with tf.name_scope("inner"):
    c_3 = tf.constant(3, name="c")  # => operation named "outer/inner/c"

  # Exiting a name scope context will return to the previous prefix.
  c_4 = tf.constant(4, name="c")  # => operation named "outer/c_1"

  # Already-used name scopes will be "uniquified".
  with tf.name_scope("inner"):
    c_5 = tf.constant(5, name="c")  # => operation named "outer/inner_1/c"
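The naming behaviour in the example above can be imitated with a small registry. This is a toy model written for illustration, assuming op names and scope names share one pool of used names; `NameRegistry` is hypothetical and is not how TensorFlow is implemented internally.

```python
class NameRegistry:
    """Hypothetical model of how a graph hands out unique op and scope names."""

    def __init__(self):
        self._used = set()
        self._scope = []  # active scope names, outermost first

    def _unique(self, full_name):
        # Append _1, _2, ... until the name is unused.
        candidate, i = full_name, 0
        while candidate in self._used:
            i += 1
            candidate = "%s_%d" % (full_name, i)
        self._used.add(candidate)
        return candidate

    def _prefix(self):
        return "".join(s + "/" for s in self._scope)

    def op(self, name):
        return self._unique(self._prefix() + name)

    def enter_scope(self, name):
        self._scope = self._unique(self._prefix() + name).split("/")

    def exit_scope(self):
        self._scope.pop()

reg = NameRegistry()
names = [reg.op("c"), reg.op("c")]
reg.enter_scope("outer")
names.append(reg.op("c"))
reg.enter_scope("inner")
names.append(reg.op("c"))
reg.exit_scope()
names.append(reg.op("c"))
reg.enter_scope("inner")
names.append(reg.op("c"))
reg.exit_scope()
reg.exit_scope()
```

The resulting list reproduces the operation names shown in the comments of the TensorFlow example.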

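Note that tf.Tensor objects are implicitly named after the op that produces them as output. A tensor name has the form "<OP_NAME>:<i>", where:

"<OP_NAME>" is the name of the operation that produces the tensor.
"<i>" is an integer, the index of that tensor among the op's outputs.

Getting the ops in a graph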

import tensorflow as tf

c_0 = tf.constant(0, name="c")  # => operation named "c"
# Already-used names will be "uniquified".
c_1 = tf.constant(2, name="c")  # => operation named "c_1"

# Name scopes add a prefix to all operations created in the same context.
with tf.name_scope("outer"):
  c_2 = tf.constant(2, name="c")  # => operation named "outer/c"

  # Name scopes nest like paths in a hierarchical file system.
  with tf.name_scope("inner"):
    c_3 = tf.constant(3, name="c")  # => operation named "outer/inner/c"

g = tf.get_default_graph()
print(g.get_operations())
# [<tf.Operation 'c' type=Const>, <tf.Operation 'c_1' type=Const>, <tf.Operation 'outer/c' type=Const>, <tf.Operation 'outer/inner/c' type=Const>]

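Tensor-like objects

Many TensorFlow ops take one or more tf.Tensor objects as arguments. For example, tf.matmul takes two tf.Tensor objects, and tf.add_n takes a list of n tf.Tensor objects. For convenience, these functions accept a tensor-like object in place of a tf.Tensor and convert it to a tf.Tensor using tf.convert_to_tensor. Tensor-like objects include elements of the following types:

tf.Tensor
tf.Variable
numpy.ndarray
list (and lists of tensor-like objects)
Scalar Python types: bool, float, int, str

Note that by default TensorFlow creates a new tf.Tensor each time the same tensor-like object is used. If the tensor-like object is large (for example, a numpy.ndarray containing a set of training examples) and it is used multiple times, you may run out of memory. To avoid this, call tf.convert_to_tensor on the tensor-like object once and use the returned tf.Tensor instead.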

tf.Session

API

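tf.Session.__init__(
    target='',    # Optional. The execution engine to connect to; defaults to the in-process engine.
    graph=None,   # Optional. By default, a new session is bound to the default graph.
    config=None   # Optional tf.ConfigProto. A common setting is gpu_options.allow_growth: setting it to True makes the GPU memory allocator grow its allocation gradually instead of claiming most of the memory at startup.
)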

Creating a session

Default session


# Create a default in-process session.
with tf.Session() as sess:
  pass  # ...

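Running ops

tf.Session.run is the main mechanism for running a tf.Operation or evaluating a tf.Tensor. Pass one or more tf.Operation or tf.Tensor objects to tf.Session.run, and TensorFlow executes the operations needed to compute the result.
tf.Session.run requires a list of fetches, which determine the return values and may be a tf.Operation, a tf.Tensor, or a tensor-like type such as tf.Variable. These fetches determine which subgraph of the overall tf.Graph must be executed to produce the result: the subgraph containing every op named in the fetch list, plus every op whose output is used to compute a fetch value. For example, the following snippet shows how different arguments to tf.Session.run cause different subgraphs to be executed: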

x = tf.constant([[37.0, -23.0], [1.0, 4.0]])
w = tf.Variable(tf.random_uniform([2, 2]))
y = tf.matmul(x, w)
output = tf.nn.softmax(y)
init_op = w.initializer

with tf.Session() as sess:
  # Initialize `w`.
  sess.run(init_op)

  # Evaluate `output`. `sess.run(output)` will return a NumPy array containing
  # the result of the computation.
  print(sess.run(output))

  # Evaluate `y` and `output`. Note that `y` will only be computed once, and its
  # result used both to return `y_val` and as an input to the `tf.nn.softmax()`
  # op. Both `y_val` and `output_val` will be NumPy arrays.
  y_val, output_val = sess.run([y, output])
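How the fetches select a subgraph can be modelled without TensorFlow. The sketch below assumes a graph described as a plain dict from op name to the names of its input ops; `ops_to_run` and the dict layout are hypothetical, and the real dependencies of `init_op` are left out for brevity.

```python
def ops_to_run(graph, fetches):
    """Return the set of ops needed to compute `fetches`.

    `graph` maps each op name to the list of op names it takes as inputs.
    This is a toy model of subgraph selection, not TensorFlow's implementation.
    """
    needed = set()
    stack = list(fetches)
    while stack:
        op = stack.pop()
        if op in needed:
            continue  # already visited, so `y` is only counted once
        needed.add(op)
        stack.extend(graph[op])
    return needed


# A simplified picture of the graph built in the snippet above.
graph = {
    "x": [],
    "w": [],
    "y": ["x", "w"],     # y = tf.matmul(x, w)
    "output": ["y"],     # output = tf.nn.softmax(y)
    "init_op": [],       # w.initializer (its own inputs are not modelled)
}

print(sorted(ops_to_run(graph, ["init_op"])))      # ['init_op']
print(sorted(ops_to_run(graph, ["y"])))            # ['w', 'x', 'y']
print(sorted(ops_to_run(graph, ["y", "output"])))  # ['output', 'w', 'x', 'y']
```

Fetching `y` alone never touches `output`, while fetching both still visits `y` only once.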

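tf.Session.run also accepts an optional feed dict, a mapping from tf.Tensor objects (typically tf.placeholder tensors) to the values (typically Python scalars, lists, or NumPy arrays) that replace those tensors during execution. For example: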

# Define a placeholder that expects a vector of three floating-point values,
# and a computation that depends on it.
x = tf.placeholder(tf.float32, shape=[3])
y = tf.square(x)

with tf.Session() as sess:
  # Feeding a value changes the result that is returned when you evaluate `y`.
  print(sess.run(y, {x: [1.0, 2.0, 3.0]}))  # => "[1.0, 4.0, 9.0]"
  print(sess.run(y, {x: [0.0, 0.0, 5.0]}))  # => "[0.0, 0.0, 25.0]"

  # Raises `tf.errors.InvalidArgumentError`, because you must feed a value for
  # a `tf.placeholder()` when evaluating a tensor that depends on it.
  sess.run(y)

  # Raises `ValueError`, because the shape of `37.0` does not match the shape
  # of placeholder `x`.
  sess.run(y, {x: 37.0})

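tf.Session.run also accepts an optional options argument, which lets you specify options for the call, and an optional run_metadata argument, which lets you collect metadata about the execution. For example, you can use the two together to collect tracing information about the execution: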

y = tf.matmul([[37.0, -23.0], [1.0, 4.0]], tf.random_uniform([2, 2]))

with tf.Session() as sess:
  # Define options for the `sess.run()` call.
  options = tf.RunOptions()
  options.output_partition_graphs = True
  options.trace_level = tf.RunOptions.FULL_TRACE

  # Define a container for the returned metadata.
  metadata = tf.RunMetadata()

  sess.run(y, options=options, run_metadata=metadata)

  # Print the subgraphs that executed on each device.
  print(metadata.partition_graphs)

  # Print the timings of each operation that executed.
  print(metadata.step_stats)

Results across different sessions


import tensorflow as tf

graph = tf.Graph()

with graph.as_default():
    variable = tf.Variable(10, name="foo")
    initialize = tf.global_variables_initializer()
    assign = variable.assign(12)

with tf.Session(graph=graph) as sess:
    sess.run(initialize)
    sess.run(assign)
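    print(sess.run(variable))  # => 12

# A new session on the same graph starts with its own, uninitialized variable
# state, so this raises tf.errors.FailedPreconditionError.
with tf.Session(graph=graph) as sess:
    print(sess.run(variable))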

Accessing the graph of the current session.

sess = tf.Session()
sess.graph

Visualizing the graph

Use the graph visualizer. The simplest way is to pass a tf.Graph to tf.summary.FileWriter, as in the following example:

# Build your graph.
x = tf.constant([[37.0, -23.0], [1.0, 4.0]])
w = tf.Variable(tf.random_uniform([2, 2]))
y = tf.matmul(x, w)
# ...
loss = ...
train_op = tf.train.AdagradOptimizer(0.01).minimize(loss)

with tf.Session() as sess:
  # `sess.graph` provides access to the graph used in a `tf.Session`.
  writer = tf.summary.FileWriter("/tmp/log/...", sess.graph)

  # Perform your computation...
  for i in range(1000):
    sess.run(train_op)
    # ...

  writer.close()

You can then open the log in tensorboard and go to the "Graphs" tab to see a high-level visualization of the graph structure.

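Creating multiple graphs

TensorFlow provides a "default graph" that is implicitly passed to all API functions in the same context. TensorFlow also provides methods for manipulating the default graph, which can be useful in more advanced use cases.

A tf.Graph defines the namespace for tf.Operation objects: every operation in a single graph must have a unique name. If a requested name is already taken, TensorFlow appends "_1", "_2", and so on to the operation name to keep it unique. Using multiple explicitly created graphs gives you more control over the name given to each op.
The default graph stores information about every tf.Operation and tf.Tensor that was ever added to it. If a program creates a large number of unconnected subgraphs, it can be more efficient to build each one in a separate tf.Graph so that the unrelated state can be garbage collected.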
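The per-graph namespace can be imitated in plain Python. The class below is a toy model, assuming one registry per graph; `NameRegistry` and `unique_name` here are illustrative names, and TensorFlow's own name handling covers more cases (name scopes, for instance).

```python
class NameRegistry:
    """Toy model of one graph's op namespace (not TensorFlow code)."""

    def __init__(self):
        self._counts = {}  # name -> next numeric suffix to try

    def unique_name(self, name):
        if name not in self._counts:
            self._counts[name] = 1
            return name
        # The name is taken: append "_1", "_2", ... until a free one is found.
        candidate = name
        while candidate in self._counts:
            candidate = "%s_%d" % (name, self._counts[name])
            self._counts[name] += 1
        self._counts[candidate] = 1
        return candidate


g_1, g_2 = NameRegistry(), NameRegistry()
print([g_1.unique_name("c") for _ in range(3)])  # ['c', 'c_1', 'c_2']
print(g_2.unique_name("c"))                      # c
```

Because each registry keeps its own counts, the same requested name stays unsuffixed in a second graph.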

Creating two graphs

g_1 = tf.Graph()
with g_1.as_default():
  # Operations created in this scope will be added to `g_1`.
  c = tf.constant("Node in g_1")

  # Sessions created in this scope will run operations from `g_1`.
  sess_1 = tf.Session()

g_2 = tf.Graph()
with g_2.as_default():
  # Operations created in this scope will be added to `g_2`.
  d = tf.constant("Node in g_2")

# Alternatively, you can pass a graph when constructing a `tf.Session`:
# `sess_2` will run operations from `g_2`.
sess_2 = tf.Session(graph=g_2)

assert c.graph is g_1
assert sess_1.graph is g_1

assert d.graph is g_2
assert sess_2.graph is g_2

References

1.https://www.tensorflow.org/guide/graphs?hl=zh_cn
2.https://blog.csdn.net/shenxiaolu1984/article/details/52815641
3.https://danijar.com/what-is-a-tensorflow-session/

tensorflow Variable

Posted on 2019-05-12 | Updated on 2019-07-09 | Category: tensorflow

Creating a Variable

TensorFlow offers two ways to create a Variable: tf.Variable() and tf.get_variable(). Both return an object of type tensorflow.python.ops.variables.Variable, but their input arguments differ somewhat.

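Parameter	tf.Variable()	tf.get_variable()
name	Not required; if the name already exists, an increasing numeric suffix is appended to tell the variables apart	Required; an existing name raises an error unless reuse is enabled
shape	Not required; it is already implied by the initial value	Required
Initial value	Required	Not required
Reuse	Not supported	Supported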

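Both functions can in fact take a name and an initial value. Because the initial value passed to tf.Variable() already carries the shape, there is no need to pass shape explicitly. "Required" and "not required" here mean mandatory or not: omitting a required argument raises an error, while a non-required argument can be left out without any effect.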

tf.Variable()

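In one sentence

Creates an op-like global variable. Internally, a tf.Variable stores a persistent tensor whose value various ops can read and modify. These modifications are visible across multiple sessions, so multiple workers can see the same value for a tf.Variable.

Compared with tf.Tensor

A tf.Variable represents a tensor whose value can be changed by running ops on it. Unlike a tf.Tensor object, a tf.Variable exists outside the context of a single session.run call. The value of a tf.Tensor cannot be changed, and tf.Tensor has no assign method.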

API

tf.Variable.__init__(
	initial_value=None,  # initial value of the variable
	trainable=True,  # whether the variable is trained during backpropagation
	collections=None, # collections the variable is added to
	validate_shape=True, 
	caching_device=None, 
	name=None,  # name of the variable
	...
)

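Code example

import tensorflow as tf

tensor1 = tf.Variable([[1,2], [3,5]])
tensor2 = tf.Variable(tf.constant([[1,2], [3,5]]))
sess = tf.Session()  # the original snippet assumed an existing session
sess.run(tf.global_variables_initializer())
sess.run(tensor1)
sess.run(tensor2)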

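Initialization

Variables created with tf.Variable() must be initialized; tf.constant() needs no initialization.

Global initialization
sess.run(tf.global_variables_initializer())
Restoring from a checkpoint
Assigning a value with tf.assign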

tf.get_variable()

In one sentence

Gets an existing variable or creates a new one. Its main purpose is variable reuse.

API

tf.get_variable(
    name, # name of the variable, required
    shape=None, # shape of the variable, optional
    dtype=None, # data type of the variable
    initializer=None, # initializer for the variable
    regularizer=None,
    trainable=None,
    collections=None,
    caching_device=None,
    partitioner=None,
    validate_shape=True,
    use_resource=None,
    custom_getter=None,
    constraint=None,
    synchronization=tf.VariableSynchronization.AUTO,
    aggregation=tf.VariableAggregation.NONE
)

Code example

with tf.variable_scope("model") as scope:
  output1 = my_image_filter(input1)
  scope.reuse_variables()
  output2 = my_image_filter(input2)

Variables and collections

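By default, every tf.Variable() is added to the following two collections:

tf.GraphKeys.GLOBAL_VARIABLES - variables that can be shared across multiple devices,
tf.GraphKeys.TRAINABLE_VARIABLES - variables for which TensorFlow will compute gradients.

If you do not want a variable to be trainable, add it to the tf.GraphKeys.LOCAL_VARIABLES collection when creating it.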

my_local = tf.get_variable("my_local", shape=(), collections=[tf.GraphKeys.LOCAL_VARIABLES])

Or you can specify trainable=False:

my_non_trainable = tf.get_variable("my_non_trainable",
                                   shape=(),
                                   trainable=False)

Retrieving a collection

To retrieve a list of all variables placed in a collection, use tf.get_collection, as in the example below:

Code example

import tensorflow as tf

a = tf.Variable([1, 2, 3])
b = tf.get_variable("bbb", shape=[2,3])
tf.constant([3])
c = tf.ones([3])
d = tf.random_uniform([3, 4])
print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))
# [<tf.Variable 'Variable:0' shape=(3,) dtype=int32_ref>, <tf.Variable 'bbb:0' shape=(2, 3) dtype=float32_ref>]
# As shown, only variables created by tf.Variable() and tf.get_variable() are added to this collection.

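Custom collections

Adding to a custom collection

You can also use your own collections. A collection name can be any string, and a collection does not need to be created explicitly. After creating an object (a Variable or anything else), call tf.add_to_collection to add it to the collection. The following code adds the my_local variable to a collection named my_collection_name: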

tf.add_to_collection("my_collection_name", my_local)
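The idea that a collection is just a named list that comes into existence on first use can be sketched in plain Python. `ToyGraph` below is a hypothetical stand-in written for this note; it only mirrors the two calls discussed here and says nothing about how TensorFlow stores collections internally.

```python
from collections import defaultdict


class ToyGraph:
    """Toy model of graph collections (not TensorFlow code)."""

    def __init__(self):
        self._collections = defaultdict(list)

    def add_to_collection(self, name, value):
        # No explicit creation step: the list appears on first use.
        self._collections[name].append(value)

    def get_collection(self, name):
        # Return a copy; an unknown collection yields an empty list.
        return list(self._collections.get(name, []))


g = ToyGraph()
g.add_to_collection("my_collection_name", "my_local")
print(g.get_collection("my_collection_name"))  # ['my_local']
print(g.get_collection("never_used"))          # []
```

Any object can be stored, which matches the remark above that collections are not limited to variables.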

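Initializing variables

Initializing all variables

Call tf.global_variables_initializer() to initialize all variables in one go before training starts. The function returns a single op that initializes every variable in the tf.GraphKeys.GLOBAL_VARIABLES collection; running that op initializes them all.

sess.run(tf.global_variables_initializer())

Initializing a single variable

Run the variable's initializer op.

sess.run(my_variable.initializer)

Listing uninitialized variables

print(sess.run(tf.report_uninitialized_variables()))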

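Sharing variables

TensorFlow supports two ways of sharing variables:

Passing tf.Variable objects around explicitly.
Implicitly wrapping tf.Variable objects within tf.variable_scope objects.

variable_scope

Code example 1

A helper that creates variables named "weights" and "biases" with tf.get_variable; calling it under different variable scopes keeps the variables of each layer apart.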

def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
        initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
        initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)

Code example 2

Use variable_scope to declare separate scopes.

def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])

Sharing method 1

Set reuse=True.

with tf.variable_scope("model"):
  output1 = my_image_filter(input1)
with tf.variable_scope("model", reuse=True):
  output2 = my_image_filter(input2)

Sharing method 2

Call scope.reuse_variables() to trigger reuse.

with tf.variable_scope("model") as scope:
  output1 = my_image_filter(input1)
  scope.reuse_variables()
  output2 = my_image_filter(input2)
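The create-or-reuse behavior behind both sharing methods can be modelled in plain Python. `ToyVariableStore` is a hypothetical class for illustration only; it assumes just the two modes used above (create with reuse off, look up with reuse on) and ignores everything else tf.get_variable does.

```python
class ToyVariableStore:
    """Toy model of name-based variable reuse (not TensorFlow code)."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, name, shape=None, reuse=False):
        if reuse:
            # Reuse mode: the variable must already exist.
            if name not in self._vars:
                raise ValueError("Variable %s does not exist" % name)
            return self._vars[name]
        # Create mode: the name must still be free.
        if name in self._vars:
            raise ValueError("Variable %s already exists" % name)
        self._vars[name] = {"name": name, "shape": shape}
        return self._vars[name]


store = ToyVariableStore()
w1 = store.get_variable("model/conv1/weights", [5, 5, 32, 32])
w2 = store.get_variable("model/conv1/weights", reuse=True)
print(w1 is w2)  # True
```

Calling `get_variable` a second time with the same name and reuse left off raises ValueError in this model, which is why the second call to `my_image_filter` above needs reuse switched on.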

References

1.https://blog.csdn.net/MrR1ght/article/details/81228087
2.https://www.tensorflow.org/guide/variables?hl=zh_cn
3.https://www.tensorflow.org/api_docs/python/tf/get_variable?hl=zh_cn

tensorflow list of placeholder

Posted on 2019-05-12 | Category: tensorflow

list of placeholder

Goal

Given a list of placeholders defined in the computation graph, show how to pass values to them through feed_dict.

Code example

import tensorflow as tf
import numpy as np

# Create a list of n placeholders.
n = 4
ph_list = [tf.placeholder(tf.float32, [None, 10]) for _ in range(n)]
# Operate on the placeholder list: sum all of its entries.
result = tf.Variable(0.0)
for x in ph_list:
    result = tf.add(result, x)
hhhh = tf.log(result)


if __name__ == "__main__":
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())

    # Generate the input data.
    inputs = []
    for _ in range(n):
        x = np.random.rand(16, 10)
        inputs.append(x)
    # Build a dict that maps each placeholder to its value.
    feed_dictionary = {}
    for k,v in zip(ph_list, inputs):
        feed_dictionary[k] = v
    # Feed the data.
    print(sess.run(hhhh, feed_dict=feed_dictionary).shape)
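The loop that fills the feed dict can be written as a single `dict(zip(...))` call. The sketch below uses plain strings and lists as stand-ins for the placeholders and NumPy arrays so that it runs without TensorFlow; with real placeholders the same expression produces the dict passed as feed_dict.

```python
# Stand-ins for illustration only: strings instead of tf.placeholder objects,
# nested lists instead of NumPy arrays.
ph_list = ["ph_%d" % i for i in range(4)]
inputs = [[float(i)] * 3 for i in range(4)]

# Pair the i-th placeholder with the i-th input.
feed_dictionary = dict(zip(ph_list, inputs))

print(len(feed_dictionary))     # 4
print(feed_dictionary["ph_2"])  # [2.0, 2.0, 2.0]
```

Note that `zip` stops at the shorter sequence, so a length mismatch between the two lists silently leaves some placeholders without a value.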

References

1.https://stackoverflow.com/questions/51128427/how-to-feed-list-of-values-to-a-placeholder-list-in-tensorflow
