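Name scopes and variable scopes in TensorFlow

Name scopes and variable scopes make it easier to see in TensorBoard which namespace each variable belongs to and how inputs and outputs flow through the graph. When a network architecture is complex, creating a scope for the variables and ops of each block keeps the graph tidy. (The code below is taken from the official source and the comments are translated from the official docstring; for the original wording, see the docstring in the source or the official website.)

Source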

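Init signature:
tf.variable_scope(
    name_or_scope, # a <str> or an existing scope to open
    default_name=None, # name used when name_or_scope is None; it is made unique automatically
    values=None, # list of Tensor arguments passed to the op function
    initializer=None, # default initializer for variables in this scope
    regularizer=None, # default regularizer for variables in this scope
    caching_device=None, # default caching device for variables in this scope
    partitioner=None, # default partitioner for variables in this scope
    custom_getter=None, # default custom getter for variables in this scope
    reuse=None, # whether to reuse variables: True, None, or tf.AUTO_REUSE (None inherits the parent scope's reuse flag)
    dtype=None, # type of variables created in this scope (defaults to the type of the passed scope, or is inherited from the parent scope)
    use_resource=None, # False: all variables are regular variables. True: experimental ResourceVariables with well-defined semantics are used instead
    constraint=None, # optional projection function applied to a variable after it has been updated by an optimizer
    auxiliary_name_scope=True, # True: also create an auxiliary name scope. This argument is not inherited and only takes effect once, at creation, so it should only be used when re-entering a premade scope
)
# Returns a scope that can be captured and reused.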
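
As a small illustration of the scope-level defaults listed above, the sketch below opens a scope with a default initializer and dtype and checks that a variable created by get_variable picks them up. It assumes a TensorFlow installation that provides the tf.compat.v1 API; the scope name "layer" and the variable name "w" are made up for the example.

```python
import tensorflow as tf

# Build everything in an explicit graph so the v1 scope machinery is used
# regardless of whether eager execution is enabled by default.
with tf.Graph().as_default():
    with tf.compat.v1.variable_scope(
            "layer",  # hypothetical scope name
            initializer=tf.compat.v1.zeros_initializer(),
            dtype=tf.float64):
        # No initializer or dtype given here: both come from the scope.
        w = tf.compat.v1.get_variable("w", [2])

    with tf.compat.v1.Session() as sess:
        sess.run(tf.compat.v1.global_variables_initializer())
        w_value = sess.run(w)

print(w.name)                   # layer/w:0
print(w.dtype.base_dtype.name)  # float64
print(w_value)                  # [0. 0.]
```

Passing initializer or dtype directly to get_variable would override the scope defaults for that one variable.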

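Docstring:

A context manager for defining ops that create variables (layers).

This context manager validates that the (optional) values come from the same graph, ensures that this graph is the default graph, and pushes a name scope (name_scope) and a variable scope (variable_scope).

If name_or_scope is not None, it is used as is. If name_or_scope is None, default_name is used instead; in that case, if the same name has already been used in the same scope, _N is appended to make it unique, marking the N-th scope that shares that name prefix.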
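
The default_name rule can be checked with a short sketch. It assumes a TensorFlow installation with the tf.compat.v1 API; the name "block" is chosen only for illustration. Two scopes are opened with name_or_scope=None and the same default_name, and the second one receives a numeric suffix.

```python
import tensorflow as tf

with tf.Graph().as_default():
    # name_or_scope is None, so default_name is used and made unique.
    with tf.compat.v1.variable_scope(None, default_name="block") as first:
        w1 = tf.compat.v1.get_variable("w", [1])
    with tf.compat.v1.variable_scope(None, default_name="block") as second:
        w2 = tf.compat.v1.get_variable("w", [1])

print(first.name, second.name)  # block block_1
print(w1.name, w2.name)         # block/w:0 block_1/w:0
```

Because each scope gets its own prefix, the two get_variable("w", ...) calls create two distinct variables instead of colliding.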

A variable scope lets you create new variables and share already created ones, while checking that nothing is created or shared by accident.

Creating a new variable:

with tf.compat.v1.variable_scope("foo"):
    with tf.compat.v1.variable_scope("bar"):
        v = tf.compat.v1.get_variable("v", [1])
        assert v.name == "foo/bar/v:0"

Safely re-entering a premade variable scope:

with tf.compat.v1.variable_scope("foo") as vs:
  pass

# Re-enter the variable scope.
with tf.compat.v1.variable_scope(vs, auxiliary_name_scope=False) as vs1:
  # Restore the original name scope.
  with tf.name_scope(vs1.original_name_scope):
    v = tf.compat.v1.get_variable("v", [1])
    assert v.name == "foo/v:0"
    c = tf.constant([1], name="c")
    assert c.name == "foo/c:0"
>>> print(v)
<tf.Variable 'foo/v:0' shape=(1,) dtype=float32_ref>
>>> print(type(v))
<class 'tensorflow.python.ops.variables.RefVariable'>
>>> print(c)
Tensor("foo/c:0", shape=(1,), dtype=int32)
>>> print(type(c))
<class 'tensorflow.python.framework.ops.Tensor'>

Sharing variables with AUTO_REUSE:

def foo():
  with tf.compat.v1.variable_scope("foo", reuse=tf.compat.v1.AUTO_REUSE):
    v = tf.compat.v1.get_variable("v", [1])
  return v

v1 = foo()  # Creates v.
v2 = foo()  # Gets the same, existing v, i.e. v1.
assert v1 == v2
assert id(v1) == id(v2)
# In other words, v1 and v2 refer to the same object in memory.

Sharing variables with reuse=True:

with tf.compat.v1.variable_scope("foo"):
    v = tf.compat.v1.get_variable("v", [1])
with tf.compat.v1.variable_scope("foo", reuse=True):
    v1 = tf.compat.v1.get_variable("v", [1])
assert v1 == v
assert id(v1) == id(v)

Setting reuse within the current scope:

# Note: scope.reuse_variables() must be called after the variable has been created and before it is reused.
with tf.compat.v1.variable_scope("foo") as scope:
    v = tf.compat.v1.get_variable("v", [1])
    scope.reuse_variables()
    v1 = tf.compat.v1.get_variable("v", [1])
assert v1 == v

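To prevent accidental sharing of variables, a ValueError is raised when a variable with an existing name is requested again in a scope where reuse is not set: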

with tf.compat.v1.variable_scope("foo"):
    v = tf.compat.v1.get_variable("v", [1])
    v1 = tf.compat.v1.get_variable("v", [1])
    #  Raises ValueError("... v already exists ...").

Similarly, when reuse is True, a ValueError is raised if the variable is being defined for the first time (there is nothing to reuse):

with tf.compat.v1.variable_scope("foo", reuse=True):
    v = tf.compat.v1.get_variable("v", [1])
    #  Raises ValueError("... v does not exist ...").

Inheritance of reuse: once a reusing scope is opened, all of its sub-scopes become reusing as well.
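
A minimal sketch of this inheritance, assuming a TensorFlow installation with the tf.compat.v1 API (the scope names "outer" and "inner" are illustrative): the variable is first created in a nested scope, then the outer scope is reopened with reuse=True while the inner scope sets no reuse flag of its own.

```python
import tensorflow as tf

with tf.Graph().as_default():
    # First pass: create outer/inner/v.
    with tf.compat.v1.variable_scope("outer"):
        with tf.compat.v1.variable_scope("inner"):
            v = tf.compat.v1.get_variable("v", [1])

    # Second pass: only the outer scope asks for reuse.
    with tf.compat.v1.variable_scope("outer", reuse=True):
        with tf.compat.v1.variable_scope("inner") as inner:
            inner_reuse = inner.reuse  # inherited from "outer"
            v1 = tf.compat.v1.get_variable("v", [1])

print(inner_reuse)  # True
print(v1 is v)      # True
```

Without the inherited flag, the second get_variable call would raise the "already exists" ValueError described above.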

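A note on variable scopes in multi-threaded environments: variable scopes are thread-local, so one thread does not see another thread's current scope. In addition, when default_name is used, unique scope names are generated per thread only; if a name has already been used in one thread, a new thread can still create a scope with the same name.

However, the underlying variable store is shared across threads (within the same graph). Strictly speaking, a thread that tries to create a variable with the same name as one already created by another thread will only succeed when reuse is True.

Furthermore, each thread starts with an empty variable scope. To preserve the name prefixes of a scope from the main thread in other threads, capture the main thread's scope and re-enter it in each thread. For example: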

import threading

main_thread_scope = tf.compat.v1.get_variable_scope()

# Thread's target function:
def thread_target_fn(captured_scope):
  with tf.compat.v1.variable_scope(captured_scope):
    # .... regular code for this thread
    pass

thread = threading.Thread(target=thread_target_fn, args=(main_thread_scope,))

Examples

variable_scope

scope1 = "Test1"
with tf.variable_scope(scope1):
    testA = tf.get_variable("testA1", [3, 256, 256])
    testB = tf.get_variable("testA2", [3, 256, 256])
    testC = tf.multiply(testA, testB, name = "Mul-testA12")
    testD = tf.Variable([1],name="testD")
# testA: <tf.Variable 'Test1/testA1:0' shape=(3, 256, 256) dtype=float32_ref>
# testB: <tf.Variable 'Test1/testA2:0' shape=(3, 256, 256) dtype=float32_ref>
# testC: Tensor("Test1/Mul-testA12:0", shape=(3, 256, 256), dtype=float32)
# testD: <tf.Variable 'Test1/testD:0' shape=(1,) dtype=int32_ref>

As shown, variable_scope affects the names of both tf.get_variable variables and ops: anything named inside a variable scope is prefixed with the name of that scope.

name_scope

scope1 = "Test1"
with tf.name_scope(scope1):
    testA = tf.get_variable("testA1", [3, 256, 256])
    testB = tf.get_variable("testA2", [3, 256, 256])
    testC = tf.multiply(testA, testB, name = "Mul-testA12")
    testD = tf.Variable([1],name="testD")
# testA: <tf.Variable 'testA1:0' shape=(3, 256, 256) dtype=float32_ref>
# testB: <tf.Variable 'testA2:0' shape=(3, 256, 256) dtype=float32_ref>
# testC: Tensor("Test1/Mul-testA12:0", shape=(3, 256, 256), dtype=float32)
# testD: <tf.Variable 'Test1/testD:0' shape=(1,) dtype=int32_ref>

A name_scope, in contrast, does not affect variables obtained through tf.get_variable; it only prefixes the tensors produced by ops and the variables created with tf.Variable.

GeophyAI
