Introduction to modules, layers, and models

2025-12-04 21:30:56

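To do machine learning in TensorFlow, you are likely to need to define, save, and restore a model.

A model is, abstractly:

A function that computes something on tensors (a **forward pass**)

Some variables that can be updated in response to training

In this guide, you will go below the surface of Keras to see how TensorFlow models are defined. It covers how TensorFlow collects variables and models, and how they are saved and restored.

**Note:** If you want to start using Keras right away, see the collection of Keras guides.

Setup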

import tensorflow as tf

import keras

from datetime import datetime

%load_ext tensorboard

TensorFlow Modules

Most models are made of layers. Layers are functions with a known mathematical structure that can be reused and have trainable variables. In TensorFlow, most high-level implementations of layers and models, such as Keras or Sonnet, are built on the same foundational class: tf.Module.

Building Modules

Here is an example of a very simple tf.Module that operates on a scalar tensor:

class SimpleModule(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.a_variable = tf.Variable(5.0, name="train_me")
        self.non_trainable_variable = tf.Variable(5.0, trainable=False, name="do_not_train_me")

    def __call__(self, x):
        return self.a_variable * x + self.non_trainable_variable

simple_module = SimpleModule(name="simple")

simple_module(tf.constant(5.0))

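Modules and, by extension, layers are deep-learning terminology for "objects": they have internal state, and methods that use that state.

There is nothing special about __call__ except that it makes the object behave like a Python callable; you can invoke your models with whatever functions you wish.

You can set the trainability of variables on and off for any reason, including freezing layers and variables during fine-tuning.

**Note:** tf.Module is the base class for both tf.keras.layers.Layer and tf.keras.Model, so everything you come across here also applies in Keras. For historical compatibility reasons, Keras layers do not collect variables from modules, so your models should use only modules or only Keras layers. However, the methods shown below for inspecting variables are the same in either case.

By subclassing tf.Module, any tf.Variable or tf.Module instances assigned to this object's properties are automatically collected. This allows you to save and load variables, and also create collections of tf.Modules.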

# All trainable variables

print("trainable variables:", simple_module.trainable_variables)

# Every variable

print("all variables:", simple_module.variables)

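trainable variables: (<tf.Variable 'train_me:0' shape=() dtype=float32, numpy=5.0>,)

all variables: (<tf.Variable 'train_me:0' shape=() dtype=float32, numpy=5.0>, <tf.Variable 'do_not_train_me:0' shape=() dtype=float32, numpy=5.0>)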
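To make the role of the trainable flag concrete, here is a minimal sketch of one hand-written gradient-descent step on the same SimpleModule (the class is repeated so the block runs on its own). The input value 3.0 and the learning rate 0.1 are illustrative assumptions, not values from the guide.

```python
import tensorflow as tf


class SimpleModule(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.a_variable = tf.Variable(5.0, name="train_me")
        self.non_trainable_variable = tf.Variable(
            5.0, trainable=False, name="do_not_train_me")

    def __call__(self, x):
        return self.a_variable * x + self.non_trainable_variable


module = SimpleModule(name="simple")
x = tf.constant(3.0)          # illustrative input
learning_rate = 0.1           # illustrative step size

# GradientTape watches trainable variables automatically.
with tf.GradientTape() as tape:
    y = module(x)             # 5.0 * 3.0 + 5.0

# Only the trainable variable is differentiated and updated.
grads = tape.gradient(y, module.trainable_variables)
for var, grad in zip(module.trainable_variables, grads):
    var.assign_sub(learning_rate * grad)

print("trainable:", module.a_variable.numpy())
print("frozen:", module.non_trainable_variable.numpy())
```

Because the update loop iterates over trainable_variables, the variable created with trainable=False never changes, which is the mechanism behind freezing parts of a model.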

This is an example of a two-layer linear model made out of modules.

First, a dense (linear) layer:

class Dense(tf.Module):
    def __init__(self, in_features, out_features, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(
            tf.random.normal([in_features, out_features]), name='w')
        self.b = tf.Variable(tf.zeros([out_features]), name='b')

    def __call__(self, x):
        y = tf.matmul(x, self.w) + self.b
        return tf.nn.relu(y)

And then the complete model, which makes two layer instances and applies them:

class SequentialModule(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.dense_1 = Dense(in_features=3, out_features=3)
        self.dense_2 = Dense(in_features=3, out_features=2)

    def __call__(self, x):
        x = self.dense_1(x)
        return self.dense_2(x)

# You have made a model!
my_model = SequentialModule(name="the_model")

# Call it, with random results
print("Model results:", my_model(tf.constant([[2.0, 2.0, 2.0]])))

Model results: tf.Tensor([[0. 3.415034]], shape=(1, 2), dtype=float32)

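tf.Module instances automatically and recursively collect any tf.Variable or tf.Module instances assigned to them. This allows you to manage collections of tf.Modules with a single model instance, and to save and load whole models.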

print("Submodules:", my_model.submodules)

Submodules: (<__main__.Dense object at 0x7f7931aea250>, <__main__.Dense object at 0x7f77ed5b8a00>)

for var in my_model.variables:
    print(var, "\n")

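<tf.Variable 'b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>

<tf.Variable 'w:0' shape=(3, 3) dtype=float32, numpy=
array([[-2.8161757, -2.6065955,  1.9061812],
       [-0.9430401, -0.4624743, -0.4531979],
       [-1.3428234,  0.7062293,  0.7874674]], dtype=float32)>

<tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>

<tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
array([[ 1.0474309 , -0.6920227 ],
       [ 1.2405277 ,  0.36411622],
       [-1.6990206 ,  0.762131  ]], dtype=float32)>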

Waiting to create variables

You may have noticed that you have to define both the input and output sizes of the layer. This is so the w variable has a known shape and can be allocated.

By deferring variable creation to the first time the module is called with a specific input shape, you do not need to specify the input size up front.

class FlexibleDenseModule(tf.Module):
    # Note: No need for `in_features`
    def __init__(self, out_features, name=None):
        super().__init__(name=name)
        self.is_built = False
        self.out_features = out_features

    def __call__(self, x):
        # Create variables on first call.
        if not self.is_built:
            self.w = tf.Variable(
                tf.random.normal([x.shape[-1], self.out_features]), name='w')
            self.b = tf.Variable(tf.zeros([self.out_features]), name='b')
            self.is_built = True

        y = tf.matmul(x, self.w) + self.b
        return tf.nn.relu(y)

# Used in a module
class MySequentialModule(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.dense_1 = FlexibleDenseModule(out_features=3)
        self.dense_2 = FlexibleDenseModule(out_features=2)

    def __call__(self, x):
        x = self.dense_1(x)
        return self.dense_2(x)

my_model = MySequentialModule(name="the_model")

print("Model results:", my_model(tf.constant([[2.0, 2.0, 2.0]])))

Model results: tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)

This flexibility is why TensorFlow layers often only need to specify the shape of their outputs, such as in tf.keras.layers.Dense, rather than both the input and output size.
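As a small illustration of that point, the sketch below builds a tf.keras.layers.Dense layer from its output size alone and lets the first call determine the input size. The batch size of 4 and input width of 3 are arbitrary choices made for this example.

```python
import tensorflow as tf

# Only the output size (units) is given; the input size is unknown.
layer = tf.keras.layers.Dense(units=2)
weights_before = len(layer.weights)   # nothing has been created yet

# The first call sees inputs of width 3 and creates the weights to match.
outputs = layer(tf.ones([4, 3]))

kernel_shape = tuple(layer.kernel.shape)
bias_shape = tuple(layer.bias.shape)
print(weights_before, kernel_shape, bias_shape, tuple(outputs.shape))
```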

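Saving weights

You can save a tf.Module as both a checkpoint and a SavedModel.

Checkpoints are just the weights (that is, the values of the set of variables inside the module and its submodules):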

chkp_path = "my_checkpoint"

checkpoint = tf.train.Checkpoint(model=my_model)

checkpoint.write(chkp_path)

'my_checkpoint'

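Checkpoints consist of two kinds of files: the data itself and an index file for metadata. The index file keeps track of what is actually saved and the numbering of checkpoints, while the checkpoint data contains the variable values and their attribute lookup paths.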

ls my_checkpoint*

my_checkpoint.data-00000-of-00001 my_checkpoint.index

You can look inside a checkpoint to be sure the whole collection of variables is saved, sorted by the Python object that contains them.

tf.train.list_variables(chkp_path)

[('_CHECKPOINTABLE_OBJECT_GRAPH', []),

('model/dense_1/b/.ATTRIBUTES/VARIABLE_VALUE', [3]),

('model/dense_1/w/.ATTRIBUTES/VARIABLE_VALUE', [3, 3]),

('model/dense_2/b/.ATTRIBUTES/VARIABLE_VALUE', [2]),

('model/dense_2/w/.ATTRIBUTES/VARIABLE_VALUE', [3, 2])]

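During distributed (multi-machine) training they can be sharded, which is why they are numbered (for example, '00000-of-00001'). In this case, though, there is only one shard.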
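The attribute lookup paths listed by tf.train.list_variables can also be used to read a stored value directly. The following is a minimal sketch using tf.train.load_variable; the Holder module, its variable v, and the temporary path are hypothetical names introduced only for this example.

```python
import os
import tempfile

import tensorflow as tf


class Holder(tf.Module):
    """Hypothetical module holding a single variable."""

    def __init__(self, name=None):
        super().__init__(name=name)
        self.v = tf.Variable([1.0, 2.0, 3.0], name="v")


holder = Holder()
ckpt_prefix = os.path.join(tempfile.mkdtemp(), "holder_checkpoint")
tf.train.Checkpoint(model=holder).write(ckpt_prefix)

# Map each stored key to its shape, then read one value back by key.
stored = dict(tf.train.list_variables(ckpt_prefix))
key = "model/v/.ATTRIBUTES/VARIABLE_VALUE"
value = tf.train.load_variable(ckpt_prefix, key)
print(stored[key], value)
```

The key mirrors the Python attribute path: model is the keyword passed to tf.train.Checkpoint, and v is the attribute name on the module.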

When you load models back in, you overwrite the values in your Python object.

new_model = MySequentialModule()

new_checkpoint = tf.train.Checkpoint(model=new_model)

new_checkpoint.restore("my_checkpoint")

# Should be the same result as above

new_model(tf.constant([[2.0, 2.0, 2.0]]))

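Note: As checkpoints are at the heart of long training workflows, tf.train.CheckpointManager is a helper class that makes checkpoint management much easier. Refer to the Training checkpoints guide for more details.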
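As a rough sketch of what that helper provides, the example below saves three numbered checkpoints while keeping only the two most recent, then restores the latest one into a fresh object. The Counter module, the temporary directory, and max_to_keep=2 are assumptions made for illustration.

```python
import tempfile

import tensorflow as tf


class Counter(tf.Module):
    """Hypothetical module with one variable to checkpoint."""

    def __init__(self, name=None):
        super().__init__(name=name)
        self.count = tf.Variable(0.0, name="count")


counter = Counter()
checkpoint = tf.train.Checkpoint(model=counter)
manager = tf.train.CheckpointManager(
    checkpoint, directory=tempfile.mkdtemp(), max_to_keep=2)

# Save after each update; older checkpoints are deleted automatically.
for _ in range(3):
    counter.count.assign_add(1.0)
    manager.save()

# Restore the most recent checkpoint into a new object.
restored = Counter()
tf.train.Checkpoint(model=restored).restore(manager.latest_checkpoint)
print(manager.checkpoints, restored.count.numpy())
```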

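Saving functions

TensorFlow can run models without the original Python objects, as demonstrated by TensorFlow Serving and TensorFlow Lite, even when you download a trained model from TensorFlow Hub.

TensorFlow needs to know how to do the computations described in Python, but without the original code. To do this, you can make a graph, which is described in the Introduction to graphs and functions guide.

This graph contains operations, or ops, that implement the function.

You can define a graph in the model above by adding the @tf.function decorator to indicate that this code should run as a graph.

class MySequentialModule(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.dense_1 = Dense(in_features=3, out_features=3)
        self.dense_2 = Dense(in_features=3, out_features=2)

    @tf.function
    def __call__(self, x):
        x = self.dense_1(x)
        return self.dense_2(x)

# You have made a model with a graph!
my_model = MySequentialModule(name="the_model")

The module you have made works exactly the same as before. Each unique signature passed into the function creates a separate graph. Check the Introduction to graphs and functions guide for details.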

print(my_model([[2.0, 2.0, 2.0]]))

print(my_model([[[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]]))

tf.Tensor([[0.31593648 0. ]], shape=(1, 2), dtype=float32)

tf.Tensor(

[[[0.31593648 0. ]

[0.31593648 0. ]]], shape=(1, 2, 2), dtype=float32)
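One way to observe the one-graph-per-signature behavior is to record a Python side effect, which runs only while a function is being traced. This is a minimal sketch; the function double and the list trace_log are hypothetical names used only here.

```python
import tensorflow as tf

trace_log = []  # appended to only while tracing, not on every call


@tf.function
def double(x):
    # Python side effects run during tracing only.
    trace_log.append((x.shape, x.dtype))
    return x * 2


double(tf.constant([1.0, 2.0]))   # new signature: float32, shape (2,)
double(tf.constant([3.0, 4.0]))   # same signature: the graph is reused
double(tf.constant([[1.0]]))      # new shape: traced again
double(tf.constant([1, 2]))       # new dtype: traced again

print("number of traces:", len(trace_log))
```

Four calls produce three traces, because the second call matches the dtype and shape of the first and reuses its graph.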

You can visualize the graph by tracing it within a TensorBoard summary.

# Set up logging.

stamp = datetime.now().strftime("%Y%m%d-%H%M%S")

logdir = "logs/func/%s" % stamp

writer = tf.summary.create_file_writer(logdir)

# Create a new model to get a fresh trace

# Otherwise the summary will not see the graph.

new_model = MySequentialModule()

# Bracket the function call with

# tf.summary.trace_on() and tf.summary.trace_export().

tf.summary.trace_on(graph=True)

tf.profiler.experimental.start(logdir)

# Call only one tf.function when tracing.

z = new_model(tf.constant([[2.0, 2.0, 2.0]]))
print(z)

with writer.as_default():
    tf.summary.trace_export(
        name="my_func_trace",
        step=0,
        profiler_outdir=logdir)

tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)

Launch TensorBoard to view the resulting trace:

#docs_infra: no_execute

%tensorboard --logdir logs/func

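Creating a SavedModel

The recommended way of sharing completely trained models is to use SavedModel. A SavedModel contains both a collection of functions and a collection of weights.

You can save the model you have just created as follows: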

tf.saved_model.save(my_model, "the_saved_model")

INFO:tensorflow:Assets written to: the_saved_model/assets

# Inspect the SavedModel in the directory

ls -l the_saved_model

total 32

drwxr-sr-x 2 kbuilder kokoro 4096 Oct 18 01:21 assets

-rw-rw-r-- 1 kbuilder kokoro 58 Oct 18 01:21 fingerprint.pb

-rw-rw-r-- 1 kbuilder kokoro 17704 Oct 18 01:21 saved_model.pb

drwxr-sr-x 2 kbuilder kokoro 4096 Oct 18 01:21 variables

# The variables/ directory contains a checkpoint of the variables

ls -l the_saved_model/variables

total 8

-rw-rw-r-- 1 kbuilder kokoro 490 Oct 18 01:21 variables.data-00000-of-00001

-rw-rw-r-- 1 kbuilder kokoro 356 Oct 18 01:21 variables.index

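The saved_model.pb file is a protocol buffer describing the functional tf.Graph.

Models and layers can be loaded from this representation without actually making an instance of the class that created it. This is desirable in situations where you do not have (or want) a Python interpreter, such as serving at scale or on an edge device, or where the original Python code is not available or practical to use.

You can load the model as a new object: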

new_model = tf.saved_model.load("the_saved_model")

new_model, created from loading a saved model, is an internal TensorFlow user object without any of the class knowledge. It is not of type SequentialModule.

isinstance(new_model, SequentialModule)

False

This new model works on the already-defined input signatures. You can't add more signatures to a model restored like this.

print(new_model([[2.0, 2.0, 2.0]]))

print(new_model([[[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]]))

tf.Tensor([[0.31593648 0. ]], shape=(1, 2), dtype=float32)

tf.Tensor(

[[[0.31593648 0. ]

[0.31593648 0. ]]], shape=(1, 2, 2), dtype=float32)
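The two calls above succeed because both input shapes were traced before saving. The sketch below shows the other side of that rule: a module saved with a single traced signature rejects an input of a different shape after loading. The Scaler module and the temporary export directory are hypothetical, and catching both ValueError and TypeError is a defensive assumption about the exception type.

```python
import tempfile

import tensorflow as tf


class Scaler(tf.Module):
    """Hypothetical module that multiplies its input by a variable."""

    def __init__(self, name=None):
        super().__init__(name=name)
        self.scale = tf.Variable(2.0, name="scale")

    @tf.function
    def __call__(self, x):
        return self.scale * x


scaler = Scaler()
scaler(tf.constant([[1.0, 2.0, 3.0]]))   # trace float32 with shape (1, 3)

export_dir = tempfile.mkdtemp()
tf.saved_model.save(scaler, export_dir)
loaded = tf.saved_model.load(export_dir)

# The traced signature still works on the loaded object.
known = loaded(tf.constant([[1.0, 2.0, 3.0]]))

# A shape that was never traced cannot be added after loading.
try:
    loaded(tf.constant([[[1.0]]]))
    rejected = False
except (ValueError, TypeError) as err:
    rejected = True
    print("Rejected untraced signature:", type(err).__name__)
```

If a model needs to accept several shapes after loading, each of them has to be traced before the call to tf.saved_model.save.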

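So, using SavedModel, you can save TensorFlow weights and graphs with tf.Module, and then load them again.

Keras models and layers

Note that up until this point, there has been no mention of Keras. You can build your own high-level API on top of tf.Module, and people have.

In this section, you will examine how Keras uses tf.Module. A complete user guide to Keras models can be found in the Keras guide.

Keras layers and models have a lot more extra features, including:

Optional losses

Support for metrics

Built-in support for an optional training argument to differentiate between training and inference use

Saving and restoring Python objects instead of just black-box functions

get_config and from_config methods that allow you to accurately store configurations to allow model cloning in Python

These features allow far more complex models through subclassing, such as a custom GAN or a Variational AutoEncoder (VAE) model. Read about them in the full guide to custom layers and models.

Keras models also come with extra functionality that makes them easy to train, evaluate, load, save, and even train on multiple machines.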
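To illustrate the get_config and from_config item in the list above, here is a minimal sketch of a custom layer whose constructor argument survives a round trip through its configuration. The Scale layer and its factor argument are hypothetical and exist only for this example.

```python
import tensorflow as tf


class Scale(tf.keras.layers.Layer):
    """Hypothetical layer that multiplies its input by a fixed factor."""

    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, x):
        return x * self.factor

    def get_config(self):
        # Extend the base config (name, dtype, ...) with this layer's argument.
        config = super().get_config()
        config.update({"factor": self.factor})
        return config


layer = Scale(factor=3.0, name="scale")

# from_config rebuilds an equivalent layer from the stored configuration.
clone = Scale.from_config(layer.get_config())
result = clone(tf.constant([1.0, 2.0]))
print(clone.name, clone.factor, result.numpy())
```

The clone is a new Python object of the same class, which is what distinguishes this path from loading a SavedModel as a black-box function.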

Keras layers

tf.keras.layers.Layer is the base class of all Keras layers, and it inherits from tf.Module.

You can convert a module into a Keras layer just by swapping out the parent and then changing __call__ to call:

class MyDense(tf.keras.layers.Layer):
    # Adding **kwargs to support base Keras layer arguments
    def __init__(self, in_features, out_features, **kwargs):
        super().__init__(**kwargs)

        # This will soon move to the build step; see below
        self.w = tf.Variable(
            tf.random.normal([in_features, out_features]), name='w')
        self.b = tf.Variable(tf.zeros([out_features]), name='b')

    def call(self, x):
        y = tf.matmul(x, self.w) + self.b
        return tf.nn.relu(y)

simple_layer = MyDense(name="simple", in_features=3, out_features=3)

Keras layers have their own __call__ that does some bookkeeping described in the next section and then calls call(). You should notice no change in functionality.

simple_layer([[2.0, 2.0, 2.0]])

The build step

As noted, it is convenient in many cases to wait to create variables until you are sure of the input shape.

Keras layers come with an extra lifecycle step that allows you more flexibility in how you define your layers. This is defined in the build function.

build is called exactly once, and it is called with the shape of the input. It is usually used to create variables (weights).

You can rewrite the MyDense layer above to be flexible to the size of its inputs:

class FlexibleDense(tf.keras.layers.Layer):
    # Note the added `**kwargs`, as Keras supports many arguments
    def __init__(self, out_features, **kwargs):
        super().__init__(**kwargs)
        self.out_features = out_features

    def build(self, input_shape):  # Create the state of the layer (weights)
        self.w = tf.Variable(
            tf.random.normal([input_shape[-1], self.out_features]), name='w')
        self.b = tf.Variable(tf.zeros([self.out_features]), name='b')

    def call(self, inputs):  # Defines the computation from inputs to outputs
        return tf.matmul(inputs, self.w) + self.b

# Create the instance of the layer
flexible_dense = FlexibleDense(out_features=3)

At this point, the model has not been built, so there are no variables:

flexible_dense.variables

[]

Calling the layer allocates appropriately sized variables:

# Call it, with predictably random results

print("Model results:", flexible_dense(tf.constant([[2.0, 2.0, 2.0], [3.0, 3.0, 3.0]])))

Model results: tf.Tensor(

[[-2.531786 -5.5550847 -0.4248762]

[-3.7976792 -8.332626 -0.6373143]], shape=(2, 3), dtype=float32)

flexible_dense.variables

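[<tf.Variable 'flexible_dense/w:0' shape=(3, 3) dtype=float32, numpy=
 array([[-0.77719826, -1.9281565 ,  0.82326293],
        [ 0.85628736, -0.31845194,  0.10916236],
        [-1.3449821 , -0.5309338 , -1.1448634 ]], dtype=float32)>,
 <tf.Variable 'flexible_dense/b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]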

Since build is only called once, inputs are rejected if their shape is not compatible with the layer's variables:

try:
    print("Model results:", flexible_dense(tf.constant([[2.0, 2.0, 2.0, 2.0]])))
except tf.errors.InvalidArgumentError as e:
    print("Failed:", e)

Failed: Exception encountered when calling layer 'flexible_dense' (type FlexibleDense).

{ {function_node __wrapped__MatMul_device_/job:localhost/replica:0/task:0/device:CPU:0} } Matrix size-incompatible: In[0]: [1,4], In[1]: [3,3] [Op:MatMul] name:

Call arguments received by layer 'flexible_dense' (type FlexibleDense):

• inputs=tf.Tensor(shape=(1, 4), dtype=float32)
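The build-once behavior can be checked directly with a counter. The sketch below is a variant of the flexible layer that creates its weights with add_weight and records how often build runs; CountingDense and build_calls are hypothetical names, and using add_weight instead of tf.Variable is a choice made for this example rather than the approach shown in the guide.

```python
import tensorflow as tf


class CountingDense(tf.keras.layers.Layer):
    """Hypothetical dense layer that counts how often build() runs."""

    def __init__(self, out_features, **kwargs):
        super().__init__(**kwargs)
        self.out_features = out_features
        self.build_calls = 0

    def build(self, input_shape):
        self.build_calls += 1
        # The input width is only known here, at first call.
        self.w = self.add_weight(
            name="w", shape=(input_shape[-1], self.out_features),
            initializer="random_normal", trainable=True)
        self.b = self.add_weight(
            name="b", shape=(self.out_features,),
            initializer="zeros", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b


layer = CountingDense(out_features=2)
first = layer(tf.ones([1, 3]))    # triggers build with input width 3
second = layer(tf.ones([4, 3]))   # same width, different batch size

print(layer.build_calls, tuple(layer.w.shape), tuple(second.shape))
```

The batch size can change between calls because only the last input dimension is baked into the weight shapes.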

Keras models

You can define your model as nested Keras layers.

However, Keras also provides a full-featured model class called tf.keras.Model. It inherits from tf.keras.layers.Layer, so a Keras model can be used and nested in the same way as Keras layers. Keras models come with extra functionality that makes them easy to train, evaluate, load, save, and even train on multiple machines.

You can define the SequentialModule from above with nearly identical code, again converting __call__ to call() and changing the parent:

@keras.saving.register_keras_serializable()
class MySequentialModel(tf.keras.Model):
    def __init__(self, name=None, **kwargs):
        super().__init__(**kwargs)
        self.dense_1 = FlexibleDense(out_features=3)
        self.dense_2 = FlexibleDense(out_features=2)

    def call(self, x):
        x = self.dense_1(x)
        return self.dense_2(x)

# You have made a Keras model!
my_sequential_model = MySequentialModel(name="the_model")

# Call it on a tensor, with random results
print("Model results:", my_sequential_model(tf.constant([[2.0, 2.0, 2.0]])))

Model results: tf.Tensor([[ 0.26034355 16.431221 ]], shape=(1, 2), dtype=float32)

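All the same features are available, including tracking variables and submodules.

Note: A raw tf.Module nested inside a Keras layer or model will not get its variables collected for training or saving. Instead, nest Keras layers inside of Keras layers.

my_sequential_model.variables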

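[<tf.Variable 'my_sequential_model/flexible_dense_1/w:0' shape=(3, 3) dtype=float32, numpy=
 array([[ 1.4749854 ,  0.16090827,  2.2669017 ],
        [ 1.6850946 ,  1.1545411 ,  0.1707306 ],
        [ 0.8753734 , -0.13549292,  0.08751986]], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_1/b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_2/w:0' shape=(3, 2) dtype=float32, numpy=
 array([[-0.8022977 ,  1.9773549 ],
        [-0.76657015, -0.8485579 ],
        [ 1.6919082 ,  0.49000967]], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_2/b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>]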

my_sequential_model.submodules

(<__main__.FlexibleDense at 0x7f790c7e0e80>,

<__main__.FlexibleDense at 0x7f790c7e6940>)

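Overriding tf.keras.Model is a very Pythonic approach to building TensorFlow models. If you are migrating models from other frameworks, this can be very straightforward.

If you are constructing models that are simple assemblages of existing layers and inputs, you can save time and space by using the functional API, which comes with additional features around model reconstruction and architecture.

Here is the same model with the functional API: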

inputs = tf.keras.Input(shape=[3,])

x = FlexibleDense(3)(inputs)

x = FlexibleDense(2)(x)

my_functional_model = tf.keras.Model(inputs=inputs, outputs=x)

my_functional_model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                      Output Shape        Param #
=================================================================
 input_1 (InputLayer)              [(None, 3)]         0
 flexible_dense_3 (FlexibleDense)  (None, 3)           12
 flexible_dense_4 (FlexibleDense)  (None, 2)           8
=================================================================
Total params: 20 (80.00 Byte)
Trainable params: 20 (80.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

my_functional_model(tf.constant([[2.0, 2.0, 2.0]]))

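The major difference here is that the input shape is specified up front as part of the functional construction process. The input_shape argument in this case does not have to be completely specified; you can leave some dimensions as None.

Note: You do not need to specify input_shape or an InputLayer in a subclassed model; these arguments and layers will be ignored.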
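As a brief sketch of leaving a dimension unspecified, the functional model below declares its input as (None, 3), so the middle dimension (for example, a sequence length) may vary between calls. The built-in tf.keras.layers.Dense layer and the concrete input sizes are choices made for this example.

```python
import tensorflow as tf

# The first None is the implicit batch dimension; the explicit None
# leaves one more dimension open.
inputs = tf.keras.Input(shape=(None, 3))
outputs = tf.keras.layers.Dense(2)(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

short = model(tf.zeros([1, 5, 3]))   # 5 steps
long = model(tf.zeros([2, 7, 3]))    # 7 steps, different batch size

print(tuple(short.shape), tuple(long.shape))
```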

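Saving Keras models

Keras models have their own specialized zip archive saving format, marked by the .keras extension. When calling tf.keras.Model.save, add a .keras extension to the filename. For example: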

my_sequential_model.save("exname_of_file.keras")

Just as easily, they can be loaded back in:

reconstructed_model = tf.keras.models.load_model("exname_of_file.keras")

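Keras zip archives (.keras files) also save metric, loss, and optimizer states.

This reconstructed model can be used and will produce the same result when called on the same data: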

reconstructed_model(tf.constant([[2.0, 2.0, 2.0]]))
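The claim that a reloaded model reproduces the original outputs can be checked numerically. This is a minimal sketch that assumes a TensorFlow version recent enough to support the .keras format; it uses a small model of built-in layers and a temporary file path, both chosen only for this example.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# A small model made of built-in layers, so no custom objects are needed.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(2),
])

path = os.path.join(tempfile.mkdtemp(), "example_model.keras")
model.save(path)
reloaded = tf.keras.models.load_model(path)

# Compare the outputs of both models on the same input.
x = tf.constant([[2.0, 2.0, 2.0]])
same = np.allclose(model(x).numpy(), reloaded(x).numpy())
print("outputs match:", same)
```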

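Checkpointing Keras models

Keras models can also be checkpointed, and that looks the same as for tf.Module.

There is more to know about saving and serialization of Keras models, including providing configuration methods for custom layers for feature support. Check out the guide to saving and serialization.

What's next

If you want to know more details about Keras, you can follow the existing Keras guides.

Another example of a high-level API built on tf.Module is Sonnet from DeepMind, which is covered on their site.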
