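Deep Learning with Python, Chapter 7: A Deep Dive into Keras
Overview
Chapter 7 explores the advanced features of Keras, including different approaches to building, training, and evaluating models. It covers the three ways to build a Keras model (the Sequential model, the Functional API, and Model subclassing), and shows how to use the built-in training and evaluation loops, write custom training loops, and monitor training with TensorBoard. After this chapter, readers should be able to apply these workflows to more complex problems.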

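Key Topics
- Ways to build Keras models
  - Sequential model: suited to simple stacks of layers.
  - Functional API: suited to multi-input, multi-output, and other complex model structures.
  - Model subclassing: suited to cases that need full control over model behavior.
- Using the built-in training and evaluation loops
  - Compiling the model: use `compile()` to specify the optimizer, loss function, and metrics.
  - Training the model: use `fit()`.
  - Evaluation and prediction: use `evaluate()` and `predict()`.
- Custom training loops
  - Low-level training loop: implement the forward pass, backward pass, and weight updates by hand.
  - Using `tf.function` for performance: compile the training step into a computation graph to speed it up.
- Using Keras callbacks
  - Built-in callbacks: for example `EarlyStopping` and `ModelCheckpoint`.
  - Custom callbacks: subclass `keras.callbacks.Callback` to implement specific behavior.
- Monitoring and visualization
  - TensorBoard: monitor the training process and visualize model structure and metrics.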
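
The built-in loop listed above (`compile()`, `fit()`, `evaluate()`, `predict()`) can be sketched end to end as follows. This is a minimal illustration rather than code from the chapter: the random data, layer sizes, and the 192/64 train/validation split are assumptions chosen only so that the example runs quickly.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic stand-in data (an assumption): 256 samples, 8 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8)).astype("float32")
y = rng.integers(0, 3, size=(256,))

model = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(3, activation="softmax"),
])

# compile(): choose the optimizer, the loss, and the metrics to report.
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# fit(): run the built-in training loop, validating after each epoch.
history = model.fit(x[:192], y[:192], epochs=2, batch_size=32,
                    validation_data=(x[192:], y[192:]), verbose=0)

# evaluate(): returns the loss followed by the compiled metrics.
loss, accuracy = model.evaluate(x[192:], y[192:], verbose=0)

# predict(): returns one row of class probabilities per input sample.
probabilities = model.predict(x[:5], verbose=0)
print(probabilities.shape)
```

Because the labels are random, the accuracy itself is meaningless here; the point is only the order of the four calls.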
Key Code and Algorithms
7.2.1 The Sequential model

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax")
    ])

    # Build the model incrementally
    model = keras.Sequential()
    model.add(layers.Dense(64, activation="relu"))
    model.add(layers.Dense(10, activation="softmax"))

    # Declare the input shape up front
    model = keras.Sequential()
    model.add(keras.Input(shape=(3,)))
    model.add(layers.Dense(64, activation="relu"))
    model.summary()
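
A Sequential model that is created without an input shape has no weights until it sees one. As a small sketch of that point, the example below calls `build()` explicitly and inspects the resulting weight shapes; the input width of 3 mirrors the `keras.Input(shape=(3,))` used above, and the variable names are illustrative.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# build() creates the weights for a given input shape.
# None stands for an arbitrary batch size.
model.build(input_shape=(None, 3))

# Each Dense layer contributes a kernel and a bias.
weight_shapes = [tuple(w.shape) for w in model.weights]
print(weight_shapes)
print(model.count_params())
```

The parameter count follows directly from the layer sizes: 3 * 64 + 64 for the first layer plus 64 * 10 + 10 for the second.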
7.2.2 The Functional API

    inputs = keras.Input(shape=(3,), name="my_input")
    features = layers.Dense(64, activation="relu")(inputs)
    outputs = layers.Dense(10, activation="softmax")(features)
    model = keras.Model(inputs=inputs, outputs=outputs)

    # Multi-input, multi-output model
    vocabulary_size = 10000
    num_tags = 100
    num_departments = 4

    title = keras.Input(shape=(vocabulary_size,), name="title")
    text_body = keras.Input(shape=(vocabulary_size,), name="text_body")
    tags = keras.Input(shape=(num_tags,), name="tags")

    features = layers.Concatenate()([title, text_body, tags])
    features = layers.Dense(64, activation="relu")(features)

    priority = layers.Dense(1, activation="sigmoid", name="priority")(features)
    department = layers.Dense(num_departments, activation="softmax", name="department")(features)

    model = keras.Model(inputs=[title, text_body, tags], outputs=[priority, department])
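
To show how a multi-input, multi-output model like the one above is fed, here is a hedged sketch that trains it on random placeholder data. The smaller sizes (`vocabulary_size = 100`, 16 hidden units, 64 samples), the choice of one loss per output, and the `*_data` array names are assumptions made for the sake of a fast, self-contained example.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Sizes reduced from the text so the sketch runs quickly (assumption).
vocabulary_size, num_tags, num_departments = 100, 10, 4
num_samples = 64

title = keras.Input(shape=(vocabulary_size,), name="title")
text_body = keras.Input(shape=(vocabulary_size,), name="text_body")
tags = keras.Input(shape=(num_tags,), name="tags")
features = layers.Concatenate()([title, text_body, tags])
features = layers.Dense(16, activation="relu")(features)
priority = layers.Dense(1, activation="sigmoid", name="priority")(features)
department = layers.Dense(num_departments, activation="softmax",
                          name="department")(features)
model = keras.Model(inputs=[title, text_body, tags],
                    outputs=[priority, department])

# Random placeholder inputs and targets, one array per input and per output.
rng = np.random.default_rng(0)
title_data = rng.integers(0, 2, size=(num_samples, vocabulary_size)).astype("float32")
text_body_data = rng.integers(0, 2, size=(num_samples, vocabulary_size)).astype("float32")
tags_data = rng.integers(0, 2, size=(num_samples, num_tags)).astype("float32")
priority_data = rng.random(size=(num_samples, 1)).astype("float32")
department_labels = rng.integers(0, num_departments, size=num_samples)
department_data = np.eye(num_departments, dtype="float32")[department_labels]

# One loss per output, listed in the same order as the model outputs.
model.compile(optimizer="rmsprop",
              loss=["mean_squared_error", "categorical_crossentropy"])
model.fit([title_data, text_body_data, tags_data],
          [priority_data, department_data],
          epochs=1, verbose=0)

priority_preds, department_preds = model.predict(
    [title_data, text_body_data, tags_data], verbose=0)
print(priority_preds.shape, department_preds.shape)
```

Inputs and targets are passed as lists in the same order as the `inputs` and `outputs` arguments of `keras.Model`.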
7.2.3 Subclassing the Model class

    class CustomerTicketModel(keras.Model):

        def __init__(self, num_departments):
            super().__init__()
            # Layers are created once here and reused in call()
            self.concat_layer = layers.Concatenate()
            self.mixing_layer = layers.Dense(64, activation="relu")
            self.priority_scorer = layers.Dense(1, activation="sigmoid")
            self.department_classifier = layers.Dense(num_departments, activation="softmax")

        def call(self, inputs):
            # The forward pass: inputs is a dict keyed by input name
            title = inputs["title"]
            text_body = inputs["text_body"]
            tags = inputs["tags"]
            features = self.concat_layer([title, text_body, tags])
            features = self.mixing_layer(features)
            priority = self.priority_scorer(features)
            department = self.department_classifier(features)
            return priority, department

    model = CustomerTicketModel(num_departments=4)
    # title_data, text_body_data, and tags_data are input arrays that are
    # not defined in this summary.
    priority, department = model({"title": title_data, "text_body": text_body_data, "tags": tags_data})
7.3.1 Writing your own metrics

    import tensorflow as tf

    class RootMeanSquaredError(keras.metrics.Metric):

        def __init__(self, name="rmse", **kwargs):
            super().__init__(name=name, **kwargs)
            # State variables accumulated across batches
            self.mse_sum = self.add_weight(name="mse_sum", initializer="zeros")
            self.total_samples = self.add_weight(name="total_samples", initializer="zeros", dtype="int32")

        def update_state(self, y_true, y_pred, sample_weight=None):
            # Integer labels are one-hot encoded to match the predictions
            y_true = tf.one_hot(y_true, depth=tf.shape(y_pred)[1])
            mse = tf.reduce_sum(tf.square(y_true - y_pred))
            self.mse_sum.assign_add(mse)
            num_samples = tf.shape(y_pred)[0]
            self.total_samples.assign_add(num_samples)

        def result(self):
            return tf.sqrt(self.mse_sum / tf.cast(self.total_samples, tf.float32))

        def reset_state(self):
            self.mse_sum.assign(0.)
            self.total_samples.assign(0)

    # get_mnist_model() and the MNIST arrays are not defined in this summary.
    model = get_mnist_model()
    model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy", metrics=["accuracy", RootMeanSquaredError()])
    model.fit(train_images, train_labels, epochs=3, validation_data=(val_images, val_labels))
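
The three methods a custom metric implements (`update_state()`, `result()`, `reset_state()`) are the same ones every Keras metric exposes, so the lifecycle can be tried out on built-in metrics first. The sketch below uses `keras.metrics.Mean` and the built-in `keras.metrics.RootMeanSquaredError`; note that the built-in RMSE averages over all elements and does no one-hot encoding, so it is not a drop-in equivalent of the custom class above. The input values are arbitrary.

```python
import math
from tensorflow import keras

# A metric accumulates state across calls to update_state().
mean_metric = keras.metrics.Mean()
for value in [0.0, 1.0, 2.0, 3.0, 4.0]:
    mean_metric.update_state(value)
running_mean = float(mean_metric.result())

# reset_state() clears the accumulated state, e.g. between epochs.
mean_metric.reset_state()
mean_metric.update_state(10.0)
mean_after_reset = float(mean_metric.result())

# Built-in RMSE on two predictions with errors 3 and 4.
rmse_metric = keras.metrics.RootMeanSquaredError()
rmse_metric.update_state([[0.0], [0.0]], [[3.0], [4.0]])
rmse = float(rmse_metric.result())

print(running_mean, mean_after_reset, rmse)
```

The RMSE value works out to sqrt((9 + 16) / 2), which is a convenient hand check when writing a metric of your own.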
7.3.2 Using callbacks

    callbacks_list = [
        # Stop training once validation accuracy has not improved for 2 epochs
        keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=2),
        # Keep only the checkpoint with the best validation loss
        keras.callbacks.ModelCheckpoint(filepath="checkpoint_path.keras", monitor="val_loss", save_best_only=True)
    ]

    model = get_mnist_model()
    model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    model.fit(train_images, train_labels, epochs=10, callbacks=callbacks_list, validation_data=(val_images, val_labels))
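
Custom callbacks, mentioned in the topic list but not shown above, follow the same pattern: subclass `keras.callbacks.Callback` and override the hooks you need. The following is a minimal sketch; the class name `EpochLossRecorder`, the random data, and the tiny model are hypothetical and exist only to make the example runnable.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


class EpochLossRecorder(keras.callbacks.Callback):
    """Hypothetical callback that stores the training loss after each epoch."""

    def on_train_begin(self, logs=None):
        self.epoch_losses = []

    def on_epoch_end(self, epoch, logs=None):
        # logs holds the values fit() reports for this epoch.
        logs = logs or {}
        self.epoch_losses.append(logs.get("loss"))


# Random stand-in data (assumption): 128 samples, 8 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(128, 8)).astype("float32")
y = rng.integers(0, 3, size=(128,))

model = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")

recorder = EpochLossRecorder()
model.fit(x, y, epochs=3, batch_size=32, callbacks=[recorder], verbose=0)
print(recorder.epoch_losses)
```

Other hooks such as `on_batch_end` or `on_train_end` can be overridden in the same way when finer-grained or end-of-run behavior is needed.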
7.3.4 Monitoring and visualization with TensorBoard

    tensorboard = keras.callbacks.TensorBoard(log_dir="/full_path_to_your_log_dir")
    model.fit(train_images, train_labels, epochs=10, validation_data=(val_images, val_labels), callbacks=[tensorboard])
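
As a self-contained variant of the snippet above, the sketch below writes logs to a temporary directory and lists the files that the callback produced. The temporary directory, random data, and small model are assumptions for illustration; in practice you would point `log_dir` at a stable path of your own.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Random stand-in data (assumption): 128 samples, 8 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(128, 8)).astype("float32")
y = rng.integers(0, 3, size=(128,))

model = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Temporary log directory used only for this sketch.
log_dir = tempfile.mkdtemp()
tensorboard = keras.callbacks.TensorBoard(log_dir=log_dir)
model.fit(x[:96], y[:96], epochs=2, batch_size=32,
          validation_data=(x[96:], y[96:]),
          callbacks=[tensorboard], verbose=0)

# Collect every file the callback wrote under log_dir.
logged_files = [os.path.join(root, name)
                for root, _, names in os.walk(log_dir)
                for name in names]
print(logged_files)
```

The logs can then be viewed by running `tensorboard --logdir` with the same directory from a terminal and opening the address it prints.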
7.4.3 A complete custom training loop

    import tensorflow as tf

    model = get_mnist_model()
    loss_fn = keras.losses.SparseCategoricalCrossentropy()
    optimizer = keras.optimizers.RMSprop()
    metrics = [keras.metrics.SparseCategoricalAccuracy()]
    loss_tracking_metric = keras.metrics.Mean()

    def train_step(inputs, targets):
        # Forward pass, recorded so that gradients can be computed
        with tf.GradientTape() as tape:
            predictions = model(inputs, training=True)
            loss = loss_fn(targets, predictions)
        # Backward pass and weight update
        gradients = tape.gradient(loss, model.trainable_weights)
        optimizer.apply_gradients(zip(gradients, model.trainable_weights))

        # Update the metrics and collect their current values
        logs = {}
        for metric in metrics:
            metric.update_state(targets, predictions)
            logs[metric.name] = metric.result()
        loss_tracking_metric.update_state(loss)
        logs["loss"] = loss_tracking_metric.result()
        return logs

    def reset_metrics():
        for metric in metrics:
            metric.reset_state()
        loss_tracking_metric.reset_state()

    training_dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
    training_dataset = training_dataset.batch(32)

    epochs = 3
    for epoch in range(epochs):
        reset_metrics()
        for inputs_batch, targets_batch in training_dataset:
            logs = train_step(inputs_batch, targets_batch)
        print(f"Results at the end of epoch {epoch}")
        for key, value in logs.items():
            print(f"...{key}: {value:.4f}")
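
The `tf.function` optimization named in the topic list amounts to decorating the step function so that TensorFlow traces it into a graph on the first call. The sketch below applies it to a stripped-down training step; the random data, model size, and the decision to return only the loss are assumptions made to keep the example short.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Random stand-in data (assumption): 256 samples, 8 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8)).astype("float32")
y = rng.integers(0, 3, size=(256,))

model = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(3, activation="softmax"),
])
loss_fn = keras.losses.SparseCategoricalCrossentropy()
optimizer = keras.optimizers.RMSprop()


@tf.function  # traced into a TensorFlow graph on the first call
def train_step(inputs, targets):
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
        loss = loss_fn(targets, predictions)
    gradients = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    return loss


dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)
losses = []
for epoch in range(2):
    for inputs_batch, targets_batch in dataset:
        losses.append(float(train_step(inputs_batch, targets_batch)))
print(len(losses), losses[-1])
```

A common way to work is to debug the step in eager mode first and add the decorator once it behaves correctly, since graph execution is harder to step through.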
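Notable Quotes
- Paraphrase: Keras offers a range of workflows, from simple to complex, and all of them work together seamlessly.
  Original: Keras offers a spectrum of different workflows, based on the principle of progressive disclosure of complexity. They all smoothly inter-operate together.
  Explanation: This highlights the flexibility and modular design of Keras.
- Paraphrase: The Functional API offers a good compromise between ease of use and flexibility.
  Original: The Functional API provides you with a pretty good trade-off between ease of use and flexibility.
  Explanation: This sums up the strength of the Functional API, which fits most complex models.
- Paraphrase: Using `tf.function` can speed up training significantly.
  Original: Using tf.function can make your custom loops run significantly faster.
  Explanation: This points to the performance gain from compiling TensorFlow functions.
- Paraphrase: By overriding the `train_step` method, you can customize the training logic.
  Original: You can provide a custom training step function and let the framework do the rest.
  Explanation: This describes how to implement a custom training algorithm while still using `fit()`.
- Paraphrase: TensorBoard is the best tool for monitoring and visualizing the training process.
  Original: TensorBoard is the best way to monitor everything that goes on inside your model during training.
  Explanation: This underlines how important TensorBoard is during model development.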
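
The `train_step` quote above describes a middle path between `fit()` and a fully manual loop. A minimal sketch of it, assuming a TensorFlow backend, could look like the following; the class name `CustomModel`, the module-level `loss_fn` and `loss_tracker`, and the random data are illustrative choices rather than anything fixed by the chapter summary.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

loss_fn = keras.losses.SparseCategoricalCrossentropy()
loss_tracker = keras.metrics.Mean(name="loss")


class CustomModel(keras.Model):
    def train_step(self, data):
        # fit() hands each batch to this method as an (inputs, targets) pair.
        inputs, targets = data
        with tf.GradientTape() as tape:
            predictions = self(inputs, training=True)
            loss = loss_fn(targets, predictions)
        gradients = tape.gradient(loss, self.trainable_weights)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_weights))
        loss_tracker.update_state(loss)
        return {"loss": loss_tracker.result()}

    @property
    def metrics(self):
        # Listing the tracker here lets fit() reset it at the start of each epoch.
        return [loss_tracker]


# Random stand-in data (assumption): 128 samples, 8 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(128, 8)).astype("float32")
y = rng.integers(0, 3, size=(128,))

inputs = keras.Input(shape=(8,))
features = layers.Dense(16, activation="relu")(inputs)
outputs = layers.Dense(3, activation="softmax")(features)
model = CustomModel(inputs, outputs)

# Only the optimizer is passed: the loss is handled inside train_step().
model.compile(optimizer=keras.optimizers.RMSprop())
history = model.fit(x, y, epochs=2, batch_size=32, verbose=0)
print(history.history["loss"])
```

Callbacks, progress reporting, and epoch handling still come from `fit()`; only the per-batch logic is replaced.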
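Summary
This chapter covers the advanced features of Keras: building complex models, writing custom training loops, and monitoring training with TensorBoard. Together they provide a practical toolkit for real-world problems.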
