Mirror of https://github.com/fchollet/deep-learning-with-python-notebooks.git (synced 2021-07-27 01:28:40 +03:00)

Commit: Switch to 2nd edition notebooks -- let's go!
2  LICENSE
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2017 François Chollet
+Copyright (c) 2017-present François Chollet
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
57  README.md
@@ -1,34 +1,35 @@
 # Companion Jupyter notebooks for the book "Deep Learning with Python"
 
-This repository contains Jupyter notebooks implementing the code samples found in the book [Deep Learning with Python (Manning Publications)](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff). Note that the original text of the book features far more content than you will find in these notebooks, in particular further explanations and figures. Here we have only included the code samples themselves and immediately related surrounding comments.
-
-These notebooks use Python 3.6 and Keras 2.0.8. They were generated on a p2.xlarge EC2 instance.
+This repository contains Jupyter notebooks implementing the code samples found in the book [Deep Learning with Python, 2nd Edition (Manning Publications)](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff).
+
+For readability, these notebooks only contain runnable code blocks and section titles, and omit everything else in the book: text paragraphs, figures, and pseudocode.
+
+**If you want to be able to follow what's going on, I recommend reading the notebooks side by side with your copy of the book.**
+
+These notebooks use Python 3.7 and TensorFlow 2.6. They were generated on a p2.xlarge EC2 instance.
 
 ## Table of contents
 
-* Chapter 2:
-    * [2.1: A first look at a neural network](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/2.1-a-first-look-at-a-neural-network.ipynb)
-* Chapter 3:
-    * [3.5: Classifying movie reviews](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/3.5-classifying-movie-reviews.ipynb)
-    * [3.6: Classifying newswires](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/3.6-classifying-newswires.ipynb)
-    * [3.7: Predicting house prices](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/3.7-predicting-house-prices.ipynb)
-* Chapter 4:
-    * [4.4: Underfitting and overfitting](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/4.4-overfitting-and-underfitting.ipynb)
-* Chapter 5:
-    * [5.1: Introduction to convnets](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/5.1-introduction-to-convnets.ipynb)
-    * [5.2: Using convnets with small datasets](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/5.2-using-convnets-with-small-datasets.ipynb)
-    * [5.3: Using a pre-trained convnet](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/5.3-using-a-pretrained-convnet.ipynb)
-    * [5.4: Visualizing what convnets learn](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/5.4-visualizing-what-convnets-learn.ipynb)
-* Chapter 6:
-    * [6.1: One-hot encoding of words or characters](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/6.1-one-hot-encoding-of-words-or-characters.ipynb)
-    * [6.1: Using word embeddings](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/6.1-using-word-embeddings.ipynb)
-    * [6.2: Understanding RNNs](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/6.2-understanding-recurrent-neural-networks.ipynb)
-    * [6.3: Advanced usage of RNNs](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/6.3-advanced-usage-of-recurrent-neural-networks.ipynb)
-    * [6.4: Sequence processing with convnets](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/6.4-sequence-processing-with-convnets.ipynb)
-* Chapter 8:
-    * [8.1: Text generation with LSTM](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.1-text-generation-with-lstm.ipynb)
-    * [8.2: Deep dream](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.2-deep-dream.ipynb)
-    * [8.3: Neural style transfer](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.3-neural-style-transfer.ipynb)
-    * [8.4: Generating images with VAEs](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.4-generating-images-with-vaes.ipynb)
-    * [8.5: Introduction to GANs](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.5-introduction-to-gans.ipynb)
+* [Chapter 2: The mathematical building blocks of neural networks](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter02_mathematical-building-blocks.ipynb)
+* [Chapter 3: Introduction to Keras and TensorFlow](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter03_introduction-to-keras-and-tf.ipynb)
+* [Chapter 4: Getting started with neural networks: classification and regression](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter04_getting-started-with-neural-networks.ipynb)
+* [Chapter 5: Fundamentals of machine learning](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter05_fundamentals-of-ml.ipynb)
+* [Chapter 7: Working with Keras: a deep dive](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter07_working-with-keras.ipynb)
+* [Chapter 8: Introduction to deep learning for computer vision](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter08_intro-to-dl-for-computer-vision.ipynb)
+* Chapter 9: Advanced deep learning for computer vision
+    - [Part 1: Image segmentation](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_part01_image-segmentation.ipynb)
+    - [Part 2: Modern convnet architecture patterns](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_part02_modern-convnet-architecture-patterns.ipynb)
+    - [Part 3: Interpreting what convnets learn](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_part03_interpreting-what-convnets-learn.ipynb)
+* [Chapter 10: Deep learning for timeseries](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter10_dl-for-timeseries.ipynb)
+* Chapter 11: Deep learning for text
+    - [Part 1: Introduction](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_part01_introduction.ipynb)
+    - [Part 2: Sequence models](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_part02_sequence-models.ipynb)
+    - [Part 3: Transformer](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_part03_transformer.ipynb)
+    - [Part 4: Sequence-to-sequence learning](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_part04_sequence-to-sequence-learning.ipynb)
+* Chapter 12: Generative deep learning
+    - [Part 1: Text generation](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_part01_text-generation.ipynb)
+    - [Part 2: Deep Dream](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_part02_deep-dream.ipynb)
+    - [Part 3: Neural style transfer](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_part03_neural-style-transfer.ipynb)
+    - [Part 4: Variational autoencoders](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_part04_variational-autoencoders.ipynb)
+    - [Part 5: Generative adversarial networks](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_part05_gans.ipynb)
+* [Chapter 13: Best practices for the real world](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter13_best-practices-for-the-real-world.ipynb)
+* [Chapter 14: Conclusions](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter14_conclusions.ipynb)
1468  chapter02_mathematical-building-blocks.ipynb  (new file; diff suppressed because it is too large)
977  chapter03_introduction-to-keras-and-tf.ipynb  (new file)
@@ -0,0 +1,977 @@
This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.

**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**

This notebook was generated for TensorFlow 2.6.

# Introduction to Keras and TensorFlow

## What's TensorFlow?

## What's Keras?

## Keras and TensorFlow: a brief history

## Setting up a deep-learning workspace

### Jupyter notebooks: the preferred way to run deep-learning experiments

### Using Colaboratory

#### First steps with Colaboratory

#### Installing packages with `pip`

#### Using the GPU runtime

## First steps with TensorFlow

#### Constant tensors and Variables

**All-ones or all-zeros tensors**

```python
import tensorflow as tf
x = tf.ones(shape=(2, 1))
print(x)
```

```python
x = tf.zeros(shape=(2, 1))
print(x)
```

**Random tensors**

```python
x = tf.random.normal(shape=(3, 1), mean=0., stddev=1.)
print(x)
```

```python
x = tf.random.uniform(shape=(3, 1), minval=0., maxval=1.)
print(x)
```

**NumPy arrays are assignable**

```python
import numpy as np
x = np.ones(shape=(2, 2))
x[0, 0] = 0.
```

**Creating a Variable**

```python
v = tf.Variable(initial_value=tf.random.normal(shape=(3, 1)))
print(v)
```

**Assigning a value to a Variable**

```python
v.assign(tf.ones((3, 1)))
```

**Assigning a value to a subset of a Variable**

```python
v[0, 0].assign(3.)
```

**Using assign_add**

```python
v.assign_add(tf.ones((3, 1)))
```
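**Aside (not in the book): constant tensors are immutable**

The contrast with NumPy above is worth making explicit. A minimal sketch showing that a constant tensor rejects in-place assignment, while a `tf.Variable` accepts it:

```python
# Sketch: only tf.Variable supports in-place updates.
x = tf.ones(shape=(2, 2))
try:
    x[0, 0] = 0.  # constant tensors don't support item assignment
except TypeError as e:
    print("Constant tensor:", e)

v = tf.Variable(tf.ones(shape=(2, 2)))
v[0, 0].assign(0.)  # fine: Variables expose assign / assign_add / assign_sub
print(v)
```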
#### Tensor operations: doing math in TensorFlow

**A few basic math operations**

```python
a = tf.ones((2, 2))
b = tf.square(a)
c = tf.sqrt(a)
d = b + c
e = tf.matmul(a, b)
e *= d
```

#### A second look at the `GradientTape` API

**Using the GradientTape**

```python
input_var = tf.Variable(initial_value=3.)
with tf.GradientTape() as tape:
    result = tf.square(input_var)
gradient = tape.gradient(result, input_var)
```

**Using the GradientTape with constant tensor inputs**

```python
input_const = tf.constant(3.)
with tf.GradientTape() as tape:
    tape.watch(input_const)
    result = tf.square(input_const)
gradient = tape.gradient(result, input_const)
```

**Using nested gradient tapes to compute second-order gradients**

```python
time = tf.Variable(0.)
with tf.GradientTape() as outer_tape:
    with tf.GradientTape() as inner_tape:
        position = 4.9 * time ** 2
    speed = inner_tape.gradient(position, time)
acceleration = outer_tape.gradient(speed, time)
```
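**Aside (not in the book): reusing a tape**

By default a tape releases its resources after one `gradient()` call. A minimal sketch, using only the standard `tf.GradientTape` API, of querying several gradients from the same tape with `persistent=True`:

```python
# Sketch: a persistent tape can be queried more than once.
x = tf.Variable(2.)
with tf.GradientTape(persistent=True) as tape:
    y = tf.square(x)  # y = x^2
    z = y * y         # z = x^4
dy_dx = tape.gradient(y, x)   # 2x   -> 4.0
dz_dx = tape.gradient(z, x)   # 4x^3 -> 32.0
del tape  # free the tape's resources when done
```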
#### An end-to-end example: a linear classifier in pure TensorFlow

**Generating two classes of random points in a 2D plane**

```python
num_samples_per_class = 1000
negative_samples = np.random.multivariate_normal(
    mean=[0, 3], cov=[[1, 0.5],[0.5, 1]], size=num_samples_per_class)
positive_samples = np.random.multivariate_normal(
    mean=[3, 0], cov=[[1, 0.5],[0.5, 1]], size=num_samples_per_class)
```

**Stacking the two classes into an array with shape (2000, 2)**

```python
inputs = np.vstack((negative_samples, positive_samples)).astype(np.float32)
```

**Generating the corresponding targets (0 and 1)**

```python
targets = np.vstack((np.zeros((num_samples_per_class, 1), dtype="float32"),
                     np.ones((num_samples_per_class, 1), dtype="float32")))
```

**Plotting the two point classes**

```python
import matplotlib.pyplot as plt
plt.scatter(inputs[:, 0], inputs[:, 1], c=targets[:, 0])
plt.show()
```

**Creating the linear classifier variables**

```python
input_dim = 2
output_dim = 1
W = tf.Variable(initial_value=tf.random.uniform(shape=(input_dim, output_dim)))
b = tf.Variable(initial_value=tf.zeros(shape=(output_dim,)))
```
**The forward pass function**

```python
def model(inputs):
    return tf.matmul(inputs, W) + b
```

**The mean squared error loss function**

```python
def square_loss(targets, predictions):
    per_sample_losses = tf.square(targets - predictions)
    return tf.reduce_mean(per_sample_losses)
```

**The training step function**

```python
learning_rate = 0.1

def training_step(inputs, targets):
    with tf.GradientTape() as tape:
        predictions = model(inputs)
        loss = square_loss(targets, predictions)
    grad_loss_wrt_W, grad_loss_wrt_b = tape.gradient(loss, [W, b])
    W.assign_sub(grad_loss_wrt_W * learning_rate)
    b.assign_sub(grad_loss_wrt_b * learning_rate)
    return loss
```

**The batch training loop**

```python
for step in range(20):
    loss = training_step(inputs, targets)
    print(f"Loss at step {step}: {loss:.4f}")
```
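**Aside (not in the book): measuring accuracy**

The loop above reports only the MSE loss. A minimal sketch that thresholds the model's outputs at 0.5 and measures classification accuracy against the targets defined earlier:

```python
# Sketch: classification accuracy of the trained linear model.
predictions = model(inputs)                       # shape (2000, 1)
predicted_classes = tf.cast(predictions > 0.5, tf.float32)
accuracy = tf.reduce_mean(tf.cast(predicted_classes == targets, tf.float32))
print(f"Accuracy: {accuracy:.3f}")
```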
```python
predictions = model(inputs)
plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)
plt.show()
```

```python
x = np.linspace(-1, 4, 100)
y = - W[0] / W[1] * x + (0.5 - b) / W[1]
plt.plot(x, y, "-r")
plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)
```
## Anatomy of a neural network: understanding core Keras APIs

### Layers: the building blocks of deep learning

#### The base `Layer` class in Keras

```python
from tensorflow import keras

class SimpleDense(keras.layers.Layer):

    def __init__(self, units, activation=None):
        super().__init__()
        self.units = units
        self.activation = activation

    def build(self, input_shape):
        input_dim = input_shape[-1]
        self.W = self.add_weight(shape=(input_dim, self.units),
                                 initializer="random_normal")
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros")

    def call(self, inputs):
        y = tf.matmul(inputs, self.W) + self.b
        if self.activation is not None:
            y = self.activation(y)
        return y
```

```python
my_dense = SimpleDense(units=32, activation=tf.nn.relu)
input_tensor = tf.ones(shape=(2, 784))
output_tensor = my_dense(input_tensor)
print(output_tensor.shape)
```

#### Automatic shape inference: building layers on the fly

```python
from tensorflow.keras import layers
layer = layers.Dense(32, activation="relu")
```

```python
from tensorflow.keras import models
from tensorflow.keras import layers
model = models.Sequential([
    layers.Dense(32, activation="relu"),
    layers.Dense(32)
])
```

```python
model = keras.Sequential([
    SimpleDense(32, activation="relu"),
    SimpleDense(64, activation="relu"),
    SimpleDense(32, activation="relu"),
    SimpleDense(10, activation="softmax")
])
```
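**Aside (not in the book): seeing lazy building in action**

Shape inference means a layer's weights don't exist until it first sees data. A minimal sketch, relying only on the standard `Layer.built` and `Layer.weights` attributes:

```python
# Sketch: weights are created lazily, on the first call.
layer = SimpleDense(units=32, activation=tf.nn.relu)
print(layer.built, len(layer.weights))   # False, 0 -- no weights yet

_ = layer(tf.ones(shape=(2, 784)))       # first call triggers build()
print(layer.built, len(layer.weights))   # True, 2 -- W and b now exist
print([w.shape for w in layer.weights])  # [(784, 32), (32,)]
```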
### From layers to models

### The "compile" step: configuring the learning process

```python
model = keras.Sequential([keras.layers.Dense(1)])
model.compile(optimizer="rmsprop",
              loss="mean_squared_error",
              metrics=["accuracy"])
```

```python
model.compile(optimizer=keras.optimizers.RMSprop(),
              loss=keras.losses.MeanSquaredError(),
              metrics=[keras.metrics.BinaryAccuracy()])
```

### Picking a loss function

### Understanding the "fit" method

**Calling `fit` with NumPy data**

```python
history = model.fit(
    inputs,
    targets,
    epochs=5,
    batch_size=128
)
```

```python
history.history
```
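**Aside (not in the book): what `history.history` contains**

`history.history` is a plain dict mapping each loss/metric name (e.g. `"loss"`, `"binary_accuracy"`) to a list with one entry per epoch. A minimal sketch turning it into learning curves:

```python
# Sketch: plot every tracked quantity, one curve per metric.
import matplotlib.pyplot as plt

for name, values in history.history.items():
    plt.plot(range(1, len(values) + 1), values, label=name)
plt.xlabel("Epochs")
plt.legend()
plt.show()
```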
### Monitoring loss & metrics on validation data

**Using the validation data argument**

```python
model = keras.Sequential([keras.layers.Dense(1)])
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.1),
              loss=keras.losses.MeanSquaredError(),
              metrics=[keras.metrics.BinaryAccuracy()])

indices_permutation = np.random.permutation(len(inputs))
shuffled_inputs = inputs[indices_permutation]
shuffled_targets = targets[indices_permutation]

num_validation_samples = int(0.3 * len(inputs))
val_inputs = shuffled_inputs[-num_validation_samples:]
val_targets = shuffled_targets[-num_validation_samples:]
training_inputs = shuffled_inputs[:-num_validation_samples]
training_targets = shuffled_targets[:-num_validation_samples]
model.fit(
    training_inputs,
    training_targets,
    epochs=5,
    batch_size=16,
    validation_data=(val_inputs, val_targets)
)
```

### Inference: using a model after training

```python
predictions = model.predict(val_inputs, batch_size=128)
print(predictions[:10])
```

## Chapter summary
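**Aside (not in the book): `predict()` vs. calling the model**

`model.predict()` iterates over the data in batches and returns NumPy arrays, whereas calling the model directly processes everything in one pass and returns a tensor. A minimal sketch of the standard Keras behavior:

```python
# Sketch: two ways to run inference.
batched = model.predict(val_inputs, batch_size=128)  # NumPy array, batch by batch
one_shot = model(val_inputs)                         # tf.Tensor, single pass
print(type(batched), type(one_shot))
```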
1430  chapter04_getting-started-with-neural-networks.ipynb  (new file; diff suppressed because it is too large)

786  chapter05_fundamentals-of-ml.ipynb  (new file)
@@ -0,0 +1,786 @@
This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.

**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**

This notebook was generated for TensorFlow 2.6.

# Fundamentals of machine learning

## Generalization: the goal of machine learning

### Underfitting and overfitting

#### Noisy training data

#### Ambiguous features

#### Rare features and spurious correlations

**Adding white-noise channels or all-zeros channels to MNIST**

```python
from tensorflow.keras.datasets import mnist
import numpy as np

(train_images, train_labels), _ = mnist.load_data()
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255

train_images_with_noise_channels = np.concatenate(
    [train_images, np.random.random((len(train_images), 784))], axis=1)

train_images_with_zeros_channels = np.concatenate(
    [train_images, np.zeros((len(train_images), 784))], axis=1)
```
**Training the same model on MNIST data with noise channels or all-zero channels**

```python
from tensorflow import keras
from tensorflow.keras import layers

def get_model():
    model = keras.Sequential([
        layers.Dense(512, activation="relu"),
        layers.Dense(10, activation="softmax")
    ])
    model.compile(optimizer="rmsprop",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = get_model()
history_noise = model.fit(
    train_images_with_noise_channels, train_labels,
    epochs=10,
    batch_size=128,
    validation_split=0.2)

model = get_model()
history_zeros = model.fit(
    train_images_with_zeros_channels, train_labels,
    epochs=10,
    batch_size=128,
    validation_split=0.2)
```

**Plotting a validation accuracy comparison**

```python
import matplotlib.pyplot as plt
val_acc_noise = history_noise.history["val_accuracy"]
val_acc_zeros = history_zeros.history["val_accuracy"]
epochs = range(1, 11)
plt.plot(epochs, val_acc_noise, "b-",
         label="Validation accuracy with noise channels")
plt.plot(epochs, val_acc_zeros, "b--",
         label="Validation accuracy with zeros channels")
plt.title("Effect of noise channels on validation accuracy")
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()
```
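**Aside (not in the book): reading the gap off numerically**

The distance between the two curves is the accuracy cost of the 784 uninformative noise features. A one-line sketch using the values computed above:

```python
# Sketch: accuracy cost of the noise channels at the final epoch.
gap = val_acc_zeros[-1] - val_acc_noise[-1]
print(f"Validation accuracy drop due to noise channels: {gap:.3f}")
```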
### The nature of generalization in deep learning

**Fitting a MNIST model with randomly shuffled labels**

```python
(train_images, train_labels), _ = mnist.load_data()
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255

random_train_labels = train_labels.copy()
np.random.shuffle(random_train_labels)

model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax")
])
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_images, random_train_labels,
          epochs=100,
          batch_size=128,
          validation_split=0.2)
```
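**Aside (not in the book): watching memorization happen**

With shuffled labels there is nothing to generalize, so training accuracy keeps climbing while validation accuracy stays near chance (about 0.1 for ten classes). A minimal sketch, using the standard `model.history` attribute that Keras sets after `fit()`:

```python
# Sketch: memorization (train) vs. chance-level generalization (val).
hist = model.history.history  # History from the fit() call above
plt.plot(hist["accuracy"], label="Training accuracy")
plt.plot(hist["val_accuracy"], label="Validation accuracy")
plt.xlabel("Epochs")
plt.legend()
plt.show()
```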
#### The manifold hypothesis

#### Interpolation as a source of generalization

#### Why deep learning works

#### Training data is paramount

## Evaluating machine-learning models

### Training, validation, and test sets

#### Simple hold-out validation

#### K-fold validation

#### Iterated K-fold validation with shuffling
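**Aside (not in the book): a K-fold validation sketch**

These sections are prose-only in the notebook. As a rough sketch of K-fold validation, assuming NumPy arrays `data` and `labels` and a `get_model()` factory like the one defined earlier:

```python
# Sketch: K-fold validation (assumes data, labels, get_model()).
k = 3
num_validation_samples = len(data) // k
validation_scores = []
for fold in range(k):
    # Current fold is the validation slice; the rest is training data.
    val_data = data[num_validation_samples * fold:
                    num_validation_samples * (fold + 1)]
    val_labels = labels[num_validation_samples * fold:
                        num_validation_samples * (fold + 1)]
    fold_train_data = np.concatenate(
        [data[:num_validation_samples * fold],
         data[num_validation_samples * (fold + 1):]])
    fold_train_labels = np.concatenate(
        [labels[:num_validation_samples * fold],
         labels[num_validation_samples * (fold + 1):]])
    model = get_model()
    model.fit(fold_train_data, fold_train_labels,
              epochs=10, batch_size=128, verbose=0)
    _, acc = model.evaluate(val_data, val_labels, verbose=0)
    validation_scores.append(acc)
validation_score = np.average(validation_scores)  # final estimate
```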
### Beating a common-sense baseline

### Things to keep in mind about model evaluation

## Improving model fit

### Tuning key gradient descent parameters

**Training a MNIST model with an incorrectly high learning rate**

```python
(train_images, train_labels), _ = mnist.load_data()
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255

model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax")
])
model.compile(optimizer=keras.optimizers.RMSprop(1.),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_images, train_labels,
          epochs=10,
          batch_size=128,
          validation_split=0.2)
```
**The same model with a more appropriate learning rate**

```python
model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax")
])
model.compile(optimizer=keras.optimizers.RMSprop(1e-2),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_images, train_labels,
          epochs=10,
          batch_size=128,
          validation_split=0.2)
```
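**Aside (not in the book): sweeping learning rates**

Rather than guessing twice, you can sweep a few orders of magnitude. A minimal sketch with the same architecture and data:

```python
# Sketch: compare validation accuracy across learning rates.
for lr in [1e-1, 1e-2, 1e-3, 1e-4]:
    model = keras.Sequential([
        layers.Dense(512, activation="relu"),
        layers.Dense(10, activation="softmax")
    ])
    model.compile(optimizer=keras.optimizers.RMSprop(lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(train_images, train_labels,
                        epochs=5, batch_size=128,
                        validation_split=0.2, verbose=0)
    print(f"lr={lr}: val_accuracy={history.history['val_accuracy'][-1]:.3f}")
```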
### Leveraging better architecture priors

### Increasing model capacity

**A simple logistic regression on MNIST**

```python
model = keras.Sequential([layers.Dense(10, activation="softmax")])
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
history_small_model = model.fit(
    train_images, train_labels,
    epochs=20,
    batch_size=128,
    validation_split=0.2)
```

```python
import matplotlib.pyplot as plt
val_loss = history_small_model.history["val_loss"]
epochs = range(1, 21)
plt.plot(epochs, val_loss, "b--",
         label="Validation loss")
plt.title("Effect of insufficient model capacity on validation loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()
```

```python
model = keras.Sequential([
    layers.Dense(96, activation="relu"),
    layers.Dense(96, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
history_large_model = model.fit(
    train_images, train_labels,
    epochs=20,
    batch_size=128,
    validation_split=0.2)
```
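**Aside (not in the book): overlaying the two capacity curves**

To see the capacity effect directly, the two validation-loss curves can be plotted together. A small sketch using the histories computed above:

```python
# Sketch: the small model plateaus early; the larger model fits, then overfits.
plt.plot(epochs, history_small_model.history["val_loss"], "b--",
         label="Validation loss, small model")
plt.plot(epochs, history_large_model.history["val_loss"], "b-",
         label="Validation loss, large model")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()
plt.show()
```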
## Improving generalization

### Dataset curation

### Feature engineering

### Using early stopping
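**Aside (not in the book): early stopping as a callback**

The early-stopping section has no code cell in the notebook; in Keras this is typically done with the standard `keras.callbacks.EarlyStopping` callback. A minimal sketch, assuming IMDB-style `train_data`/`train_labels` as prepared in the cells below:

```python
# Sketch: stop when validation loss stops improving, keep the best weights.
callbacks = [keras.callbacks.EarlyStopping(monitor="val_loss",
                                           patience=2,
                                           restore_best_weights=True)]
model.fit(train_data, train_labels,
          epochs=20, batch_size=512, validation_split=0.4,
          callbacks=callbacks)
```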
### Regularizing your model

#### Reducing the network's size

**Original model**

```python
from tensorflow.keras.datasets import imdb
(train_data, train_labels), _ = imdb.load_data(num_words=10000)

def vectorize_sequences(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.
    return results
train_data = vectorize_sequences(train_data)

model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid")
])
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])
history_original = model.fit(train_data, train_labels,
                             epochs=20, batch_size=512, validation_split=0.4)
```

**Version of the model with lower capacity**

```python
model = keras.Sequential([
    layers.Dense(4, activation="relu"),
    layers.Dense(4, activation="relu"),
    layers.Dense(1, activation="sigmoid")
])
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])
history_smaller_model = model.fit(
    train_data, train_labels,
    epochs=20, batch_size=512, validation_split=0.4)
```

**Version of the model with higher capacity**

```python
model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(512, activation="relu"),
    layers.Dense(1, activation="sigmoid")
])
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])
history_larger_model = model.fit(
    train_data, train_labels,
    epochs=20, batch_size=512, validation_split=0.4)
```
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Adding weight regularization"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Adding L2 weight regularization to the model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from tensorflow.keras import regularizers\n",
|
||||||
|
"model = keras.Sequential([\n",
|
||||||
|
" layers.Dense(16,\n",
|
||||||
|
" kernel_regularizer=regularizers.l2(0.002),\n",
|
||||||
|
" activation=\"relu\"),\n",
|
||||||
|
" layers.Dense(16,\n",
|
||||||
|
" kernel_regularizer=regularizers.l2(0.002),\n",
|
||||||
|
" activation=\"relu\"),\n",
|
||||||
|
" layers.Dense(1, activation=\"sigmoid\")\n",
|
||||||
|
"])\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\n",
|
||||||
|
" loss=\"binary_crossentropy\",\n",
|
||||||
|
" metrics=[\"accuracy\"])\n",
|
||||||
|
"history_l2_reg = model.fit(\n",
|
||||||
|
" train_data, train_labels,\n",
|
||||||
|
" epochs=20, batch_size=512, validation_split=0.4)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Different weight regularizers available in Keras**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
"from tensorflow.keras import regularizers\n",
"regularizers.l1(0.001)\n",
"regularizers.l1_l2(l1=0.001, l2=0.001)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Adding dropout"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Adding dropout to the IMDB model**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.Sequential([\n",
"    layers.Dense(16, activation=\"relu\"),\n",
"    layers.Dropout(0.5),\n",
"    layers.Dense(16, activation=\"relu\"),\n",
"    layers.Dropout(0.5),\n",
"    layers.Dense(1, activation=\"sigmoid\")\n",
"])\n",
"model.compile(optimizer=\"rmsprop\",\n",
"              loss=\"binary_crossentropy\",\n",
"              metrics=[\"accuracy\"])\n",
"history_dropout = model.fit(\n",
"    train_data, train_labels,\n",
"    epochs=20, batch_size=512, validation_split=0.4)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Chapter summary"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "chapter05_fundamentals-of-ml.i",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
1419 chapter07_working-with-keras.ipynb Normal file
File diff suppressed because it is too large
1248 chapter08_intro-to-dl-for-computer-vision.ipynb Normal file
File diff suppressed because it is too large
282 chapter09_part01_image-segmentation.ipynb Normal file
@@ -0,0 +1,282 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"# Advanced deep learning for computer vision"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Three essential computer vision tasks"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## An image segmentation example"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz\n",
"!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz\n",
"!tar -xf images.tar.gz\n",
"!tar -xf annotations.tar.gz"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"input_dir = \"images/\"\n",
"target_dir = \"annotations/trimaps/\"\n",
"\n",
"input_img_paths = sorted(\n",
"    [os.path.join(input_dir, fname)\n",
"     for fname in os.listdir(input_dir)\n",
"     if fname.endswith(\".jpg\")])\n",
"target_paths = sorted(\n",
"    [os.path.join(target_dir, fname)\n",
"     for fname in os.listdir(target_dir)\n",
"     if fname.endswith(\".png\") and not fname.startswith(\".\")])"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"from tensorflow.keras.preprocessing.image import load_img, img_to_array\n",
"\n",
"plt.axis(\"off\")\n",
"plt.imshow(load_img(input_img_paths[9]))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def display_target(target_array):\n",
"    normalized_array = (target_array.astype(\"uint8\") - 1) * 127\n",
"    plt.axis(\"off\")\n",
"    plt.imshow(normalized_array[:, :, 0])\n",
"\n",
"img = img_to_array(load_img(target_paths[9], color_mode=\"grayscale\"))\n",
"display_target(img)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"import random\n",
"\n",
"img_size = (200, 200)\n",
"num_imgs = len(input_img_paths)\n",
"\n",
"random.Random(1337).shuffle(input_img_paths)\n",
"random.Random(1337).shuffle(target_paths)\n",
"\n",
"def path_to_input_image(path):\n",
"    return img_to_array(load_img(path, target_size=img_size))\n",
"\n",
"def path_to_target(path):\n",
"    img = img_to_array(\n",
"        load_img(path, target_size=img_size, color_mode=\"grayscale\"))\n",
"    img = img.astype(\"uint8\") - 1\n",
"    return img\n",
"\n",
"input_imgs = np.zeros((num_imgs,) + img_size + (3,), dtype=\"float32\")\n",
"targets = np.zeros((num_imgs,) + img_size + (1,), dtype=\"uint8\")\n",
"for i in range(num_imgs):\n",
"    input_imgs[i] = path_to_input_image(input_img_paths[i])\n",
"    targets[i] = path_to_target(target_paths[i])\n",
"\n",
"num_val_samples = 1000\n",
"train_input_imgs = input_imgs[:-num_val_samples]\n",
"train_targets = targets[:-num_val_samples]\n",
"val_input_imgs = input_imgs[-num_val_samples:]\n",
"val_targets = targets[-num_val_samples:]"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow import keras\n",
"from tensorflow.keras import layers\n",
"\n",
"def get_model(img_size, num_classes):\n",
"    inputs = keras.Input(shape=img_size + (3,))\n",
"    x = layers.experimental.preprocessing.Rescaling(1./255)(inputs)\n",
"\n",
"    x = layers.Conv2D(64, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n",
"    x = layers.Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n",
"    x = layers.Conv2D(128, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n",
"    x = layers.Conv2D(128, 3, activation=\"relu\", padding=\"same\")(x)\n",
"    x = layers.Conv2D(256, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n",
"    x = layers.Conv2D(256, 3, activation=\"relu\", padding=\"same\")(x)\n",
"\n",
"    x = layers.Conv2DTranspose(256, 3, activation=\"relu\", padding=\"same\")(x)\n",
"    x = layers.Conv2DTranspose(256, 3, activation=\"relu\", padding=\"same\", strides=2)(x)\n",
"    x = layers.Conv2DTranspose(128, 3, activation=\"relu\", padding=\"same\")(x)\n",
"    x = layers.Conv2DTranspose(128, 3, activation=\"relu\", padding=\"same\", strides=2)(x)\n",
"    x = layers.Conv2DTranspose(64, 3, activation=\"relu\", padding=\"same\")(x)\n",
"    x = layers.Conv2DTranspose(64, 3, activation=\"relu\", padding=\"same\", strides=2)(x)\n",
"\n",
"    outputs = layers.Conv2D(num_classes, 3, activation=\"softmax\", padding=\"same\")(x)\n",
"\n",
"    model = keras.Model(inputs, outputs)\n",
"    return model\n",
"\n",
"model = get_model(img_size=img_size, num_classes=3)\n",
"model.summary()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model.compile(optimizer=\"rmsprop\", loss=\"sparse_categorical_crossentropy\")\n",
"\n",
"callbacks = [\n",
"    keras.callbacks.ModelCheckpoint(\"oxford_segmentation.keras\",\n",
"                                    save_best_only=True)\n",
"]\n",
"\n",
"history = model.fit(train_input_imgs, train_targets,\n",
"                    epochs=50,\n",
"                    callbacks=callbacks,\n",
"                    batch_size=64,\n",
"                    validation_data=(val_input_imgs, val_targets))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"epochs = range(1, len(history.history[\"loss\"]) + 1)\n",
"loss = history.history[\"loss\"]\n",
"val_loss = history.history[\"val_loss\"]\n",
"plt.figure()\n",
"plt.plot(epochs, loss, \"bo\", label=\"Training loss\")\n",
"plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n",
"plt.title(\"Training and validation loss\")\n",
"plt.legend()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow.keras.preprocessing.image import array_to_img\n",
"\n",
"model = keras.models.load_model(\"oxford_segmentation.keras\")\n",
"\n",
"i = 4\n",
"test_image = val_input_imgs[i]\n",
"plt.axis(\"off\")\n",
"plt.imshow(array_to_img(test_image))\n",
"\n",
"mask = model.predict(np.expand_dims(test_image, 0))[0]\n",
"\n",
"def display_mask(pred):\n",
"    mask = np.argmax(pred, axis=-1)\n",
"    mask *= 127\n",
"    plt.axis(\"off\")\n",
"    plt.imshow(mask)\n",
"\n",
"display_mask(mask)"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "chapter09_part01_image-segmentation.i",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
236 chapter09_part02_modern-convnet-architecture-patterns.ipynb Normal file
@@ -0,0 +1,236 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Modern convnet architecture patterns"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Modularity, hierarchy, and reuse"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Residual connections"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Case where the target block changes the number of output filters**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow import keras\n",
"from tensorflow.keras import layers\n",
"\n",
"inputs = keras.Input(shape=(32, 32, 3))\n",
"x = layers.Conv2D(32, 3, activation=\"relu\")(inputs)\n",
"residual = x\n",
"x = layers.Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n",
"residual = layers.Conv2D(64, 1)(residual)\n",
"x = layers.add([x, residual])"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Case where the target block includes a max pooling layer**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(32, 32, 3))\n",
"x = layers.Conv2D(32, 3, activation=\"relu\")(inputs)\n",
"residual = x\n",
"x = layers.Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n",
"x = layers.MaxPooling2D(2, padding=\"same\")(x)\n",
"residual = layers.Conv2D(64, 1, strides=2)(residual)\n",
"x = layers.add([x, residual])"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(32, 32, 3))\n",
"x = layers.experimental.preprocessing.Rescaling(1./255)(inputs)\n",
"\n",
"def residual_block(x, filters, pooling=False):\n",
"    residual = x\n",
"    x = layers.Conv2D(filters, 3, activation=\"relu\", padding=\"same\")(x)\n",
"    x = layers.Conv2D(filters, 3, activation=\"relu\", padding=\"same\")(x)\n",
"    if pooling:\n",
"        x = layers.MaxPooling2D(2, padding=\"same\")(x)\n",
"        residual = layers.Conv2D(filters, 1, strides=2)(residual)\n",
"    elif filters != residual.shape[-1]:\n",
"        residual = layers.Conv2D(filters, 1)(residual)\n",
"    x = layers.add([x, residual])\n",
"    return x\n",
"\n",
"x = residual_block(x, filters=32, pooling=True)\n",
"x = residual_block(x, filters=64, pooling=True)\n",
"x = residual_block(x, filters=128, pooling=False)\n",
"\n",
"x = layers.GlobalAveragePooling2D()(x)\n",
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
"model = keras.Model(inputs=inputs, outputs=outputs)\n",
"model.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Batch normalization"
]
},
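{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This section has no code in the notebook, since the book covers it in prose. The next cell is an editorial sketch (not book code) of the layer ordering the book text recommends: a convolution without a bias (batch normalization re-centers the output anyway), then normalization, then the activation. The input shape is an arbitrary assumption."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"# Editorial sketch, not book code.\n",
"bn_inputs = keras.Input(shape=(32, 32, 3))\n",
"x = layers.Conv2D(32, 3, use_bias=False)(bn_inputs)  # no bias needed\n",
"x = layers.BatchNormalization()(x)\n",
"x = layers.Activation(\"relu\")(x)  # activation placed after normalization"
]
},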
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Depthwise separable convolutions"
]
},
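{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"Another prose-only section in the book. The next cell is an editorial sketch (not book code) comparing the parameter count of a regular `Conv2D` layer with a `SeparableConv2D` layer of the same width; the 64-channel input shape is an arbitrary assumption."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"# Editorial sketch, not book code.\n",
"regular = layers.Conv2D(64, 3)\n",
"separable = layers.SeparableConv2D(64, 3)\n",
"regular.build((None, 32, 32, 64))\n",
"separable.build((None, 32, 32, 64))\n",
"print(regular.count_params())    # 3*3*64*64 + 64 = 36,928\n",
"print(separable.count_params())  # 3*3*64 + 64*64 + 64 = 4,736"
]
},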
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Putting it together: a mini Xception-like model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"data_augmentation = keras.Sequential(\n",
"    [\n",
"        layers.experimental.preprocessing.RandomFlip(\"horizontal\"),\n",
"        layers.experimental.preprocessing.RandomRotation(0.1),\n",
"        layers.experimental.preprocessing.RandomZoom(0.2),\n",
"    ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(180, 180, 3))\n",
"x = data_augmentation(inputs)\n",
"\n",
"x = layers.experimental.preprocessing.Rescaling(1./255)(x)\n",
"x = layers.Conv2D(filters=32, kernel_size=5)(x)\n",
"\n",
"for size in [32, 64, 128, 256, 512]:\n",
"    residual = x\n",
"\n",
"    x = layers.BatchNormalization()(x)\n",
"    x = layers.SeparableConv2D(size, 3, padding=\"same\")(x)\n",
"    x = layers.Activation(\"relu\")(x)\n",
"\n",
"    x = layers.BatchNormalization()(x)\n",
"    x = layers.SeparableConv2D(size, 3, padding=\"same\")(x)\n",
"    x = layers.Activation(\"relu\")(x)\n",
"\n",
"    x = layers.MaxPooling2D(3, strides=2, padding=\"same\")(x)\n",
"\n",
"    residual = layers.Conv2D(\n",
"        size, 1, strides=2, padding=\"same\")(residual)\n",
"    x = layers.add([x, residual])\n",
"\n",
"x = layers.GlobalAveragePooling2D()(x)\n",
"x = layers.Dropout(0.5)(x)\n",
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
"model = keras.Model(inputs=inputs, outputs=outputs)"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "chapter09_part02_modern-convnet-architecture-patterns.i",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
785 chapter09_part03_interpreting-what-convnets-learn.ipynb Normal file
@@ -0,0 +1,785 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Interpreting what convnets learn"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Visualizing intermediate activations"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"# You can use this to load the file \"convnet_from_scratch_with_augmentation.keras\"\n",
"# you obtained in the last chapter.\n",
"from google.colab import files\n",
"files.upload()"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow import keras\n",
"model = keras.models.load_model(\"convnet_from_scratch_with_augmentation.keras\")\n",
"model.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Preprocessing a single image**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow import keras\n",
"import numpy as np\n",
"\n",
"img_path = keras.utils.get_file(\n",
"    fname=\"cat.jpg\",\n",
"    origin=\"https://img-datasets.s3.amazonaws.com/cat.jpg\")\n",
"\n",
"def get_img_array(img_path, target_size):\n",
"    img = keras.preprocessing.image.load_img(\n",
"        img_path, target_size=target_size)\n",
"    array = keras.preprocessing.image.img_to_array(img)\n",
"    array = np.expand_dims(array, axis=0)\n",
"    return array\n",
"\n",
"img_tensor = get_img_array(img_path, target_size=(180, 180))"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Displaying the test picture**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"plt.axis(\"off\")\n",
"plt.imshow(img_tensor[0].astype(\"uint8\"))\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Instantiating a model that returns layer activations**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow.keras import layers\n",
"\n",
"layer_outputs = []\n",
"layer_names = []\n",
"for layer in model.layers:\n",
"    if isinstance(layer, (layers.Conv2D, layers.MaxPooling2D)):\n",
"        layer_outputs.append(layer.output)\n",
"        layer_names.append(layer.name)\n",
"activation_model = keras.Model(inputs=model.input, outputs=layer_outputs)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Using the model to compute layer activations**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"activations = activation_model.predict(img_tensor)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"first_layer_activation = activations[0]\n",
"print(first_layer_activation.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Visualizing the fifth channel**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"plt.matshow(first_layer_activation[0, :, :, 5], cmap=\"viridis\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Visualizing every channel in every intermediate activation**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"images_per_row = 16\n",
"for layer_name, layer_activation in zip(layer_names, activations):\n",
"    n_features = layer_activation.shape[-1]\n",
"    size = layer_activation.shape[1]\n",
"    n_cols = n_features // images_per_row\n",
"    display_grid = np.zeros(((size + 1) * n_cols - 1,\n",
"                             images_per_row * (size + 1) - 1))\n",
"    for col in range(n_cols):\n",
"        for row in range(images_per_row):\n",
"            channel_index = col * images_per_row + row\n",
"            channel_image = layer_activation[0, :, :, channel_index].copy()\n",
"            if channel_image.sum() != 0:\n",
"                channel_image -= channel_image.mean()\n",
"                channel_image /= channel_image.std()\n",
"                channel_image *= 64\n",
"                channel_image += 128\n",
"            channel_image = np.clip(channel_image, 0, 255).astype(\"uint8\")\n",
"            display_grid[\n",
"                col * (size + 1): (col + 1) * size + col,\n",
"                row * (size + 1) : (row + 1) * size + row] = channel_image\n",
"    scale = 1. / size\n",
"    plt.figure(figsize=(scale * display_grid.shape[1],\n",
"                        scale * display_grid.shape[0]))\n",
"    plt.title(layer_name)\n",
"    plt.grid(False)\n",
"    plt.axis(\"off\")\n",
"    plt.imshow(display_grid, aspect=\"auto\", cmap=\"viridis\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Visualizing convnet filters"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Instantiating the Xception convolutional base**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.applications.xception.Xception(\n",
"    weights=\"imagenet\",\n",
"    include_top=False)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Printing the names of all convolutional layers in Xception**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"for layer in model.layers:\n",
"    if isinstance(layer, (keras.layers.Conv2D, keras.layers.SeparableConv2D)):\n",
"        print(layer.name)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Creating a \"feature extractor\" model that returns the output of a specific layer**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"layer_name = \"block3_sepconv1\"\n",
"layer = model.get_layer(name=layer_name)\n",
"feature_extractor = keras.Model(inputs=model.input, outputs=layer.output)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Using the feature extractor**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"activation = feature_extractor(\n",
"    keras.applications.xception.preprocess_input(img_tensor)\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"\n",
"def compute_loss(image, filter_index):\n",
"    activation = feature_extractor(image)\n",
"    filter_activation = activation[:, 2:-2, 2:-2, filter_index]\n",
"    return tf.reduce_mean(filter_activation)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Loss maximization via stochastic gradient ascent**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"@tf.function\n",
"def gradient_ascent_step(image, filter_index, learning_rate):\n",
"    with tf.GradientTape() as tape:\n",
"        tape.watch(image)\n",
"        loss = compute_loss(image, filter_index)\n",
"    grads = tape.gradient(loss, image)\n",
"    grads = tf.math.l2_normalize(grads)\n",
"    image += learning_rate * grads\n",
"    return image"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Function to generate filter visualizations**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"img_width = 200\n",
"img_height = 200\n",
"\n",
"def generate_filter_pattern(filter_index):\n",
"    iterations = 30\n",
"    learning_rate = 10.\n",
"    image = tf.random.uniform(\n",
"        minval=0.4,\n",
"        maxval=0.6,\n",
"        shape=(1, img_width, img_height, 3))\n",
"    for i in range(iterations):\n",
"        image = gradient_ascent_step(image, filter_index, learning_rate)\n",
"    return image[0].numpy()"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Utility function to convert a tensor into a valid image**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def deprocess_image(image):\n",
"    image -= image.mean()\n",
"    image /= image.std()\n",
"    image *= 64\n",
"    image += 128\n",
"    image = np.clip(image, 0, 255).astype(\"uint8\")\n",
"    image = image[25:-25, 25:-25, :]\n",
"    return image"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"plt.axis(\"off\")\n",
"plt.imshow(deprocess_image(generate_filter_pattern(filter_index=2)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Generating a grid of all filter response patterns in a layer**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"all_images = []\n",
"for filter_index in range(64):\n",
"    print(f\"Processing filter {filter_index}\")\n",
"    image = deprocess_image(\n",
"        generate_filter_pattern(filter_index)\n",
"    )\n",
"    all_images.append(image)\n",
"\n",
"margin = 5\n",
"n = 8\n",
"cropped_width = img_width - 25 * 2\n",
"cropped_height = img_height - 25 * 2\n",
"width = n * cropped_width + (n - 1) * margin\n",
"height = n * cropped_height + (n - 1) * margin\n",
"stitched_filters = np.zeros((width, height, 3))\n",
"\n",
"for i in range(n):\n",
"    for j in range(n):\n",
"        image = all_images[i * n + j]\n",
"        stitched_filters[\n",
"            (cropped_width + margin) * i : (cropped_width + margin) * i + cropped_width,\n",
"            (cropped_height + margin) * j : (cropped_height + margin) * j\n",
"            + cropped_height,\n",
"            :,\n",
"        ] = image\n",
"\n",
"keras.preprocessing.image.save_img(\n",
"    f\"filters_for_layer_{layer_name}.png\", stitched_filters)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Visualizing heatmaps of class activation"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Loading the Xception network with pretrained weights**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"model = keras.applications.xception.Xception(weights=\"imagenet\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Preprocessing an input image for Xception**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"img_path = keras.utils.get_file(\n",
"    fname=\"elephant.jpg\",\n",
"    origin=\"https://img-datasets.s3.amazonaws.com/elephant.jpg\")\n",
"\n",
"def get_img_array(img_path, target_size):\n",
"    img = keras.preprocessing.image.load_img(img_path, target_size=target_size)\n",
"    array = keras.preprocessing.image.img_to_array(img)\n",
"    array = np.expand_dims(array, axis=0)\n",
"    array = keras.applications.xception.preprocess_input(array)\n",
"    return array\n",
"\n",
"img_array = get_img_array(img_path, target_size=(299, 299))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"preds = model.predict(img_array)\n",
"print(keras.applications.xception.decode_predictions(preds, top=3)[0])"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"np.argmax(preds[0])"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Setting up a model that returns the last convolutional output**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"last_conv_layer_name = \"block14_sepconv2_act\"\n",
"classifier_layer_names = [\n",
"    \"avg_pool\",\n",
"    \"predictions\",\n",
"]\n",
"last_conv_layer = model.get_layer(last_conv_layer_name)\n",
"last_conv_layer_model = keras.Model(model.inputs, last_conv_layer.output)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Setting up a model that goes from the last convolutional output to the final predictions**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"classifier_input = keras.Input(shape=last_conv_layer.output.shape[1:])\n",
"x = classifier_input\n",
"for layer_name in classifier_layer_names:\n",
"    x = model.get_layer(layer_name)(x)\n",
"classifier_model = keras.Model(classifier_input, x)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Retrieving the gradients of the top predicted class with regard to the last convolutional output**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"\n",
"with tf.GradientTape() as tape:\n",
"    last_conv_layer_output = last_conv_layer_model(img_array)\n",
"    tape.watch(last_conv_layer_output)\n",
"    preds = classifier_model(last_conv_layer_output)\n",
"    top_pred_index = tf.argmax(preds[0])\n",
"    top_class_channel = preds[:, top_pred_index]\n",
"\n",
"grads = tape.gradient(top_class_channel, last_conv_layer_output)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Gradient pooling and channel importance weighting**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2)).numpy()\n",
"last_conv_layer_output = last_conv_layer_output.numpy()[0]\n",
"for i in range(pooled_grads.shape[-1]):\n",
"    last_conv_layer_output[:, :, i] *= pooled_grads[i]\n",
"heatmap = np.mean(last_conv_layer_output, axis=-1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Heatmap post-processing**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"heatmap = np.maximum(heatmap, 0)\n",
"heatmap /= np.max(heatmap)\n",
"plt.matshow(heatmap)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Superimposing the heatmap with the original picture**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import matplotlib.cm as cm\n",
"\n",
"img = keras.preprocessing.image.load_img(img_path)\n",
"img = keras.preprocessing.image.img_to_array(img)\n",
"\n",
"heatmap = np.uint8(255 * heatmap)\n",
"\n",
"jet = cm.get_cmap(\"jet\")\n",
"jet_colors = jet(np.arange(256))[:, :3]\n",
"jet_heatmap = jet_colors[heatmap]\n",
"\n",
"jet_heatmap = keras.preprocessing.image.array_to_img(jet_heatmap)\n",
"jet_heatmap = jet_heatmap.resize((img.shape[1], img.shape[0]))\n",
"jet_heatmap = keras.preprocessing.image.img_to_array(jet_heatmap)\n",
"\n",
"superimposed_img = jet_heatmap * 0.4 + img\n",
"superimposed_img = keras.preprocessing.image.array_to_img(superimposed_img)\n",
"\n",
"save_path = \"elephant_cam.jpg\"\n",
"superimposed_img.save(save_path)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Chapter summary"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "chapter09_part03_interpreting-what-convnets-learn.i",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
845 chapter10_dl-for-timeseries.ipynb Normal file
@@ -0,0 +1,845 @@
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"# Deep learning for timeseries"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Different kinds of timeseries tasks"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## A temperature forecasting example"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"!wget https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip\n",
|
||||||
|
"!unzip jena_climate_2009_2016.csv.zip"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Inspecting the data of the Jena weather dataset**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import os\n",
|
||||||
|
"fname = os.path.join(\"jena_climate_2009_2016.csv\")\n",
|
||||||
|
"\n",
|
||||||
|
"with open(fname) as f:\n",
|
||||||
|
" data = f.read()\n",
|
||||||
|
"\n",
|
||||||
|
"lines = data.split(\"\\n\")\n",
|
||||||
|
"header = lines[0].split(\",\")\n",
|
||||||
|
"lines = lines[1:]\n",
|
||||||
|
"print(header)\n",
|
||||||
|
"print(len(lines))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Parsing the data**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import numpy as np\n",
|
||||||
|
"temperature = np.zeros((len(lines),))\n",
|
||||||
|
"raw_data = np.zeros((len(lines), len(header) - 1))\n",
|
||||||
|
"for i, line in enumerate(lines):\n",
|
||||||
|
" values = [float(x) for x in line.split(\",\")[1:]]\n",
|
||||||
|
" temperature[i] = values[1]\n",
|
||||||
|
" raw_data[i, :] = values[:]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Plotting the temperature timeseries**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from matplotlib import pyplot as plt\n",
|
||||||
|
"plt.plot(range(len(temperature)), temperature)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Plotting the first 10 days of the temperature timeseries**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"plt.plot(range(1440), temperature[:1440])"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Computing the number of samples we'll use for each data split.**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"num_train_samples = int(0.5 * len(raw_data))\n",
|
||||||
|
"num_val_samples = int(0.25 * len(raw_data))\n",
|
||||||
|
"num_test_samples = len(raw_data) - num_train_samples - num_val_samples\n",
|
||||||
|
"print(\"num_train_samples:\", num_train_samples)\n",
|
||||||
|
"print(\"num_val_samples:\", num_val_samples)\n",
|
||||||
|
"print(\"num_test_samples:\", num_test_samples)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Preparing the data"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Normalizing the data**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"mean = raw_data[:num_train_samples].mean(axis=0)\n",
|
||||||
|
"raw_data -= mean\n",
|
||||||
|
"std = raw_data[:num_train_samples].std(axis=0)\n",
|
||||||
|
"raw_data /= std"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import numpy as np\n",
|
||||||
|
"from tensorflow import keras\n",
|
||||||
|
"int_sequence = np.arange(10)\n",
|
||||||
|
"dummy_dataset = keras.preprocessing.timeseries_dataset_from_array(\n",
|
||||||
|
" data=int_sequence[:-3],\n",
|
||||||
|
" targets=int_sequence[3:],\n",
|
||||||
|
" sequence_length=3,\n",
|
||||||
|
" batch_size=2,\n",
|
||||||
|
")\n",
|
||||||
|
"\n",
|
||||||
|
"for inputs, targets in dummy_dataset:\n",
|
||||||
|
" for i in range(inputs.shape[0]):\n",
|
||||||
|
" print([int(x) for x in inputs[i]], int(targets[i]))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
"**Instantiating Datasets for training, validation, and testing**"
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"sampling_rate = 6\n",
|
||||||
|
"sequence_length = 120\n",
|
||||||
|
"delay = sampling_rate * (sequence_length + 24 - 1)\n",
|
||||||
|
"batch_size = 256\n",
|
||||||
|
"\n",
|
||||||
|
"train_dataset = keras.preprocessing.timeseries_dataset_from_array(\n",
|
||||||
|
" raw_data[:-delay],\n",
|
||||||
|
" targets=temperature[delay:],\n",
|
||||||
|
" sampling_rate=sampling_rate,\n",
|
||||||
|
" sequence_length=sequence_length,\n",
|
||||||
|
" shuffle=True,\n",
|
||||||
|
" batch_size=batch_size,\n",
|
||||||
|
" start_index=0,\n",
|
||||||
|
" end_index=num_train_samples)\n",
|
||||||
|
"\n",
|
||||||
|
"val_dataset = keras.preprocessing.timeseries_dataset_from_array(\n",
|
||||||
|
" raw_data[:-delay],\n",
|
||||||
|
" targets=temperature[delay:],\n",
|
||||||
|
" sampling_rate=sampling_rate,\n",
|
||||||
|
" sequence_length=sequence_length,\n",
|
||||||
|
" shuffle=True,\n",
|
||||||
|
" batch_size=batch_size,\n",
|
||||||
|
" start_index=num_train_samples,\n",
|
||||||
|
" end_index=num_train_samples + num_val_samples)\n",
|
||||||
|
"\n",
|
||||||
|
"test_dataset = keras.preprocessing.timeseries_dataset_from_array(\n",
|
||||||
|
" raw_data[:-delay],\n",
|
||||||
|
" targets=temperature[delay:],\n",
|
||||||
|
" sampling_rate=sampling_rate,\n",
|
||||||
|
" sequence_length=sequence_length,\n",
|
||||||
|
" shuffle=True,\n",
|
||||||
|
" batch_size=batch_size,\n",
|
||||||
|
" start_index=num_train_samples + num_val_samples)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
"**Inspecting the output of one of our Datasets**"
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"for samples, targets in train_dataset:\n",
|
||||||
|
" print(\"samples shape:\", samples.shape)\n",
|
||||||
|
" print(\"targets shape:\", targets.shape)\n",
|
||||||
|
" break"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### A common-sense, non-machine-learning baseline"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Computing the common-sense baseline MAE**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"def evaluate_naive_method(dataset):\n",
|
||||||
|
" total_abs_err = 0.\n",
|
||||||
|
" samples_seen = 0\n",
|
||||||
|
" for samples, targets in dataset:\n",
|
||||||
|
" preds = samples[:, -1, 1] * std[1] + mean[1]\n",
|
||||||
|
" total_abs_err += np.sum(np.abs(preds - targets))\n",
|
||||||
|
" samples_seen += samples.shape[0]\n",
|
||||||
|
" return total_abs_err / samples_seen\n",
|
||||||
|
"\n",
|
||||||
|
"print(f\"Validation MAE: {evaluate_naive_method(val_dataset):.2f}\")\n",
|
||||||
|
"print(f\"Test MAE: {evaluate_naive_method(test_dataset):.2f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Let's try a basic machine learning model"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Training and evaluating a densely connected model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from tensorflow import keras\n",
|
||||||
|
"from tensorflow.keras import layers\n",
|
||||||
|
"\n",
|
||||||
|
"inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n",
|
||||||
|
"x = layers.Flatten()(inputs)\n",
|
||||||
|
"x = layers.Dense(16, activation=\"relu\")(x)\n",
|
||||||
|
"outputs = layers.Dense(1)(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"jena_dense.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n",
|
||||||
|
"history = model.fit(train_dataset,\n",
|
||||||
|
" epochs=10,\n",
|
||||||
|
" validation_data=val_dataset,\n",
|
||||||
|
" callbacks=callbacks)\n",
|
||||||
|
"\n",
|
||||||
|
"model = keras.models.load_model(\"jena_dense.keras\")\n",
|
||||||
|
"print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Plotting results**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import matplotlib.pyplot as plt\n",
|
||||||
|
"loss = history.history[\"mae\"]\n",
|
||||||
|
"val_loss = history.history[\"val_mae\"]\n",
|
||||||
|
"epochs = range(1, len(loss) + 1)\n",
|
||||||
|
"plt.figure()\n",
|
||||||
|
"plt.plot(epochs, loss, \"bo\", label=\"Training MAE\")\n",
|
||||||
|
"plt.plot(epochs, val_loss, \"b\", label=\"Validation MAE\")\n",
|
||||||
|
"plt.title(\"Training and validation MAE\")\n",
|
||||||
|
"plt.legend()\n",
|
||||||
|
"plt.show()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Let's try a 1D convolutional model"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n",
|
||||||
|
"x = layers.Conv1D(8, 24, activation=\"relu\")(inputs)\n",
|
||||||
|
"x = layers.MaxPooling1D(2)(x)\n",
|
||||||
|
"x = layers.Conv1D(8, 12, activation=\"relu\")(x)\n",
|
||||||
|
"x = layers.MaxPooling1D(2)(x)\n",
|
||||||
|
"x = layers.Conv1D(8, 6, activation=\"relu\")(x)\n",
|
||||||
|
"x = layers.GlobalAveragePooling1D()(x)\n",
|
||||||
|
"outputs = layers.Dense(1)(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"jena_conv.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n",
|
||||||
|
"history = model.fit(train_dataset,\n",
|
||||||
|
" epochs=10,\n",
|
||||||
|
" validation_data=val_dataset,\n",
|
||||||
|
" callbacks=callbacks)\n",
|
||||||
|
"\n",
|
||||||
|
"model = keras.models.load_model(\"jena_conv.keras\")\n",
|
||||||
|
"print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### A first recurrent baseline"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**A simple LSTM-based model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n",
|
||||||
|
"x = layers.LSTM(16)(inputs)\n",
|
||||||
|
"outputs = layers.Dense(1)(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"jena_lstm.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n",
|
||||||
|
"history = model.fit(train_dataset,\n",
|
||||||
|
" epochs=10,\n",
|
||||||
|
" validation_data=val_dataset,\n",
|
||||||
|
" callbacks=callbacks)\n",
|
||||||
|
"\n",
|
||||||
|
"model = keras.models.load_model(\"jena_lstm.keras\")\n",
"print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")"
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Understanding recurrent neural networks"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**NumPy implementation of a simple RNN**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import numpy as np\n",
|
||||||
|
"timesteps = 100\n",
|
||||||
|
"input_features = 32\n",
|
||||||
|
"output_features = 64\n",
|
||||||
|
"inputs = np.random.random((timesteps, input_features))\n",
|
||||||
|
"state_t = np.zeros((output_features,))\n",
|
||||||
|
"W = np.random.random((output_features, input_features))\n",
|
||||||
|
"U = np.random.random((output_features, output_features))\n",
|
||||||
|
"b = np.random.random((output_features,))\n",
|
||||||
|
"successive_outputs = []\n",
|
||||||
|
"for input_t in inputs:\n",
|
||||||
|
" output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)\n",
|
||||||
|
" successive_outputs.append(output_t)\n",
|
||||||
|
" state_t = output_t\n",
"final_output_sequence = np.stack(successive_outputs, axis=0)  # shape (timesteps, output_features); concatenate would flatten to 1D"
]
|
||||||
|
},
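{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**A minimal sanity check on the collected outputs -- reuses the variables defined in the cell above**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"print(final_output_sequence.shape)  # (timesteps, output_features) == (100, 64)\n",
"print(np.allclose(final_output_sequence[-1], state_t))  # the last output is the final state"
]
},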
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### A recurrent layer in Keras"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
"**An RNN layer that can process sequences of any length**"
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"num_features = 14\n",
|
||||||
|
"inputs = keras.Input(shape=(None, num_features))\n",
|
||||||
|
"outputs = layers.SimpleRNN(16)(inputs)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
"**An RNN layer that returns only its last output step**"
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"num_features = 14\n",
|
||||||
|
"steps = 120\n",
|
||||||
|
"inputs = keras.Input(shape=(steps, num_features))\n",
|
||||||
|
"outputs = layers.SimpleRNN(16, return_sequences=False)(inputs)\n",
|
||||||
|
"print(outputs.shape)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
"**An RNN layer that returns its full output sequence**"
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"num_features = 14\n",
|
||||||
|
"steps = 120\n",
|
||||||
|
"inputs = keras.Input(shape=(steps, num_features))\n",
|
||||||
|
"outputs = layers.SimpleRNN(16, return_sequences=True)(inputs)\n",
|
||||||
|
"print(outputs.shape)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Stacking RNN layers**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(steps, num_features))\n",
|
||||||
|
"x = layers.SimpleRNN(16, return_sequences=True)(inputs)\n",
|
||||||
|
"x = layers.SimpleRNN(16, return_sequences=True)(x)\n",
|
||||||
|
"outputs = layers.SimpleRNN(16)(x)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Advanced use of recurrent neural networks"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Using recurrent dropout to fight overfitting"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Training and evaluating a dropout-regularized LSTM**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n",
|
||||||
|
"x = layers.LSTM(32, recurrent_dropout=0.25)(inputs)\n",
|
||||||
|
"x = layers.Dropout(0.5)(x)\n",
|
||||||
|
"outputs = layers.Dense(1)(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"jena_lstm_dropout.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n",
|
||||||
|
"history = model.fit(train_dataset,\n",
|
||||||
|
" epochs=50,\n",
|
||||||
|
" validation_data=val_dataset,\n",
|
||||||
|
" callbacks=callbacks)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(sequence_length, num_features))\n",
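"# unroll=True unrolls the recurrent loop over timesteps: this can speed up\n",
"# recurrent-dropout models on short, fixed-length sequences, at the cost of memory\n",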
"x = layers.LSTM(32, recurrent_dropout=0.2, unroll=True)(inputs)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Stacking recurrent layers"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Training and evaluating a dropout-regularized, stacked GRU model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n",
|
||||||
|
"x = layers.GRU(32, recurrent_dropout=0.5, return_sequences=True)(inputs)\n",
|
||||||
|
"x = layers.GRU(32, recurrent_dropout=0.5)(x)\n",
|
||||||
|
"x = layers.Dropout(0.5)(x)\n",
|
||||||
|
"outputs = layers.Dense(1)(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"jena_stacked_gru_dropout.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n",
|
||||||
|
"history = model.fit(train_dataset,\n",
|
||||||
|
" epochs=50,\n",
|
||||||
|
" validation_data=val_dataset,\n",
|
||||||
|
" callbacks=callbacks)\n",
|
||||||
|
"model = keras.models.load_model(\"jena_stacked_gru_dropout.keras\")\n",
|
||||||
|
"print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Using bidirectional RNNs"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Training and evaluating a bidirectional LSTM**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n",
|
||||||
|
"x = layers.Bidirectional(layers.LSTM(16))(inputs)\n",
|
||||||
|
"outputs = layers.Dense(1)(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n",
|
||||||
|
"history = model.fit(train_dataset,\n",
|
||||||
|
" epochs=10,\n",
|
||||||
|
" validation_data=val_dataset)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
"### Going even further"
]
|
||||||
|
},
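{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**One illustrative follow-up experiment -- a wider dropout-regularized LSTM; the unit count and epoch budget are illustrative guesses, not tuned values**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n",
"x = layers.LSTM(64, recurrent_dropout=0.25)(inputs)\n",
"x = layers.Dropout(0.5)(x)\n",
"outputs = layers.Dense(1)(x)\n",
"model = keras.Model(inputs, outputs)\n",
"model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n",
"history = model.fit(train_dataset,\n",
"                    epochs=10,\n",
"                    validation_data=val_dataset)"
]
},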
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Chapter summary"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "chapter10_dl-for-timeseries.ipynb",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
736
chapter11_part01_introduction.ipynb
Normal file
@@ -0,0 +1,736 @@
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"# Deep learning for text"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Natural Language Processing: the bird's eye view"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Preparing text data"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Text standardization"
|
||||||
|
]
|
||||||
|
},
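{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**A minimal standardization sketch, using only the standard library -- the sample sentence is made up**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import string\n",
"\n",
"def simple_standardize(text):\n",
"    text = text.lower()\n",
"    return \"\".join(char for char in text if char not in string.punctuation)\n",
"\n",
"print(simple_standardize(\"Erase again, and then... A poppy blooms!\"))"
]
},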
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Text splitting (tokenization)"
|
||||||
|
]
|
||||||
|
},
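{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**A minimal tokenization sketch -- word-level splitting and bigram extraction on a made-up sentence**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"sample = \"the cat sat on the mat\"\n",
"words = sample.split()\n",
"bigrams = [\" \".join(words[i:i + 2]) for i in range(len(words) - 1)]\n",
"print(words)\n",
"print(bigrams)"
]
},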
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Vocabulary indexing"
|
||||||
|
]
|
||||||
|
},
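{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**A minimal vocabulary-indexing sketch -- index 0 is reserved for padding/masking and index 1 for out-of-vocabulary tokens, the same convention as the `Vectorizer` class defined below**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"vocabulary = {\"\": 0, \"[UNK]\": 1}\n",
"for token in \"the cat sat on the mat\".split():\n",
"    if token not in vocabulary:\n",
"        vocabulary[token] = len(vocabulary)\n",
"print(vocabulary)\n",
"print([vocabulary.get(token, 1) for token in \"the dog sat\".split()])"
]
},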
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Using the `TextVectorization` layer"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import string\n",
|
||||||
|
"\n",
|
||||||
|
"class Vectorizer:\n",
|
||||||
|
" def standardize(self, text):\n",
|
||||||
|
" text = text.lower()\n",
|
||||||
|
" return \"\".join(char for char in text if char not in string.punctuation)\n",
|
||||||
|
"\n",
|
||||||
|
" def tokenize(self, text):\n",
|
||||||
|
" text = self.standardize(text)\n",
|
||||||
|
" return text.split()\n",
|
||||||
|
"\n",
|
||||||
|
" def make_vocabulary(self, dataset):\n",
|
||||||
|
" self.vocabulary = {\"\": 0, \"[UNK]\": 1}\n",
|
||||||
|
" for text in dataset:\n",
|
||||||
|
" text = self.standardize(text)\n",
|
||||||
|
" tokens = self.tokenize(text)\n",
|
||||||
|
" for token in tokens:\n",
|
||||||
|
" if token not in self.vocabulary:\n",
|
||||||
|
" self.vocabulary[token] = len(self.vocabulary)\n",
|
||||||
|
" self.inverse_vocabulary = dict(\n",
|
||||||
|
" (v, k) for k, v in self.vocabulary.items())\n",
|
||||||
|
"\n",
|
||||||
|
" def encode(self, text):\n",
|
||||||
|
" text = self.standardize(text)\n",
|
||||||
|
" tokens = self.tokenize(text)\n",
|
||||||
|
" return [self.vocabulary.get(token, 1) for token in tokens]\n",
|
||||||
|
"\n",
|
||||||
|
" def decode(self, int_sequence):\n",
|
||||||
|
" return \" \".join(\n",
|
||||||
|
" self.inverse_vocabulary.get(i, \"[UNK]\") for i in int_sequence)\n",
|
||||||
|
"\n",
|
||||||
|
"vectorizer = Vectorizer()\n",
|
||||||
|
"dataset = [\n",
|
||||||
|
" \"I write, erase, rewrite\",\n",
|
||||||
|
" \"Erase again, and then\",\n",
|
||||||
|
" \"A poppy blooms.\",\n",
|
||||||
|
"]\n",
|
||||||
|
"vectorizer.make_vocabulary(dataset)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"test_sentence = \"I write, rewrite, and still rewrite again\"\n",
|
||||||
|
"encoded_sentence = vectorizer.encode(test_sentence)\n",
|
||||||
|
"print(encoded_sentence)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"decoded_sentence = vectorizer.decode(encoded_sentence)\n",
|
||||||
|
"print(decoded_sentence)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from tensorflow.keras.layers.experimental.preprocessing import TextVectorization\n",
|
||||||
|
"text_vectorization = TextVectorization(\n",
|
||||||
|
" output_mode=\"int\",\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import re\n",
|
||||||
|
"import string\n",
|
||||||
|
"import tensorflow as tf\n",
|
||||||
|
"\n",
|
||||||
|
"def custom_standardization_fn(string_tensor):\n",
|
||||||
|
" lowercase_string = tf.strings.lower(string_tensor)\n",
|
||||||
|
" return tf.strings.regex_replace(\n",
|
||||||
|
" lowercase_string, f\"[{re.escape(string.punctuation)}]\", \"\")\n",
|
||||||
|
"\n",
|
||||||
|
"def custom_split_fn(string_tensor):\n",
|
||||||
|
" return tf.strings.split(string_tensor)\n",
|
||||||
|
"\n",
|
||||||
|
"text_vectorization = TextVectorization(\n",
|
||||||
|
" output_mode=\"int\",\n",
|
||||||
|
" standardize=custom_standardization_fn,\n",
|
||||||
|
" split=custom_split_fn,\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"dataset = [\n",
|
||||||
|
" \"I write, erase, rewrite\",\n",
|
||||||
|
" \"Erase again, and then\",\n",
|
||||||
|
" \"A poppy blooms.\",\n",
|
||||||
|
"]\n",
|
||||||
|
"text_vectorization.adapt(dataset)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Displaying the vocabulary**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"text_vectorization.get_vocabulary()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"vocabulary = text_vectorization.get_vocabulary()\n",
|
||||||
|
"test_sentence = \"I write, rewrite, and still rewrite again\"\n",
|
||||||
|
"encoded_sentence = text_vectorization(test_sentence)\n",
|
||||||
|
"print(encoded_sentence)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inverse_vocab = dict(enumerate(vocabulary))\n",
|
||||||
|
"decoded_sentence = \" \".join(inverse_vocab[int(i)] for i in encoded_sentence)\n",
|
||||||
|
"print(decoded_sentence)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Two approaches for representing groups of words: sets and sequences"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Preparing the IMDB movie reviews data"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"!curl -O https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\n",
|
||||||
|
"!tar -xf aclImdb_v1.tar.gz"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"!rm -r aclImdb/train/unsup"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"!cat aclImdb/train/pos/4077_10.txt"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import os, pathlib, shutil, random\n",
|
||||||
|
"\n",
|
||||||
|
"base_dir = pathlib.Path(\"aclImdb\")\n",
|
||||||
|
"val_dir = base_dir / \"val\"\n",
|
||||||
|
"train_dir = base_dir / \"train\"\n",
|
||||||
|
"for category in (\"neg\", \"pos\"):\n",
|
||||||
|
" os.makedirs(val_dir / category)\n",
|
||||||
|
" files = os.listdir(train_dir / category)\n",
|
||||||
|
" random.Random(1337).shuffle(files)\n",
|
||||||
|
" num_val_samples = int(0.2 * len(files))\n",
|
||||||
|
" val_files = files[-num_val_samples:]\n",
|
||||||
|
" for fname in val_files:\n",
|
||||||
|
" shutil.move(train_dir / category / fname,\n",
|
||||||
|
" val_dir / category / fname)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from tensorflow import keras\n",
|
||||||
|
"batch_size = 32\n",
|
||||||
|
"\n",
|
||||||
|
"train_ds = keras.preprocessing.text_dataset_from_directory(\n",
|
||||||
|
" \"aclImdb/train\", batch_size=batch_size\n",
|
||||||
|
")\n",
|
||||||
|
"val_ds = keras.preprocessing.text_dataset_from_directory(\n",
|
||||||
|
" \"aclImdb/val\", batch_size=batch_size\n",
|
||||||
|
")\n",
|
||||||
|
"test_ds = keras.preprocessing.text_dataset_from_directory(\n",
|
||||||
|
" \"aclImdb/test\", batch_size=batch_size\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Displaying the shapes and dtypes of the first batch**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"for inputs, targets in train_ds:\n",
|
||||||
|
" print(\"inputs.shape:\", inputs.shape)\n",
|
||||||
|
" print(\"inputs.dtype:\", inputs.dtype)\n",
|
||||||
|
" print(\"targets.shape:\", targets.shape)\n",
|
||||||
|
" print(\"targets.dtype:\", targets.dtype)\n",
|
||||||
|
" print(\"inputs[0]:\", inputs[0])\n",
|
||||||
|
" print(\"targets[0]:\", targets[0])\n",
|
||||||
|
" break"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Processing words as a set: the bag-of-words approach"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Single words (unigrams) with binary encoding"
|
||||||
|
]
|
||||||
|
},
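{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**What \"binary\" (multi-hot) encoding produces -- a toy example on three made-up documents**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"toy_vectorizer = TextVectorization(max_tokens=10, output_mode=\"binary\")\n",
"toy_vectorizer.adapt([\"the cat\", \"the mat\", \"the cat sat\"])\n",
"print(toy_vectorizer.get_vocabulary())\n",
"print(toy_vectorizer([\"the cat sat on the mat\"]))"
]
},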
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Preprocessing our datasets with a `TextVectorization` layer**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"text_vectorization = TextVectorization(\n",
|
||||||
|
" max_tokens=20000,\n",
|
||||||
|
" output_mode=\"binary\",\n",
|
||||||
|
")\n",
|
||||||
|
"text_only_train_ds = train_ds.map(lambda x, y: x)\n",
|
||||||
|
"text_vectorization.adapt(text_only_train_ds)\n",
|
||||||
|
"\n",
|
||||||
|
"binary_1gram_train_ds = train_ds.map(lambda x, y: (text_vectorization(x), y))\n",
|
||||||
|
"binary_1gram_val_ds = val_ds.map(lambda x, y: (text_vectorization(x), y))\n",
|
||||||
|
"binary_1gram_test_ds = test_ds.map(lambda x, y: (text_vectorization(x), y))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Inspecting the output of our binary unigram dataset**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"for inputs, targets in binary_1gram_train_ds:\n",
|
||||||
|
" print(\"inputs.shape:\", inputs.shape)\n",
|
||||||
|
" print(\"inputs.dtype:\", inputs.dtype)\n",
|
||||||
|
" print(\"targets.shape:\", targets.shape)\n",
|
||||||
|
" print(\"targets.dtype:\", targets.dtype)\n",
|
||||||
|
" print(\"inputs[0]:\", inputs[0])\n",
|
||||||
|
" print(\"targets[0]:\", targets[0])\n",
|
||||||
|
" break"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Our model-building utility**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from tensorflow import keras\n",
|
||||||
|
"from tensorflow.keras import layers\n",
|
||||||
|
"\n",
|
||||||
|
"def get_model(max_tokens=20000, hidden_dim=16):\n",
|
||||||
|
" inputs = keras.Input(shape=(max_tokens,))\n",
|
||||||
|
" x = layers.Dense(hidden_dim, activation=\"relu\")(inputs)\n",
|
||||||
|
" x = layers.Dropout(0.5)(x)\n",
|
||||||
|
" outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
|
||||||
|
" model = keras.Model(inputs, outputs)\n",
|
||||||
|
" model.compile(optimizer=\"rmsprop\",\n",
|
||||||
|
" loss=\"binary_crossentropy\",\n",
|
||||||
|
" metrics=[\"accuracy\"])\n",
|
||||||
|
" return model"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Training and testing the binary unigram model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model = get_model()\n",
|
||||||
|
"model.summary()\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"binary_1gram.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.fit(binary_1gram_train_ds.cache(),\n",
|
||||||
|
" validation_data=binary_1gram_val_ds.cache(),\n",
|
||||||
|
" epochs=10,\n",
|
||||||
|
" callbacks=callbacks)\n",
|
||||||
|
"model = keras.models.load_model(\"binary_1gram.keras\")\n",
|
||||||
|
"print(f\"Test acc: {model.evaluate(binary_1gram_test_ds)[1]:.3f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Bigrams with binary encoding"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Configuring the `TextVectorization` layer to return bigrams**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"text_vectorization = TextVectorization(\n",
|
||||||
|
" ngrams=2,\n",
|
||||||
|
" max_tokens=20000,\n",
|
||||||
|
" output_mode=\"binary\",\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Training and testing the binary bigram model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"text_vectorization.adapt(text_only_train_ds)\n",
|
||||||
|
"binary_2gram_train_ds = train_ds.map(lambda x, y: (text_vectorization(x), y))\n",
|
||||||
|
"binary_2gram_val_ds = val_ds.map(lambda x, y: (text_vectorization(x), y))\n",
|
||||||
|
"binary_2gram_test_ds = test_ds.map(lambda x, y: (text_vectorization(x), y))\n",
|
||||||
|
"\n",
|
||||||
|
"model = get_model()\n",
|
||||||
|
"model.summary()\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"binary_2gram.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.fit(binary_2gram_train_ds.cache(),\n",
|
||||||
|
" validation_data=binary_2gram_val_ds.cache(),\n",
|
||||||
|
" epochs=10,\n",
|
||||||
|
" callbacks=callbacks)\n",
|
||||||
|
"model = keras.models.load_model(\"binary_2gram.keras\")\n",
|
||||||
|
"print(f\"Test acc: {model.evaluate(binary_2gram_test_ds)[1]:.3f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Bigrams with TF-IDF encoding"
|
||||||
|
]
|
||||||
|
},
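{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**How TF-IDF weighting works -- a toy NumPy sketch of one common formulation (Keras's exact weighting differs slightly); the documents are made up**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def tf_idf(term, document, dataset):\n",
"    term_freq = document.count(term)\n",
"    doc_freq = sum(1 for doc in dataset if term in doc)\n",
"    return term_freq * np.log(len(dataset) / (1 + doc_freq))\n",
"\n",
"docs = [[\"the\", \"cat\", \"sat\"], [\"the\", \"mat\"], [\"the\", \"cat\", \"and\", \"the\", \"dog\"]]\n",
"print(tf_idf(\"the\", docs[2], docs))  # appears everywhere: weight at or below zero\n",
"print(tf_idf(\"dog\", docs[2], docs))  # distinctive term: higher weight"
]
},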
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Configuring the `TextVectorization` layer to return token counts**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"text_vectorization = TextVectorization(\n",
|
||||||
|
" ngrams=2,\n",
|
||||||
|
" max_tokens=20000,\n",
|
||||||
|
" output_mode=\"count\"\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Configuring the `TextVectorization` layer to return TF-IDF-weighted outputs**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"text_vectorization = TextVectorization(\n",
|
||||||
|
" ngrams=2,\n",
|
||||||
|
" max_tokens=20000,\n",
|
||||||
|
" output_mode=\"tf-idf\",\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Training and testing the TF-IDF bigram model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"text_vectorization.adapt(text_only_train_ds)\n",
|
||||||
|
"\n",
|
||||||
|
"tfidf_2gram_train_ds = train_ds.map(lambda x, y: (text_vectorization(x), y))\n",
|
||||||
|
"tfidf_2gram_val_ds = val_ds.map(lambda x, y: (text_vectorization(x), y))\n",
|
||||||
|
"tfidf_2gram_test_ds = test_ds.map(lambda x, y: (text_vectorization(x), y))\n",
|
||||||
|
"\n",
|
||||||
|
"model = get_model()\n",
|
||||||
|
"model.summary()\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"tfidf_2gram.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.fit(tfidf_2gram_train_ds.cache(),\n",
|
||||||
|
" validation_data=tfidf_2gram_val_ds.cache(),\n",
|
||||||
|
" epochs=10,\n",
|
||||||
|
" callbacks=callbacks)\n",
|
||||||
|
"model = keras.models.load_model(\"tfidf_2gram.keras\")\n",
|
||||||
|
"print(f\"Test acc: {model.evaluate(tfidf_2gram_test_ds)[1]:.3f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(1,), dtype=\"string\")\n",
|
||||||
|
"processed_inputs = text_vectorization(inputs)\n",
|
||||||
|
"outputs = model(processed_inputs)\n",
|
||||||
|
"inference_model = keras.Model(inputs, outputs)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import tensorflow as tf\n",
|
||||||
|
"raw_text_data = tf.convert_to_tensor([\n",
|
||||||
|
" [\"That was an excellent movie, I loved it.\"],\n",
|
||||||
|
"])\n",
|
||||||
|
"predictions = inference_model(raw_text_data)\n",
|
||||||
|
"print(f\"{float(predictions[0] * 100):.2f} percent positive\")"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "chapter11_part01_introduction.ipynb",
"private_outputs": false,
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
499
chapter11_part02_sequence-models.ipynb
Normal file
@@ -0,0 +1,499 @@
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
"### Processing words as a sequence: the sequence model approach"
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### A first practical example"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Downloading the data**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"!curl -O https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\n",
|
||||||
|
"!tar -xf aclImdb_v1.tar.gz\n",
|
||||||
|
"!rm -r aclImdb/train/unsup"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Preparing the data**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import os, pathlib, shutil, random\n",
|
||||||
|
"from tensorflow import keras\n",
|
||||||
|
"batch_size = 32\n",
|
||||||
|
"base_dir = pathlib.Path(\"aclImdb\")\n",
|
||||||
|
"val_dir = base_dir / \"val\"\n",
|
||||||
|
"train_dir = base_dir / \"train\"\n",
|
||||||
|
"for category in (\"neg\", \"pos\"):\n",
|
||||||
|
" os.makedirs(val_dir / category)\n",
|
||||||
|
" files = os.listdir(train_dir / category)\n",
|
||||||
|
" random.Random(1337).shuffle(files)\n",
|
||||||
|
" num_val_samples = int(0.2 * len(files))\n",
|
||||||
|
" val_files = files[-num_val_samples:]\n",
|
||||||
|
" for fname in val_files:\n",
|
||||||
|
" shutil.move(train_dir / category / fname,\n",
|
||||||
|
" val_dir / category / fname)\n",
|
||||||
|
"\n",
|
||||||
|
"train_ds = keras.preprocessing.text_dataset_from_directory(\n",
|
||||||
|
" \"aclImdb/train\", batch_size=batch_size\n",
|
||||||
|
")\n",
|
||||||
|
"val_ds = keras.preprocessing.text_dataset_from_directory(\n",
|
||||||
|
" \"aclImdb/val\", batch_size=batch_size\n",
|
||||||
|
")\n",
|
||||||
|
"test_ds = keras.preprocessing.text_dataset_from_directory(\n",
|
||||||
|
" \"aclImdb/test\", batch_size=batch_size\n",
|
||||||
|
")\n",
|
||||||
|
"text_only_train_ds = train_ds.map(lambda x, y: x)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Preparing integer sequence datasets**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from tensorflow.keras.layers.experimental.preprocessing import TextVectorization\n",
|
||||||
|
"max_length = 600\n",
|
||||||
|
"max_tokens = 20000\n",
|
||||||
|
"text_vectorization = TextVectorization(\n",
|
||||||
|
" max_tokens=max_tokens,\n",
|
||||||
|
" output_mode=\"int\",\n",
|
||||||
|
" output_sequence_length=max_length,\n",
|
||||||
|
")\n",
|
||||||
|
"text_vectorization.adapt(text_only_train_ds)\n",
|
||||||
|
"\n",
|
||||||
|
"int_train_ds = train_ds.map(lambda x, y: (text_vectorization(x), y))\n",
|
||||||
|
"int_val_ds = val_ds.map(lambda x, y: (text_vectorization(x), y))\n",
|
||||||
|
"int_test_ds = test_ds.map(lambda x, y: (text_vectorization(x), y))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**A sequence model built on top of one-hot encoded vector sequences**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import tensorflow as tf\n",
|
||||||
|
"from tensorflow.keras import layers\n",
|
||||||
|
"inputs = keras.Input(shape=(None,), dtype=\"int64\")\n",
|
||||||
|
"embedded = tf.one_hot(inputs, depth=max_tokens)\n",
|
||||||
|
"x = layers.Bidirectional(layers.LSTM(32))(embedded)\n",
|
||||||
|
"x = layers.Dropout(0.5)(x)\n",
|
||||||
|
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\n",
|
||||||
|
" loss=\"binary_crossentropy\",\n",
|
||||||
|
" metrics=[\"accuracy\"])\n",
|
||||||
|
"model.summary()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Training a first basic sequence model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"one_hot_bidir_lstm.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.fit(int_train_ds, validation_data=int_val_ds, epochs=10, callbacks=callbacks)\n",
|
||||||
|
"model = keras.models.load_model(\"one_hot_bidir_lstm.keras\")\n",
|
||||||
|
"print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Understanding word embeddings"
|
||||||
|
]
|
||||||
|
},
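{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**The geometry of an embedding space -- a toy sketch with made-up 2D vectors, where cosine similarity measures how related two words are**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"toy_embeddings = {\n",
"    \"cat\": np.array([0.9, 0.8]),\n",
"    \"tiger\": np.array([0.85, 0.95]),\n",
"    \"car\": np.array([-0.7, 0.3]),\n",
"}\n",
"\n",
"def cosine_similarity(a, b):\n",
"    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))\n",
"\n",
"print(cosine_similarity(toy_embeddings[\"cat\"], toy_embeddings[\"tiger\"]))  # high: related\n",
"print(cosine_similarity(toy_embeddings[\"cat\"], toy_embeddings[\"car\"]))  # low: unrelated"
]
},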
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"##### Learning word embeddings with the `Embedding` layer"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Instantiating an `Embedding` layer**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"embedding_layer = layers.Embedding(input_dim=max_tokens, output_dim=256)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Model that uses an Embedding layer trained from scratch**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(None,), dtype=\"int64\")\n",
|
||||||
|
"embedded = layers.Embedding(input_dim=max_tokens, output_dim=256)(inputs)\n",
|
||||||
|
"x = layers.Bidirectional(layers.LSTM(32))(embedded)\n",
|
||||||
|
"x = layers.Dropout(0.5)(x)\n",
|
||||||
|
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\n",
|
||||||
|
" loss=\"binary_crossentropy\",\n",
|
||||||
|
" metrics=[\"accuracy\"])\n",
|
||||||
|
"model.summary()\n",
|
||||||
|
"\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"embeddings_bidir_gru.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.fit(int_train_ds, validation_data=int_val_ds, epochs=10, callbacks=callbacks)\n",
|
||||||
|
"model = keras.models.load_model(\"embeddings_bidir_gru.keras\")\n",
|
||||||
|
"print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"###### Understanding padding & masking"
|
||||||
|
]
|
||||||
|
},
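{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**What masking computes -- a toy sketch: with `mask_zero=True`, the `Embedding` layer reports which positions are real tokens and which are padding**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"padded_ids = tf.constant([[5, 2, 7, 0, 0],\n",
"                          [3, 0, 0, 0, 0]])\n",
"masking_layer = layers.Embedding(input_dim=10, output_dim=4, mask_zero=True)\n",
"print(masking_layer.compute_mask(padded_ids))"
]
},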
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Model that uses an Embedding layer trained from scratch, with masking enabled**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(None,), dtype=\"int64\")\n",
|
||||||
|
"embedded = layers.Embedding(\n",
|
||||||
|
" input_dim=max_tokens, output_dim=256, mask_zero=True)(inputs)\n",
|
||||||
|
"x = layers.Bidirectional(layers.LSTM(32))(embedded)\n",
|
||||||
|
"x = layers.Dropout(0.5)(x)\n",
|
||||||
|
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\n",
|
||||||
|
" loss=\"binary_crossentropy\",\n",
|
||||||
|
" metrics=[\"accuracy\"])\n",
|
||||||
|
"model.summary()\n",
|
||||||
|
"\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"embeddings_bidir_gru_with_masking.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.fit(int_train_ds, validation_data=int_val_ds, epochs=10, callbacks=callbacks)\n",
|
||||||
|
"model = keras.models.load_model(\"embeddings_bidir_gru_with_masking.keras\")\n",
|
||||||
|
"print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"##### Using pretrained word embeddings"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"###### Downloading the GloVe word embeddings"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"!wget http://nlp.stanford.edu/data/glove.6B.zip\n",
|
||||||
|
"!unzip -q glove.6B.zip"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Parsing the GloVe word-embeddings file**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import numpy as np\n",
|
||||||
|
"path_to_glove_file = \"glove.6B.100d.txt\"\n",
|
||||||
|
"\n",
|
||||||
|
"embeddings_index = {}\n",
|
||||||
|
"with open(path_to_glove_file) as f:\n",
|
||||||
|
" for line in f:\n",
|
||||||
|
" word, coefs = line.split(maxsplit=1)\n",
|
||||||
|
" coefs = np.fromstring(coefs, \"f\", sep=\" \")\n",
|
||||||
|
" embeddings_index[word] = coefs\n",
|
||||||
|
"\n",
|
||||||
|
"print(f\"Found {len(embeddings_index)} word vectors.\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"###### Loading the GloVe embeddings in the model"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Preparing the GloVe word-embeddings matrix**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"embedding_dim = 100\n",
|
||||||
|
"\n",
|
||||||
|
"vocabulary = text_vectorization.get_vocabulary()\n",
|
||||||
|
"word_index = dict(zip(vocabulary, range(len(vocabulary))))\n",
|
||||||
|
"\n",
|
||||||
|
"embedding_matrix = np.zeros((max_tokens, embedding_dim))\n",
|
||||||
|
"for word, i in word_index.items():\n",
|
||||||
|
" if i < max_tokens:\n",
|
||||||
|
" embedding_vector = embeddings_index.get(word)\n",
|
||||||
|
" if embedding_vector is not None:\n",
|
||||||
|
" embedding_matrix[i] = embedding_vector"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"embedding_layer = layers.Embedding(\n",
|
||||||
|
" max_tokens,\n",
|
||||||
|
" embedding_dim,\n",
|
||||||
|
" embeddings_initializer=keras.initializers.Constant(embedding_matrix),\n",
|
||||||
|
" trainable=False,\n",
|
||||||
|
" mask_zero=True,\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"###### Training a simple bidirectional LSTM on top of the GloVe embeddings"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Model that uses a pretrained Embedding layer**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(None,), dtype=\"int64\")\n",
|
||||||
|
"embedded = embedding_layer(inputs)\n",
|
||||||
|
"x = layers.Bidirectional(layers.LSTM(32))(embedded)\n",
|
||||||
|
"x = layers.Dropout(0.5)(x)\n",
|
||||||
|
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\n",
|
||||||
|
" loss=\"binary_crossentropy\",\n",
|
||||||
|
" metrics=[\"accuracy\"])\n",
|
||||||
|
"model.summary()\n",
|
||||||
|
"\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"glove_embeddings_sequence_model.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.fit(int_train_ds, validation_data=int_val_ds, epochs=10, callbacks=callbacks)\n",
|
||||||
|
"model = keras.models.load_model(\"glove_embeddings_sequence_model.keras\")\n",
|
||||||
|
"print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"colab": {
|
||||||
|
"collapsed_sections": [],
|
||||||
|
"name": "chapter11_part02_sequence-models.i",
|
||||||
|
"private_outputs": false,
|
||||||
|
"provenance": [],
|
||||||
|
"toc_visible": true
|
||||||
|
},
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.7.0"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 0
|
||||||
|
}
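A quick aside on the `mask_zero=True` flag that both models above rely on: the Embedding layer computes a boolean mask marking which timesteps are real tokens versus padding, and Keras propagates that mask to mask-aware layers such as the bidirectional LSTM. A minimal sketch of the mechanism (illustrative only; the toy vocabulary and batch below are assumptions, not part of the notebook):

    import numpy as np
    from tensorflow.keras import layers

    embedding = layers.Embedding(input_dim=10, output_dim=4, mask_zero=True)
    toy_batch = np.array([[5, 3, 0, 0],
                          [7, 0, 0, 0]])  # index 0 is the padding token
    outputs = embedding(toy_batch)        # shape (2, 4, 4)
    # compute_mask flags real timesteps True and padded ones False;
    # downstream mask-aware layers skip the False positions.
    print(embedding.compute_mask(toy_batch))
    # [[ True  True False False]
    #  [ True False False False]]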
425
chapter11_part03_transformer.ipynb
Normal file
@@ -0,0 +1,425 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## The Transformer architecture"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Understanding self-attention"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Generalized self-attention: the query-key-value model"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Multi-Head attention"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### The Transformer encoder"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Getting the data**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"!curl -O https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\n",
"!tar -xf aclImdb_v1.tar.gz\n",
"!rm -r aclImdb/train/unsup"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Preparing the data**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import os, pathlib, shutil, random\n",
"from tensorflow import keras\n",
"batch_size = 32\n",
"base_dir = pathlib.Path(\"aclImdb\")\n",
"val_dir = base_dir / \"val\"\n",
"train_dir = base_dir / \"train\"\n",
"for category in (\"neg\", \"pos\"):\n",
"    os.makedirs(val_dir / category)\n",
"    files = os.listdir(train_dir / category)\n",
"    random.Random(1337).shuffle(files)\n",
"    num_val_samples = int(0.2 * len(files))\n",
"    val_files = files[-num_val_samples:]\n",
"    for fname in val_files:\n",
"        shutil.move(train_dir / category / fname,\n",
"                    val_dir / category / fname)\n",
"\n",
"train_ds = keras.preprocessing.text_dataset_from_directory(\n",
"    \"aclImdb/train\", batch_size=batch_size\n",
")\n",
"val_ds = keras.preprocessing.text_dataset_from_directory(\n",
"    \"aclImdb/val\", batch_size=batch_size\n",
")\n",
"test_ds = keras.preprocessing.text_dataset_from_directory(\n",
"    \"aclImdb/test\", batch_size=batch_size\n",
")\n",
"text_only_train_ds = train_ds.map(lambda x, y: x)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Vectorizing the data**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow.keras.layers.experimental.preprocessing import TextVectorization\n",
"max_length = 600\n",
"max_tokens = 20000\n",
"text_vectorization = TextVectorization(\n",
"    max_tokens=max_tokens,\n",
"    output_mode=\"int\",\n",
"    output_sequence_length=max_length,\n",
")\n",
"text_vectorization.adapt(text_only_train_ds)\n",
"\n",
"int_train_ds = train_ds.map(lambda x, y: (text_vectorization(x), y))\n",
"int_val_ds = val_ds.map(lambda x, y: (text_vectorization(x), y))\n",
"int_test_ds = test_ds.map(lambda x, y: (text_vectorization(x), y))"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Transformer encoder implemented as a subclassed Layer**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"from tensorflow import keras\n",
"from tensorflow.keras import layers\n",
"\n",
"class TransformerEncoder(layers.Layer):\n",
"    def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):\n",
"        super().__init__(**kwargs)\n",
"        self.embed_dim = embed_dim\n",
"        self.dense_dim = dense_dim\n",
"        self.num_heads = num_heads\n",
"        self.attention = layers.MultiHeadAttention(\n",
"            num_heads=num_heads, key_dim=embed_dim)\n",
"        self.dense_proj = keras.Sequential(\n",
"            [layers.Dense(dense_dim, activation=\"relu\"),\n",
"             layers.Dense(embed_dim),]\n",
"        )\n",
"        self.layernorm_1 = layers.LayerNormalization()\n",
"        self.layernorm_2 = layers.LayerNormalization()\n",
"\n",
"    def call(self, inputs, mask=None):\n",
"        if mask is not None:\n",
|
||||||
|
" mask = mask[:, tf.newaxis, :]\n",
|
||||||
|
" attention_output = self.attention(\n",
|
||||||
|
" inputs, inputs, attention_mask=mask)\n",
|
||||||
|
" proj_input = self.layernorm_1(inputs + attention_output)\n",
|
||||||
|
" proj_output = self.dense_proj(proj_input)\n",
|
||||||
|
" return self.layernorm_2(proj_input + proj_output)\n",
|
||||||
|
"\n",
|
||||||
|
" def get_config(self):\n",
|
||||||
|
" config = super(TransformerEncoder, self).get_config()\n",
|
||||||
|
" config.update({\n",
|
||||||
|
" \"embed_dim\": self.embed_dim,\n",
|
||||||
|
" \"num_heads\": self.num_heads,\n",
|
||||||
|
" \"dense_dim\": self.dense_dim,\n",
|
||||||
|
" })\n",
|
||||||
|
" return config"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Text classification model that combines the Transformer encoder and a pooling layer**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"vocab_size = 20000\n",
|
||||||
|
"embed_dim = 256\n",
|
||||||
|
"num_heads = 2\n",
|
||||||
|
"dense_dim = 32\n",
|
||||||
|
"\n",
|
||||||
|
"inputs = keras.Input(shape=(None,), dtype=\"int64\")\n",
|
||||||
|
"x = layers.Embedding(vocab_size, embed_dim)(inputs)\n",
|
||||||
|
"x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n",
|
||||||
|
"x = layers.GlobalMaxPooling1D()(x)\n",
|
||||||
|
"x = layers.Dropout(0.5)(x)\n",
|
||||||
|
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\n",
|
||||||
|
" loss=\"binary_crossentropy\",\n",
|
||||||
|
" metrics=[\"accuracy\"])\n",
|
||||||
|
"model.summary()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Training and evaluating the Transformer encoder based model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"transformer_encoder.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.fit(int_train_ds, validation_data=int_val_ds, epochs=20, callbacks=callbacks)\n",
|
||||||
|
"model = keras.models.load_model(\n",
|
||||||
|
" \"transformer_encoder.keras\",\n",
|
||||||
|
" custom_objects={\"TransformerEncoder\": TransformerEncoder})\n",
|
||||||
|
"print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Using positional encoding to reinject order information"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Implementing positional embedding as a subclassed layer**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"class PositionalEmbedding(layers.Layer):\n",
|
||||||
|
" def __init__(self, sequence_length, input_dim, output_dim, **kwargs):\n",
|
||||||
|
" super().__init__(**kwargs)\n",
|
||||||
|
" self.token_embeddings = layers.Embedding(\n",
|
||||||
|
" input_dim=input_dim, output_dim=output_dim)\n",
|
||||||
|
" self.position_embeddings = layers.Embedding(\n",
|
||||||
|
" input_dim=sequence_length, output_dim=output_dim)\n",
|
||||||
|
" self.sequence_length = sequence_length\n",
|
||||||
|
" self.input_dim = input_dim\n",
|
||||||
|
" self.output_dim = output_dim\n",
|
||||||
|
"\n",
|
||||||
|
" def call(self, inputs):\n",
|
||||||
|
" length = tf.shape(inputs)[-1]\n",
|
||||||
|
" positions = tf.range(start=0, limit=length, delta=1)\n",
|
||||||
|
" embedded_tokens = self.token_embeddings(inputs)\n",
|
||||||
|
" embedded_positions = self.position_embeddings(positions)\n",
|
||||||
|
" return embedded_tokens + embedded_positions\n",
|
||||||
|
"\n",
|
||||||
|
" def compute_mask(self, inputs, mask=None):\n",
|
||||||
|
" return tf.math.not_equal(inputs, 0)\n",
|
||||||
|
"\n",
|
||||||
|
" def get_config(self):\n",
|
||||||
|
" config = super(PositionalEmbedding, self).get_config()\n",
|
||||||
|
" config.update({\n",
|
||||||
|
" \"output_dim\": self.output_dim,\n",
|
||||||
|
" \"sequence_length\": self.sequence_length,\n",
|
||||||
|
" \"input_dim\": self.input_dim,\n",
|
||||||
|
" })\n",
|
||||||
|
" return config"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Putting it all together: a text-classification Transformer"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Text classification model that combines positional embedding, the Transformer encoder, and a pooling layer**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"vocab_size = 20000\n",
|
||||||
|
"sequence_length = 600\n",
|
||||||
|
"embed_dim = 256\n",
|
||||||
|
"num_heads = 2\n",
|
||||||
|
"dense_dim = 32\n",
|
||||||
|
"\n",
|
||||||
|
"inputs = keras.Input(shape=(None,), dtype=\"int64\")\n",
|
||||||
|
"x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)\n",
|
||||||
|
"x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n",
|
||||||
|
"x = layers.GlobalMaxPooling1D()(x)\n",
|
||||||
|
"x = layers.Dropout(0.5)(x)\n",
|
||||||
|
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\n",
|
||||||
|
" loss=\"binary_crossentropy\",\n",
|
||||||
|
" metrics=[\"accuracy\"])\n",
|
||||||
|
"model.summary()\n",
|
||||||
|
"\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.ModelCheckpoint(\"full_transformer_encoder.keras\",\n",
|
||||||
|
" save_best_only=True)\n",
|
||||||
|
"]\n",
|
||||||
|
"model.fit(int_train_ds, validation_data=int_val_ds, epochs=20, callbacks=callbacks)\n",
|
||||||
|
"model = keras.models.load_model(\n",
|
||||||
|
" \"full_transformer_encoder.keras\",\n",
|
||||||
|
" custom_objects={\"TransformerEncoder\": TransformerEncoder,\n",
|
||||||
|
" \"PositionalEmbedding\": PositionalEmbedding})\n",
|
||||||
|
"print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### When to use sequence models over bag-of-words models?"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"colab": {
|
||||||
|
"collapsed_sections": [],
|
||||||
|
"name": "chapter11_part03_transformer.i",
|
||||||
|
"private_outputs": false,
|
||||||
|
"provenance": [],
|
||||||
|
"toc_visible": true
|
||||||
|
},
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.7.0"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 0
|
||||||
|
}
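The "query-key-value" section above is only a title in the notebook. As a rough illustration of the computation that `layers.MultiHeadAttention` performs inside each head, here is scaled dot-product attention in plain NumPy (a sketch under the assumption that query, key, and value all come from the same sequence, i.e. self-attention; the shapes and random data are made up):

    import numpy as np

    def scaled_dot_product_attention(query, key, value):
        # scores[i, j]: affinity between query position i and key position j
        scores = query @ key.T / np.sqrt(key.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        # each output row is a weighted average of the value vectors
        return weights @ value

    seq = np.random.rand(5, 8)  # 5 tokens, 8-dimensional embeddings
    out = scaled_dot_product_attention(seq, seq, seq)  # Q = K = V: self-attention
    print(out.shape)  # (5, 8)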
622
chapter11_part04_sequence-to-sequence-learning.ipynb
Normal file
@@ -0,0 +1,622 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Beyond text classification: sequence-to-sequence learning"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### A machine translation example"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"!wget http://storage.googleapis.com/download.tensorflow.org/data/spa-eng.zip\n",
"!unzip -q spa-eng.zip"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"text_file = \"spa-eng/spa.txt\"\n",
"with open(text_file) as f:\n",
"    lines = f.read().split(\"\\n\")[:-1]\n",
"text_pairs = []\n",
"for line in lines:\n",
"    english, spanish = line.split(\"\\t\")\n",
"    spanish = \"[start] \" + spanish + \" [end]\"\n",
"    text_pairs.append((english, spanish))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import random\n",
"print(random.choice(text_pairs))"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import random\n",
"random.shuffle(text_pairs)\n",
"num_val_samples = int(0.15 * len(text_pairs))\n",
"num_train_samples = len(text_pairs) - 2 * num_val_samples\n",
"train_pairs = text_pairs[:num_train_samples]\n",
"val_pairs = text_pairs[num_train_samples:num_train_samples + num_val_samples]\n",
"test_pairs = text_pairs[num_train_samples + num_val_samples:]"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Vectorizing the English and Spanish text pairs**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from keras.layers.experimental.preprocessing import TextVectorization\n",
"import tensorflow as tf\n",
"import string\n",
"import re\n",
"\n",
"strip_chars = string.punctuation + \"\u00bf\"\n",
"strip_chars = strip_chars.replace(\"[\", \"\")\n",
"strip_chars = strip_chars.replace(\"]\", \"\")\n",
"\n",
"def custom_standardization(input_string):\n",
"    lowercase = tf.strings.lower(input_string)\n",
"    return tf.strings.regex_replace(\n",
"        lowercase, f\"[{re.escape(strip_chars)}]\", \"\")\n",
"\n",
"vocab_size = 15000\n",
"sequence_length = 20\n",
"\n",
"source_vectorization = TextVectorization(\n",
"    max_tokens=vocab_size,\n",
"    output_mode=\"int\",\n",
"    output_sequence_length=sequence_length,\n",
")\n",
"target_vectorization = TextVectorization(\n",
"    max_tokens=vocab_size,\n",
"    output_mode=\"int\",\n",
"    output_sequence_length=sequence_length + 1,\n",
"    standardize=custom_standardization,\n",
")\n",
"train_english_texts = [pair[0] for pair in train_pairs]\n",
"train_spanish_texts = [pair[1] for pair in train_pairs]\n",
"source_vectorization.adapt(train_english_texts)\n",
"target_vectorization.adapt(train_spanish_texts)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Preparing training and validation datasets for the translation task**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"batch_size = 64\n",
"\n",
"def format_dataset(eng, spa):\n",
"    eng = source_vectorization(eng)\n",
"    spa = target_vectorization(spa)\n",
"    return ({\n",
"        \"english\": eng,\n",
"        \"spanish\": spa[:, :-1],\n",
"    }, spa[:, 1:])\n",
"\n",
"def make_dataset(pairs):\n",
"    eng_texts, spa_texts = zip(*pairs)\n",
"    eng_texts = list(eng_texts)\n",
"    spa_texts = list(spa_texts)\n",
"    dataset = tf.data.Dataset.from_tensor_slices((eng_texts, spa_texts))\n",
"    dataset = dataset.batch(batch_size)\n",
"    dataset = dataset.map(format_dataset)\n",
"    return dataset.shuffle(2048).prefetch(16).cache()\n",
"\n",
"train_ds = make_dataset(train_pairs)\n",
"val_ds = make_dataset(val_pairs)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"for inputs, targets in train_ds.take(1):\n",
"    print(f\"inputs['english'].shape: {inputs['english'].shape}\")\n",
"    print(f\"inputs['spanish'].shape: {inputs['spanish'].shape}\")\n",
"    print(f\"targets.shape: {targets.shape}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Sequence-to-sequence learning with RNNs"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**GRU-based encoder**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow import keras\n",
"from tensorflow.keras import layers\n",
"\n",
"embed_dim = 256\n",
"latent_dim = 1024\n",
"\n",
"source = keras.Input(shape=(None,), dtype=\"int64\", name=\"english\")\n",
"x = layers.Embedding(vocab_size, embed_dim, mask_zero=True)(source)\n",
"encoded_source = layers.Bidirectional(\n",
"    layers.GRU(latent_dim), merge_mode=\"sum\")(x)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**GRU-based decoder and the end-to-end model**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"past_target = keras.Input(shape=(None,), dtype=\"int64\", name=\"spanish\")\n",
"x = layers.Embedding(vocab_size, embed_dim, mask_zero=True)(past_target)\n",
"decoder_gru = layers.GRU(latent_dim, return_sequences=True)\n",
"x = decoder_gru(x, initial_state=encoded_source)\n",
"x = layers.Dropout(0.5)(x)\n",
"target_next_step = layers.Dense(vocab_size, activation=\"softmax\")(x)\n",
"seq2seq_rnn = keras.Model([source, past_target], target_next_step)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Training our recurrent sequence-to-sequence model**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"seq2seq_rnn.compile(\n",
"    optimizer=\"rmsprop\",\n",
"    loss=\"sparse_categorical_crossentropy\",\n",
"    metrics=[\"accuracy\"])\n",
"seq2seq_rnn.fit(train_ds, epochs=15, validation_data=val_ds)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Translating new sentences with our RNN encoder and decoder**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"spa_vocab = target_vectorization.get_vocabulary()\n",
"spa_index_lookup = dict(zip(range(len(spa_vocab)), spa_vocab))\n",
"max_decoded_sentence_length = 20\n",
"\n",
"def decode_sequence(input_sentence):\n",
"    tokenized_input_sentence = source_vectorization([input_sentence])\n",
"    decoded_sentence = \"[start]\"\n",
"    for i in range(max_decoded_sentence_length):\n",
"        tokenized_target_sentence = target_vectorization([decoded_sentence])\n",
"        next_token_predictions = seq2seq_rnn.predict(\n",
"            [tokenized_input_sentence, tokenized_target_sentence])\n",
|
||||||
|
" sampled_token_index = np.argmax(next_token_predictions[0, i, :])\n",
|
||||||
|
" sampled_token = spa_index_lookup[sampled_token_index]\n",
|
||||||
|
" decoded_sentence += \" \" + sampled_token\n",
|
||||||
|
" if sampled_token == \"[end]\":\n",
|
||||||
|
" break\n",
|
||||||
|
" return decoded_sentence\n",
|
||||||
|
"\n",
|
||||||
|
"test_eng_texts = [pair[0] for pair in test_pairs]\n",
|
||||||
|
"for _ in range(20):\n",
|
||||||
|
" input_sentence = random.choice(test_eng_texts)\n",
|
||||||
|
" print(\"-\")\n",
|
||||||
|
" print(input_sentence)\n",
|
||||||
|
" print(decode_sequence(input_sentence))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Sequence-to-sequence learning with Transformer"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### The Transformer decoder"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**The TransformerDecoder**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"class TransformerDecoder(layers.Layer):\n",
|
||||||
|
" def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):\n",
|
||||||
|
" super().__init__(**kwargs)\n",
|
||||||
|
" self.embed_dim = embed_dim\n",
|
||||||
|
" self.dense_dim = dense_dim\n",
|
||||||
|
" self.num_heads = num_heads\n",
|
||||||
|
" self.attention_1 = layers.MultiHeadAttention(\n",
|
||||||
|
" num_heads=num_heads, key_dim=embed_dim)\n",
|
||||||
|
" self.attention_2 = layers.MultiHeadAttention(\n",
|
||||||
|
" num_heads=num_heads, key_dim=embed_dim)\n",
|
||||||
|
" self.dense_proj = keras.Sequential(\n",
|
||||||
|
" [layers.Dense(dense_dim, activation=\"relu\"),\n",
|
||||||
|
" layers.Dense(embed_dim),]\n",
|
||||||
|
" )\n",
|
||||||
|
" self.layernorm_1 = layers.LayerNormalization()\n",
|
||||||
|
" self.layernorm_2 = layers.LayerNormalization()\n",
|
||||||
|
" self.layernorm_3 = layers.LayerNormalization()\n",
|
||||||
|
" self.supports_masking = True\n",
|
||||||
|
"\n",
|
||||||
|
" def get_config(self):\n",
|
||||||
|
" config = super(TransformerDecoder, self).get_config()\n",
|
||||||
|
" config.update({\n",
|
||||||
|
" \"embed_dim\": self.embed_dim,\n",
|
||||||
|
" \"num_heads\": self.num_heads,\n",
|
||||||
|
" \"dense_dim\": self.dense_dim,\n",
|
||||||
|
" })\n",
|
||||||
|
" return config\n",
|
||||||
|
"\n",
|
||||||
|
" def get_causal_attention_mask(self, inputs):\n",
|
||||||
|
" input_shape = tf.shape(inputs)\n",
|
||||||
|
" batch_size, sequence_length = input_shape[0], input_shape[1]\n",
|
||||||
|
" i = tf.range(sequence_length)[:, tf.newaxis]\n",
|
||||||
|
" j = tf.range(sequence_length)\n",
|
||||||
|
" mask = tf.cast(i >= j, dtype=\"int32\")\n",
|
||||||
|
" mask = tf.reshape(mask, (1, input_shape[1], input_shape[1]))\n",
|
||||||
|
" mult = tf.concat(\n",
|
||||||
|
" [tf.expand_dims(batch_size, -1),\n",
|
||||||
|
" tf.constant([1, 1], dtype=tf.int32)], axis=0)\n",
|
||||||
|
" return tf.tile(mask, mult)\n",
|
||||||
|
"\n",
|
||||||
|
" def call(self, inputs, encoder_outputs, mask=None):\n",
|
||||||
|
" causal_mask = self.get_causal_attention_mask(inputs)\n",
|
||||||
|
" if mask is not None:\n",
|
||||||
|
" padding_mask = tf.cast(\n",
|
||||||
|
" mask[:, tf.newaxis, :], dtype=\"int32\")\n",
|
||||||
|
" padding_mask = tf.minimum(padding_mask, causal_mask)\n",
|
||||||
|
" attention_output_1 = self.attention_1(\n",
|
||||||
|
" query=inputs,\n",
|
||||||
|
" value=inputs,\n",
|
||||||
|
" key=inputs,\n",
|
||||||
|
" attention_mask=causal_mask)\n",
|
||||||
|
" attention_output_1 = self.layernorm_1(inputs + attention_output_1)\n",
|
||||||
|
" attention_output_2 = self.attention_2(\n",
|
||||||
|
" query=attention_output_1,\n",
|
||||||
|
" value=encoder_outputs,\n",
|
||||||
|
" key=encoder_outputs,\n",
|
||||||
|
" attention_mask=padding_mask,\n",
|
||||||
|
" )\n",
|
||||||
|
" attention_output_2 = self.layernorm_2(\n",
|
||||||
|
" attention_output_1 + attention_output_2)\n",
|
||||||
|
" proj_output = self.dense_proj(attention_output_2)\n",
|
||||||
|
" return self.layernorm_3(attention_output_2 + proj_output)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Putting it all together: a Transformer for machine translation"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**PositionalEmbedding layer**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"class PositionalEmbedding(layers.Layer):\n",
|
||||||
|
" def __init__(self, sequence_length, input_dim, output_dim, **kwargs):\n",
|
||||||
|
" super().__init__(**kwargs)\n",
|
||||||
|
" self.token_embeddings = layers.Embedding(\n",
|
||||||
|
" input_dim=input_dim, output_dim=output_dim)\n",
|
||||||
|
" self.position_embeddings = layers.Embedding(\n",
|
||||||
|
" input_dim=sequence_length, output_dim=output_dim)\n",
|
||||||
|
" self.sequence_length = sequence_length\n",
|
||||||
|
" self.input_dim = input_dim\n",
|
||||||
|
" self.output_dim = output_dim\n",
|
||||||
|
"\n",
|
||||||
|
" def call(self, inputs):\n",
|
||||||
|
" length = tf.shape(inputs)[-1]\n",
|
||||||
|
" positions = tf.range(start=0, limit=length, delta=1)\n",
|
||||||
|
" embedded_tokens = self.token_embeddings(inputs)\n",
|
||||||
|
" embedded_positions = self.position_embeddings(positions)\n",
|
||||||
|
" return embedded_tokens + embedded_positions\n",
|
||||||
|
"\n",
|
||||||
|
" def compute_mask(self, inputs, mask=None):\n",
|
||||||
|
" return tf.math.not_equal(inputs, 0)\n",
|
||||||
|
"\n",
|
||||||
|
" def get_config(self):\n",
|
||||||
|
" config = super(PositionalEmbedding, self).get_config()\n",
|
||||||
|
" config.update({\n",
|
||||||
|
" \"output_dim\": self.output_dim,\n",
|
||||||
|
" \"sequence_length\": self.sequence_length,\n",
|
||||||
|
" \"input_dim\": self.input_dim,\n",
|
||||||
|
" })\n",
|
||||||
|
" return config"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**End-to-end Transformer**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"embed_dim = 256\n",
|
||||||
|
"dense_dim = 2048\n",
|
||||||
|
"num_heads = 8\n",
|
||||||
|
"\n",
|
||||||
|
"encoder_inputs = keras.Input(shape=(None,), dtype=\"int64\", name=\"english\")\n",
|
||||||
|
"x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)\n",
|
||||||
|
"encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n",
|
||||||
|
"\n",
|
||||||
|
"decoder_inputs = keras.Input(shape=(None,), dtype=\"int64\", name=\"spanish\")\n",
|
||||||
|
"x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)\n",
|
||||||
|
"x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)\n",
|
||||||
|
"x = layers.Dropout(0.5)(x)\n",
|
||||||
|
"decoder_outputs = layers.Dense(vocab_size, activation=\"softmax\")(x)\n",
|
||||||
|
"transformer = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Training the sequence-to-sequence Transformer**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"transformer.compile(\n",
|
||||||
|
" optimizer=\"rmsprop\",\n",
|
||||||
|
" loss=\"sparse_categorical_crossentropy\",\n",
|
||||||
|
" metrics=[\"accuracy\"])\n",
|
||||||
|
"transformer.fit(train_ds, epochs=30, validation_data=val_ds)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Translating new sentences with our Transformer model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import numpy as np\n",
|
||||||
|
"spa_vocab = target_vectorization.get_vocabulary()\n",
|
||||||
|
"spa_index_lookup = dict(zip(range(len(spa_vocab)), spa_vocab))\n",
|
||||||
|
"max_decoded_sentence_length = 20\n",
|
||||||
|
"\n",
|
||||||
|
"def decode_sequence(input_sentence):\n",
|
||||||
|
" tokenized_input_sentence = source_vectorization([input_sentence])\n",
|
||||||
|
" decoded_sentence = \"[start]\"\n",
|
||||||
|
" for i in range(max_decoded_sentence_length):\n",
|
||||||
|
" tokenized_target_sentence = target_vectorization(\n",
|
||||||
|
" [decoded_sentence])[:, :-1]\n",
|
||||||
|
" predictions = transformer(\n",
|
||||||
|
" [tokenized_input_sentence, tokenized_target_sentence])\n",
|
||||||
|
" sampled_token_index = np.argmax(predictions[0, i, :])\n",
|
||||||
|
" sampled_token = spa_index_lookup[sampled_token_index]\n",
|
||||||
|
" decoded_sentence += \" \" + sampled_token\n",
|
||||||
|
" if sampled_token == \"[end]\":\n",
|
||||||
|
" break\n",
|
||||||
|
" return decoded_sentence\n",
|
||||||
|
"\n",
|
||||||
|
"test_eng_texts = [pair[0] for pair in test_pairs]\n",
|
||||||
|
"for _ in range(20):\n",
|
||||||
|
" input_sentence = random.choice(test_eng_texts)\n",
|
||||||
|
" print(\"-\")\n",
|
||||||
|
" print(input_sentence)\n",
|
||||||
|
" print(decode_sequence(input_sentence))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Chapter summary"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"colab": {
|
||||||
|
"collapsed_sections": [],
|
||||||
|
"name": "chapter11_part04_sequence-to-sequence-learning.i",
|
||||||
|
"private_outputs": false,
|
||||||
|
"provenance": [],
|
||||||
|
"toc_visible": true
|
||||||
|
},
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.7.0"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 0
|
||||||
|
}
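For reference, the lower-triangular pattern that `get_causal_attention_mask` builds (so that each target position can only attend to itself and earlier positions) can be reproduced in a few lines of NumPy; a sketch with an assumed sequence length of 4:

    import numpy as np

    sequence_length = 4
    i = np.arange(sequence_length)[:, None]  # column vector of query positions
    j = np.arange(sequence_length)           # row vector of key positions
    print((i >= j).astype("int32"))
    # [[1 0 0 0]
    #  [1 1 0 0]
    #  [1 1 1 0]
    #  [1 1 1 1]]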
477
chapter12_part01_text-generation.ipynb
Normal file
@@ -0,0 +1,477 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"# Generative deep learning"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"## Text generation"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### A brief history of generative deep learning for sequence generation"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### How do you generate sequence data?"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### The importance of the sampling strategy"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Reweighting a probability distribution to a different temperature**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"def reweight_distribution(original_distribution, temperature=0.5):\n",
"    distribution = np.log(original_distribution) / temperature\n",
"    distribution = np.exp(distribution)\n",
"    return distribution / np.sum(distribution)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"### Implementing text generation with Keras"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### Preparing the data"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Downloading and uncompressing the IMDB movie reviews dataset**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"!wget https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\n",
"!tar -xf aclImdb_v1.tar.gz"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Creating a Dataset that yields the content of a set of text files (one file = one sample)**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"from tensorflow import keras\n",
"dataset = keras.preprocessing.text_dataset_from_directory(\n",
"    directory=\"aclImdb\", label_mode=None, batch_size=256)\n",
"dataset = dataset.map(lambda x: tf.strings.regex_replace(x, \"<br />\", \" \"))"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Preparing a TextVectorization layer**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow.keras.layers.experimental.preprocessing import TextVectorization\n",
"\n",
"sequence_length = 100\n",
"vocab_size = 15000\n",
"text_vectorization = TextVectorization(\n",
"    max_tokens=vocab_size,\n",
"    output_mode=\"int\",\n",
"    output_sequence_length=sequence_length,\n",
")\n",
"text_vectorization.adapt(dataset)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"**Setting up a language modeling dataset**"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"def prepare_lm_dataset(text_batch):\n",
"    vectorized_sequences = text_vectorization(text_batch)\n",
"    x = vectorized_sequences[:, :-1]\n",
"    y = vectorized_sequences[:, 1:]\n",
"    return x, y\n",
"\n",
"lm_dataset = dataset.map(prepare_lm_dataset)\n",
"lm_dataset = lm_dataset.prefetch(8)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"#### A Transformer-based sequence-to-sequence model"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"from tensorflow.keras import layers\n",
"\n",
"class PositionalEmbedding(layers.Layer):\n",
"    def __init__(self, sequence_length, input_dim, output_dim, **kwargs):\n",
"        super().__init__(**kwargs)\n",
"        self.token_embeddings = layers.Embedding(\n",
"            input_dim=input_dim, output_dim=output_dim)\n",
"        self.position_embeddings = layers.Embedding(\n",
"            input_dim=sequence_length, output_dim=output_dim)\n",
"        self.sequence_length = sequence_length\n",
"        self.input_dim = input_dim\n",
"        self.output_dim = output_dim\n",
"\n",
"    def call(self, inputs):\n",
"        length = tf.shape(inputs)[-1]\n",
"        positions = tf.range(start=0, limit=length, delta=1)\n",
"        embedded_tokens = self.token_embeddings(inputs)\n",
"        embedded_positions = self.position_embeddings(positions)\n",
"        return embedded_tokens + embedded_positions\n",
"\n",
"    def compute_mask(self, inputs, mask=None):\n",
"        return tf.math.not_equal(inputs, 0)\n",
"\n",
"    def get_config(self):\n",
"        config = super(PositionalEmbedding, self).get_config()\n",
"        config.update({\n",
"            \"output_dim\": self.output_dim,\n",
"            \"sequence_length\": self.sequence_length,\n",
"            \"input_dim\": self.input_dim,\n",
"        })\n",
"        return config\n",
"\n",
"\n",
"class TransformerDecoder(layers.Layer):\n",
"    def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):\n",
"        super().__init__(**kwargs)\n",
"        self.embed_dim = embed_dim\n",
"        self.dense_dim = dense_dim\n",
"        self.num_heads = num_heads\n",
"        self.attention_1 = layers.MultiHeadAttention(\n",
"            num_heads=num_heads, key_dim=embed_dim)\n",
"        self.attention_2 = layers.MultiHeadAttention(\n",
"            num_heads=num_heads, key_dim=embed_dim)\n",
"        self.dense_proj = keras.Sequential(\n",
"            [layers.Dense(dense_dim, activation=\"relu\"),\n",
"             layers.Dense(embed_dim),]\n",
"        )\n",
"        self.layernorm_1 = layers.LayerNormalization()\n",
"        self.layernorm_2 = layers.LayerNormalization()\n",
"        self.layernorm_3 = layers.LayerNormalization()\n",
"        self.supports_masking = True\n",
"\n",
"    def get_config(self):\n",
"        config = super(TransformerDecoder, self).get_config()\n",
"        config.update({\n",
"            \"embed_dim\": self.embed_dim,\n",
"            \"num_heads\": self.num_heads,\n",
"            \"dense_dim\": self.dense_dim,\n",
"        })\n",
"        return config\n",
"\n",
"    def get_causal_attention_mask(self, inputs):\n",
"        input_shape = tf.shape(inputs)\n",
"        batch_size, sequence_length = input_shape[0], input_shape[1]\n",
"        i = tf.range(sequence_length)[:, tf.newaxis]\n",
"        j = tf.range(sequence_length)\n",
"        mask = tf.cast(i >= j, dtype=\"int32\")\n",
"        mask = tf.reshape(mask, (1, input_shape[1], input_shape[1]))\n",
"        mult = tf.concat(\n",
"            [tf.expand_dims(batch_size, -1),\n",
"             tf.constant([1, 1], dtype=tf.int32)], axis=0)\n",
"        return tf.tile(mask, mult)\n",
"\n",
|
||||||
|
" def call(self, inputs, encoder_outputs, mask=None):\n",
|
||||||
|
" causal_mask = self.get_causal_attention_mask(inputs)\n",
|
||||||
|
" if mask is not None:\n",
|
||||||
|
" padding_mask = tf.cast(\n",
|
||||||
|
" mask[:, tf.newaxis, :], dtype=\"int32\")\n",
|
||||||
|
" padding_mask = tf.minimum(padding_mask, causal_mask)\n",
|
||||||
|
" attention_output_1 = self.attention_1(\n",
|
||||||
|
" query=inputs,\n",
|
||||||
|
" value=inputs,\n",
|
||||||
|
" key=inputs,\n",
|
||||||
|
" attention_mask=causal_mask)\n",
|
||||||
|
" attention_output_1 = self.layernorm_1(inputs + attention_output_1)\n",
|
||||||
|
" attention_output_2 = self.attention_2(\n",
|
||||||
|
" query=attention_output_1,\n",
|
||||||
|
" value=encoder_outputs,\n",
|
||||||
|
" key=encoder_outputs,\n",
|
||||||
|
" attention_mask=padding_mask,\n",
|
||||||
|
" )\n",
|
||||||
|
" attention_output_2 = self.layernorm_2(\n",
|
||||||
|
" attention_output_1 + attention_output_2)\n",
|
||||||
|
" proj_output = self.dense_proj(attention_output_2)\n",
|
||||||
|
" return self.layernorm_3(attention_output_2 + proj_output)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**A simple Transformer-based language model**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from tensorflow.keras import layers\n",
|
||||||
|
"embed_dim = 256\n",
|
||||||
|
"latent_dim = 2048\n",
|
||||||
|
"num_heads = 2\n",
|
||||||
|
"\n",
|
||||||
|
"inputs = keras.Input(shape=(None,), dtype=\"int64\")\n",
|
||||||
|
"x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)\n",
|
||||||
|
"x = TransformerDecoder(embed_dim, latent_dim, num_heads)(x, x)\n",
|
||||||
|
"outputs = layers.Dense(vocab_size, activation=\"softmax\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=\"rmsprop\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### A text-generation callback with variable-temperature sampling"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**The text generation callback**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import numpy as np\n",
|
||||||
|
"\n",
|
||||||
|
"tokens_index = dict(enumerate(text_vectorization.get_vocabulary()))\n",
|
||||||
|
"\n",
|
||||||
|
"def sample_next(predictions, temperature=1.0):\n",
|
||||||
|
" predictions = np.asarray(predictions).astype(\"float64\")\n",
|
||||||
|
" predictions = np.log(predictions) / temperature\n",
|
||||||
|
" exp_preds = np.exp(predictions)\n",
|
||||||
|
" predictions = exp_preds / np.sum(exp_preds)\n",
|
||||||
|
" probas = np.random.multinomial(1, predictions, 1)\n",
|
||||||
|
" return np.argmax(probas)\n",
|
||||||
|
"\n",
|
||||||
|
"class TextGenerator(keras.callbacks.Callback):\n",
|
||||||
|
"\n",
|
||||||
|
" def __init__(self,\n",
|
||||||
|
" prompt,\n",
|
||||||
|
" generate_length,\n",
|
||||||
|
" model_input_length,\n",
|
||||||
|
" temperatures=(1.,),\n",
|
||||||
|
" print_freq=1):\n",
|
||||||
|
" self.prompt = prompt\n",
|
||||||
|
" self.generate_length = generate_length\n",
|
||||||
|
" self.model_input_length = model_input_length\n",
|
||||||
|
" self.temperatures = temperatures\n",
|
||||||
|
" self.print_freq = print_freq\n",
|
||||||
|
"\n",
|
||||||
|
" def on_epoch_end(self, epoch, logs=None):\n",
|
||||||
|
" if (epoch + 1) % self.print_freq != 0:\n",
|
||||||
|
" return\n",
|
||||||
|
" for temperature in self.temperatures:\n",
|
||||||
|
" print(\"== Generating with temperature\", temperature)\n",
|
||||||
|
" sentence = self.prompt\n",
|
||||||
|
" for i in range(self.generate_length):\n",
|
||||||
|
" tokenized_sentence = text_vectorization([sentence])\n",
|
||||||
|
" predictions = self.model(tokenized_sentence)\n",
|
||||||
|
" next_token = sample_next(predictions[0, i, :])\n",
|
||||||
|
" sampled_token = tokens_index[next_token]\n",
|
||||||
|
" sentence += \" \" + sampled_token\n",
|
||||||
|
" print(sentence)\n",
|
||||||
|
"\n",
|
||||||
|
"prompt = \"This movie\"\n",
|
||||||
|
"text_gen_callback = TextGenerator(\n",
|
||||||
|
" prompt,\n",
|
||||||
|
" generate_length=50,\n",
|
||||||
|
" model_input_length=sequence_length,\n",
|
||||||
|
" temperatures=(0.2, 0.5, 0.7, 1., 1.5))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
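Lower temperatures concentrate probability mass on the most likely tokens, while higher temperatures flatten the distribution toward uniform. A minimal sketch of that effect on a toy distribution (illustrative only, not part of the book's code; `rescale` is a hypothetical helper):

```python
# Illustrative only: how temperature reshapes a 3-token distribution.
import numpy as np

def rescale(distribution, temperature):
    logits = np.log(np.asarray(distribution, dtype="float64")) / temperature
    weights = np.exp(logits)
    return weights / np.sum(weights)

for t in (0.2, 1.0, 1.5):
    print(t, rescale([0.6, 0.3, 0.1], t))
# t=0.2 is nearly one-hot on the first token; t=1.5 is much flatter.
```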
**Fitting the language model**

```python
model.fit(lm_dataset, epochs=200, callbacks=[text_gen_callback])
```

### Wrapping up
245
chapter12_part02_deep-dream.ipynb
Normal file
@@ -0,0 +1,245 @@
This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.

**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**

This notebook was generated for TensorFlow 2.6.

## DeepDream

### Implementing DeepDream in Keras

```python
from tensorflow import keras
import matplotlib.pyplot as plt

base_image_path = keras.utils.get_file(
    "coast.jpg", origin="https://img-datasets.s3.amazonaws.com/coast.jpg")

plt.axis("off")
plt.imshow(keras.preprocessing.image.load_img(base_image_path))
```

```python
from tensorflow.keras.applications import inception_v3
model = inception_v3.InceptionV3(weights="imagenet", include_top=False)
```

```python
layer_settings = {
    "mixed4": 1.0,
    "mixed5": 1.5,
    "mixed6": 2.0,
    "mixed7": 2.5,
}
outputs_dict = dict(
    [
        (layer.name, layer.output)
        for layer in [model.get_layer(name) for name in layer_settings.keys()]
    ]
)
feature_extractor = keras.Model(inputs=model.inputs, outputs=outputs_dict)
```

```python
def compute_loss(input_image):
    features = feature_extractor(input_image)
    loss = tf.zeros(shape=())
    for name in features.keys():
        coeff = layer_settings[name]
        activation = features[name]
        loss += coeff * tf.reduce_mean(tf.square(activation[:, 2:-2, 2:-2, :]))
    return loss
```

```python
import tensorflow as tf

@tf.function
def gradient_ascent_step(image, learning_rate):
    with tf.GradientTape() as tape:
        tape.watch(image)
        loss = compute_loss(image)
    grads = tape.gradient(loss, image)
    grads = tf.math.l2_normalize(grads)
    image += learning_rate * grads
    return loss, image


def gradient_ascent_loop(image, iterations, learning_rate, max_loss=None):
    for i in range(iterations):
        loss, image = gradient_ascent_step(image, learning_rate)
        if max_loss is not None and loss > max_loss:
            break
        print(f"... Loss value at step {i}: {loss:.2f}")
    return image
```

```python
step = 20.
num_octave = 3
octave_scale = 1.4
iterations = 30
max_loss = 15.
```
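With these settings, each successive octave shrinks the image dimensions by a factor of `octave_scale`. A quick sanity check of the shapes this produces (illustrative only, not part of the book's code; the 900x600 input size is a made-up example):

```python
# Illustrative only: the three octave shapes for a hypothetical 900x600 image.
octave_scale = 1.4
num_octave = 3
original_shape = (900, 600)
shapes = [tuple(int(dim / (octave_scale ** i)) for dim in original_shape)
          for i in range(num_octave)]
print(shapes[::-1])  # smallest first: [(459, 306), (642, 428), (900, 600)]
```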
```python
import numpy as np

def preprocess_image(image_path):
    img = keras.preprocessing.image.load_img(image_path)
    img = keras.preprocessing.image.img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = keras.applications.inception_v3.preprocess_input(img)
    return img

def deprocess_image(img):
    img = img.reshape((img.shape[1], img.shape[2], 3))
    img /= 2.0
    img += 0.5
    img *= 255.
    img = np.clip(img, 0, 255).astype("uint8")
    return img
```

```python
original_img = preprocess_image(base_image_path)
original_shape = original_img.shape[1:3]

successive_shapes = [original_shape]
for i in range(1, num_octave):
    shape = tuple([int(dim / (octave_scale ** i)) for dim in original_shape])
    successive_shapes.append(shape)
successive_shapes = successive_shapes[::-1]

shrunk_original_img = tf.image.resize(original_img, successive_shapes[0])

img = tf.identity(original_img)
for i, shape in enumerate(successive_shapes):
    print(f"Processing octave {i} with shape {shape}")
    img = tf.image.resize(img, shape)
    img = gradient_ascent_loop(
        img, iterations=iterations, learning_rate=step, max_loss=max_loss
    )
    upscaled_shrunk_original_img = tf.image.resize(shrunk_original_img, shape)
    same_size_original = tf.image.resize(original_img, shape)
    lost_detail = same_size_original - upscaled_shrunk_original_img
    img += lost_detail
    shrunk_original_img = tf.image.resize(original_img, shape)

keras.preprocessing.image.save_img("dream.png", deprocess_image(img.numpy()))
```

### Wrapping up
356
chapter12_part03_neural-style-transfer.ipynb
Normal file
@@ -0,0 +1,356 @@
This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.

**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**

This notebook was generated for TensorFlow 2.6.

## Neural style transfer

### The content loss

### The style loss

### Neural style transfer in Keras

**Getting the style and content images**

```python
from tensorflow import keras

base_image_path = keras.utils.get_file(
    "sf.jpg", origin="https://img-datasets.s3.amazonaws.com/sf.jpg")
style_reference_image_path = keras.utils.get_file(
    "starry_night.jpg", origin="https://img-datasets.s3.amazonaws.com/starry_night.jpg")

original_width, original_height = keras.preprocessing.image.load_img(base_image_path).size
img_height = 400
img_width = round(original_width * img_height / original_height)
```

**Auxiliary functions**

```python
import numpy as np

def preprocess_image(image_path):
    img = keras.preprocessing.image.load_img(
        image_path, target_size=(img_height, img_width))
    img = keras.preprocessing.image.img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = keras.applications.vgg19.preprocess_input(img)
    return img

def deprocess_image(img):
    img = img.reshape((img_height, img_width, 3))
    img[:, :, 0] += 103.939
    img[:, :, 1] += 116.779
    img[:, :, 2] += 123.68
    img = img[:, :, ::-1]
    img = np.clip(img, 0, 255).astype("uint8")
    return img
```

**Loading the pretrained VGG19 network and using it to define a feature extractor**

```python
model = keras.applications.vgg19.VGG19(weights="imagenet", include_top=False)

outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])
feature_extractor = keras.Model(inputs=model.inputs, outputs=outputs_dict)
```

**Content loss**

```python
def content_loss(base_img, combination_img):
    return tf.reduce_sum(tf.square(combination_img - base_img))
```

**Style loss**

```python
def gram_matrix(x):
    x = tf.transpose(x, (2, 0, 1))
    features = tf.reshape(x, (tf.shape(x)[0], -1))
    gram = tf.matmul(features, tf.transpose(features))
    return gram

def style_loss(style_img, combination_img):
    S = gram_matrix(style_img)
    C = gram_matrix(combination_img)
    channels = 3
    size = img_height * img_width
    return tf.reduce_sum(tf.square(S - C)) / (4.0 * (channels ** 2) * (size ** 2))
```
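The Gram matrix is the matrix of inner products between the flattened channel maps, so its shape depends only on the channel count. A quick shape check (illustrative only, not part of the book's code):

```python
# Illustrative only: gram_matrix of a random height x width x channels feature map.
import tensorflow as tf

features = tf.random.uniform((4, 4, 3))   # 3 channels
print(gram_matrix(features).shape)        # (3, 3): channel-to-channel correlations
```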
**Total variation loss**

```python
def total_variation_loss(x):
    a = tf.square(
        x[:, : img_height - 1, : img_width - 1, :] - x[:, 1:, : img_width - 1, :]
    )
    b = tf.square(
        x[:, : img_height - 1, : img_width - 1, :] - x[:, : img_height - 1, 1:, :]
    )
    return tf.reduce_sum(tf.pow(a + b, 1.25))
```

**Defining the final loss that you'll minimize**

```python
style_layer_names = [
    "block1_conv1",
    "block2_conv1",
    "block3_conv1",
    "block4_conv1",
    "block5_conv1",
]
content_layer_name = "block5_conv2"
total_variation_weight = 1e-6
style_weight = 1e-6
content_weight = 2.5e-8

def compute_loss(combination_image, base_image, style_reference_image):
    input_tensor = tf.concat(
        [base_image, style_reference_image, combination_image], axis=0
    )
    features = feature_extractor(input_tensor)
    loss = tf.zeros(shape=())
    layer_features = features[content_layer_name]
    base_image_features = layer_features[0, :, :, :]
    combination_features = layer_features[2, :, :, :]
    loss = loss + content_weight * content_loss(
        base_image_features, combination_features
    )
    for layer_name in style_layer_names:
        layer_features = features[layer_name]
        style_reference_features = layer_features[1, :, :, :]
        combination_features = layer_features[2, :, :, :]
        style_loss_value = style_loss(
            style_reference_features, combination_features)
        loss += (style_weight / len(style_layer_names)) * style_loss_value

    loss += total_variation_weight * total_variation_loss(combination_image)
    return loss
```

**Setting up the gradient-descent process**

```python
import tensorflow as tf

@tf.function
def compute_loss_and_grads(combination_image, base_image, style_reference_image):
    with tf.GradientTape() as tape:
        loss = compute_loss(combination_image, base_image, style_reference_image)
    grads = tape.gradient(loss, combination_image)
    return loss, grads

optimizer = keras.optimizers.SGD(
    keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=100.0, decay_steps=100, decay_rate=0.96
    )
)

base_image = preprocess_image(base_image_path)
style_reference_image = preprocess_image(style_reference_image_path)
combination_image = tf.Variable(preprocess_image(base_image_path))

iterations = 4000
for i in range(1, iterations + 1):
    loss, grads = compute_loss_and_grads(
        combination_image, base_image, style_reference_image
    )
    optimizer.apply_gradients([(grads, combination_image)])
    if i % 100 == 0:
        print(f"Iteration {i}: loss={loss:.2f}")
        img = deprocess_image(combination_image.numpy())
        fname = f"combination_image_at_iteration_{i}.png"
        keras.preprocessing.image.save_img(fname, img)
```

### Wrapping up
339
chapter12_part04_variational-autoencoders.ipynb
Normal file
@@ -0,0 +1,339 @@
This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.

**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**

This notebook was generated for TensorFlow 2.6.

## Generating images with variational autoencoders

### Sampling from latent spaces of images

### Concept vectors for image editing

### Variational autoencoders

### Implementing a VAE with Keras

**VAE encoder network**

```python
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 2

encoder_inputs = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(32, 3, activation="relu", strides=2, padding="same")(encoder_inputs)
x = layers.Conv2D(64, 3, activation="relu", strides=2, padding="same")(x)
x = layers.Flatten()(x)
x = layers.Dense(16, activation="relu")(x)
z_mean = layers.Dense(latent_dim, name="z_mean")(x)
z_log_var = layers.Dense(latent_dim, name="z_log_var")(x)
encoder = keras.Model(encoder_inputs, [z_mean, z_log_var], name="encoder")
```

```python
encoder.summary()
```

**Latent-space-sampling layer**

```python
import tensorflow as tf

class Sampler(layers.Layer):
    def call(self, z_mean, z_log_var):
        batch_size = tf.shape(z_mean)[0]
        z_size = tf.shape(z_mean)[1]
        epsilon = tf.random.normal(shape=(batch_size, z_size))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
```
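`Sampler` implements the reparameterization trick, z = z_mean + exp(0.5 * z_log_var) * epsilon, which keeps the sampling step differentiable with respect to the encoder outputs. A quick check (illustrative only, not part of the book's code): with zero mean and zero log-variance it reduces to standard normal draws.

```python
# Illustrative only: sampling a batch of 4 two-dimensional latent points.
z_mean = tf.zeros((4, 2))
z_log_var = tf.zeros((4, 2))
z = Sampler()(z_mean, z_log_var)
print(z.shape)  # (4, 2); entries are draws from a standard normal
```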
**VAE decoder network, mapping latent space points to images**

```python
latent_inputs = keras.Input(shape=(latent_dim,))
x = layers.Dense(7 * 7 * 64, activation="relu")(latent_inputs)
x = layers.Reshape((7, 7, 64))(x)
x = layers.Conv2DTranspose(64, 3, activation="relu", strides=2, padding="same")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu", strides=2, padding="same")(x)
decoder_outputs = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)
decoder = keras.Model(latent_inputs, decoder_outputs, name="decoder")
```

```python
decoder.summary()
```

**VAE model with custom `train_step()`**

```python
class VAE(keras.Model):
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder
        self.decoder = decoder
        self.sampler = Sampler()
        self.total_loss_tracker = keras.metrics.Mean(name="total_loss")
        self.reconstruction_loss_tracker = keras.metrics.Mean(
            name="reconstruction_loss")
        self.kl_loss_tracker = keras.metrics.Mean(name="kl_loss")

    @property
    def metrics(self):
        return [self.total_loss_tracker,
                self.reconstruction_loss_tracker,
                self.kl_loss_tracker]

    def train_step(self, data):
        with tf.GradientTape() as tape:
            z_mean, z_log_var = self.encoder(data)
            z = self.sampler(z_mean, z_log_var)
            reconstruction = self.decoder(z)
            reconstruction_loss = tf.reduce_mean(
                tf.reduce_sum(
                    keras.losses.binary_crossentropy(data, reconstruction),
                    axis=(1, 2)
                )
            )
            kl_loss = -0.5 * (1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))
            total_loss = reconstruction_loss + tf.reduce_mean(kl_loss)
        grads = tape.gradient(total_loss, self.trainable_weights)
        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
        self.total_loss_tracker.update_state(total_loss)
        self.reconstruction_loss_tracker.update_state(reconstruction_loss)
        self.kl_loss_tracker.update_state(kl_loss)
        return {
            "total_loss": self.total_loss_tracker.result(),
            "reconstruction_loss": self.reconstruction_loss_tracker.result(),
            "kl_loss": self.kl_loss_tracker.result(),
        }
```

**Training the VAE**

```python
import numpy as np

(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()
mnist_digits = np.concatenate([x_train, x_test], axis=0)
mnist_digits = np.expand_dims(mnist_digits, -1).astype("float32") / 255

vae = VAE(encoder, decoder)
vae.compile(optimizer=keras.optimizers.Adam(), run_eagerly=True)
vae.fit(mnist_digits, epochs=30, batch_size=128)
```

**Sampling a grid of points from the 2D latent space and decoding them to images**

```python
import matplotlib.pyplot as plt

n = 30
digit_size = 28
figure = np.zeros((digit_size * n, digit_size * n))

grid_x = np.linspace(-1, 1, n)
grid_y = np.linspace(-1, 1, n)[::-1]

for i, yi in enumerate(grid_y):
    for j, xi in enumerate(grid_x):
        z_sample = np.array([[xi, yi]])
        x_decoded = vae.decoder.predict(z_sample)
        digit = x_decoded[0].reshape(digit_size, digit_size)
        figure[
            i * digit_size : (i + 1) * digit_size,
            j * digit_size : (j + 1) * digit_size,
        ] = digit

plt.figure(figsize=(15, 15))
start_range = digit_size // 2
end_range = n * digit_size + start_range
pixel_range = np.arange(start_range, end_range, digit_size)
sample_range_x = np.round(grid_x, 1)
sample_range_y = np.round(grid_y, 1)
plt.xticks(pixel_range, sample_range_x)
plt.yticks(pixel_range, sample_range_y)
plt.xlabel("z[0]")
plt.ylabel("z[1]")
plt.axis("off")
plt.imshow(figure, cmap="Greys_r")
```

### Wrapping up
441
chapter12_part05_gans.ipynb
Normal file
@@ -0,0 +1,441 @@
This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.

**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**

This notebook was generated for TensorFlow 2.6.

## Introduction to generative adversarial networks

### A schematic GAN implementation

### A bag of tricks

### Getting our hands on the CelebA dataset

**Getting the CelebA data**

```python
!mkdir celeba_gan
!gdown --id 1O7m1010EJjLE5QxLZiM9Fpjs7Oj6e684 -O celeba_gan/data.zip
!unzip -qq celeba_gan/data.zip -d celeba_gan
```

**Creating a Dataset from a directory of images**

```python
from tensorflow import keras
dataset = keras.preprocessing.image_dataset_from_directory(
    "celeba_gan",
    label_mode=None,
    image_size=(64, 64),
    batch_size=32,
    smart_resize=True)
```

**Rescaling the images**

```python
dataset = dataset.map(lambda x: x / 255.)
```

**Displaying the first image**

```python
import matplotlib.pyplot as plt
for x in dataset:
    plt.axis("off")
    plt.imshow((x.numpy() * 255).astype("int32")[0])
    break
```

### The discriminator

**The GAN discriminator network**

```python
from tensorflow.keras import layers

discriminator = keras.Sequential(
    [
        keras.Input(shape=(64, 64, 3)),
        layers.Conv2D(64, kernel_size=4, strides=2, padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2D(128, kernel_size=4, strides=2, padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2D(128, kernel_size=4, strides=2, padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Flatten(),
        layers.Dropout(0.2),
        layers.Dense(1, activation="sigmoid"),
    ],
    name="discriminator",
)
```

```python
discriminator.summary()
```

### The generator

**GAN generator network**

```python
latent_dim = 128

generator = keras.Sequential(
    [
        keras.Input(shape=(latent_dim,)),
        layers.Dense(8 * 8 * 128),
        layers.Reshape((8, 8, 128)),
        layers.Conv2DTranspose(128, kernel_size=4, strides=2, padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2DTranspose(256, kernel_size=4, strides=2, padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2DTranspose(512, kernel_size=4, strides=2, padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2D(3, kernel_size=5, padding="same", activation="sigmoid"),
    ],
    name="generator",
)
```

```python
generator.summary()
```

### The adversarial network

**The GAN Model**

```python
import tensorflow as tf  # needed for tf.shape, tf.concat, and tf.random below

class GAN(keras.Model):
    def __init__(self, discriminator, generator, latent_dim):
        super().__init__()
        self.discriminator = discriminator
        self.generator = generator
        self.latent_dim = latent_dim
        self.d_loss_metric = keras.metrics.Mean(name="d_loss")
        self.g_loss_metric = keras.metrics.Mean(name="g_loss")

    def compile(self, d_optimizer, g_optimizer, loss_fn):
        super(GAN, self).compile()
        self.d_optimizer = d_optimizer
        self.g_optimizer = g_optimizer
        self.loss_fn = loss_fn

    @property
    def metrics(self):
        return [self.d_loss_metric, self.g_loss_metric]

    def train_step(self, real_images):
        batch_size = tf.shape(real_images)[0]
        random_latent_vectors = tf.random.normal(shape=(batch_size, self.latent_dim))
        generated_images = self.generator(random_latent_vectors)
        combined_images = tf.concat([generated_images, real_images], axis=0)
        labels = tf.concat(
            [tf.ones((batch_size, 1)), tf.zeros((batch_size, 1))], axis=0
        )
        labels += 0.05 * tf.random.uniform(tf.shape(labels))

        with tf.GradientTape() as tape:
            predictions = self.discriminator(combined_images)
            d_loss = self.loss_fn(labels, predictions)
        grads = tape.gradient(d_loss, self.discriminator.trainable_weights)
        self.d_optimizer.apply_gradients(
            zip(grads, self.discriminator.trainable_weights)
        )

        random_latent_vectors = tf.random.normal(
            shape=(batch_size, self.latent_dim))

        misleading_labels = tf.zeros((batch_size, 1))

        with tf.GradientTape() as tape:
            predictions = self.discriminator(self.generator(random_latent_vectors))
            g_loss = self.loss_fn(misleading_labels, predictions)
        grads = tape.gradient(g_loss, self.generator.trainable_weights)
        self.g_optimizer.apply_gradients(zip(grads, self.generator.trainable_weights))

        self.d_loss_metric.update_state(d_loss)
        self.g_loss_metric.update_state(g_loss)
        return {"d_loss": self.d_loss_metric.result(), "g_loss": self.g_loss_metric.result()}
```
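Since `train_step()` runs the full two-phase protocol (train the discriminator on a real-plus-fake batch, then train the generator against the frozen discriminator's judgments), it can be smoke-tested on a single batch before committing to a long `fit()`. Illustrative only, not part of the book's code; `gan_test` is a throwaway instance:

```python
# Illustrative only: one manual training step on the first batch of the dataset.
gan_test = GAN(discriminator=discriminator, generator=generator, latent_dim=latent_dim)
gan_test.compile(
    d_optimizer=keras.optimizers.Adam(learning_rate=0.0001),
    g_optimizer=keras.optimizers.Adam(learning_rate=0.0001),
    loss_fn=keras.losses.BinaryCrossentropy(),
)
for real_images in dataset.take(1):
    print(gan_test.train_step(real_images))  # {'d_loss': ..., 'g_loss': ...}
```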
**A callback to sample generated images during training**

```python
class GANMonitor(keras.callbacks.Callback):
    def __init__(self, num_img=3, latent_dim=128):
        self.num_img = num_img
        self.latent_dim = latent_dim

    def on_epoch_end(self, epoch, logs=None):
        random_latent_vectors = tf.random.normal(shape=(self.num_img, self.latent_dim))
        generated_images = self.model.generator(random_latent_vectors)
        generated_images *= 255
        generated_images = generated_images.numpy()
        for i in range(self.num_img):
            img = keras.preprocessing.image.array_to_img(generated_images[i])
            img.save(f"generated_img_{epoch:03d}_{i}.png")
```

**Compiling and training the GAN**

```python
epochs = 100

gan = GAN(discriminator=discriminator, generator=generator, latent_dim=latent_dim)
gan.compile(
    d_optimizer=keras.optimizers.Adam(learning_rate=0.0001),
    g_optimizer=keras.optimizers.Adam(learning_rate=0.0001),
    loss_fn=keras.losses.BinaryCrossentropy(),
)

gan.fit(
    dataset, epochs=epochs, callbacks=[GANMonitor(num_img=10, latent_dim=latent_dim)]
)
```

### Wrapping up

## Chapter summary
465
chapter13_best-practices-for-the-real-world.ipynb
Normal file
@@ -0,0 +1,465 @@
This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.

**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**

This notebook was generated for TensorFlow 2.6.

# Best practices for the real world

## Getting the most out of your models

### Hyperparameter optimization

#### Using KerasTuner

```python
!pip install keras-tuner -q
```

**A KerasTuner model-building function**

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
    units = hp.Int(name="units", min_value=16, max_value=64, step=16)
    model = keras.Sequential([
        layers.Dense(units, activation="relu"),
        layers.Dense(10, activation="softmax")
    ])
    optimizer = hp.Choice(name="optimizer", values=["rmsprop", "adam"])
    model.compile(
        optimizer=optimizer,
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"])
    return model
```
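Outside of a search, `build_model()` can be called directly with a `HyperParameters` object to materialize one concrete model from the search space; each `hp.Int`/`hp.Choice` then returns its default value. A quick check (illustrative only, not part of the book's code):

```python
# Illustrative only: build one concrete model from the search space.
import kerastuner as kt

hp = kt.HyperParameters()
model = build_model(hp)
print(hp.values)  # e.g. {'units': 16, 'optimizer': 'rmsprop'} -- the defaults
```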
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**A KerasTuner HyperModel**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import kerastuner as kt\n",
|
||||||
|
"\n",
|
||||||
|
"class SimpleMLP(kt.HyperModel):\n",
|
||||||
|
" def __init__(self, num_classes):\n",
|
||||||
|
" self.num_classes = num_classes\n",
|
||||||
|
"\n",
|
||||||
|
" def build(self, hp):\n",
|
||||||
|
" units = hp.Int(name=\"units\", min_value=16, max_value=64, step=16)\n",
|
||||||
|
" model = keras.Sequential([\n",
|
||||||
|
" layers.Dense(units, activation=\"relu\"),\n",
|
||||||
|
" layers.Dense(self.num_classes, activation=\"softmax\")\n",
|
||||||
|
" ])\n",
|
||||||
|
" optimizer = hp.Choice(name=\"optimizer\", values=[\"rmsprop\", \"adam\"])\n",
|
||||||
|
" model.compile(\n",
|
||||||
|
" optimizer=optimizer,\n",
|
||||||
|
" loss=\"sparse_categorical_crossentropy\",\n",
|
||||||
|
" metrics=[\"accuracy\"])\n",
|
||||||
|
" return model\n",
|
||||||
|
"\n",
|
||||||
|
"hypermodel = SimpleMLP(num_classes=10)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"tuner = kt.BayesianOptimization(\n",
|
||||||
|
" build_model,\n",
|
||||||
|
" objective=\"val_accuracy\",\n",
|
||||||
|
" max_trials=100,\n",
|
||||||
|
" executions_per_trial=2,\n",
|
||||||
|
" directory=\"mnist_kt_test\",\n",
|
||||||
|
" overwrite=True,\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"tuner.search_space_summary()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()\n",
|
||||||
|
"x_train = x_train.reshape((-1, 28 * 28)).astype(\"float32\") / 255\n",
|
||||||
|
"x_test = x_test.reshape((-1, 28 * 28)).astype(\"float32\") / 255\n",
|
||||||
|
"x_train_full = x_train[:]\n",
|
||||||
|
"y_train_full = y_train[:]\n",
|
||||||
|
"num_val_samples = 10000\n",
|
||||||
|
"x_train, x_val = x_train[:-num_val_samples], x_train[-num_val_samples:]\n",
|
||||||
|
"y_train, y_val = y_train[:-num_val_samples], y_train[-num_val_samples:]\n",
|
||||||
|
"callbacks = [\n",
|
||||||
|
" keras.callbacks.EarlyStopping(monitor=\"val_loss\", patience=5),\n",
|
||||||
|
"]\n",
|
||||||
|
"tuner.search(\n",
|
||||||
|
" x_train, y_train,\n",
|
||||||
|
" batch_size=128,\n",
|
||||||
|
" epochs=100,\n",
|
||||||
|
" validation_data=(x_val, y_val),\n",
|
||||||
|
" callbacks=callbacks,\n",
|
||||||
|
" verbose=2,\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"**Querying the best hyperparameter configurations**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"top_n = 4\n",
|
||||||
|
"best_hps = tuner.get_best_hyperparameters(top_n)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"def get_best_epoch(hp):\n",
|
||||||
|
" model = build_model(hp)\n",
|
||||||
|
" callbacks=[\n",
|
||||||
|
" keras.callbacks.EarlyStopping(\n",
|
||||||
|
" monitor=\"val_loss\", mode=\"min\", patience=10)\n",
|
||||||
|
" ]\n",
|
||||||
|
" history = model.fit(\n",
|
||||||
|
" x_train, y_train,\n",
|
||||||
|
" validation_data=(x_val, y_val),\n",
|
||||||
|
" epochs=100,\n",
|
||||||
|
" batch_size=128,\n",
|
||||||
|
" callbacks=callbacks)\n",
|
||||||
|
" val_loss_per_epoch = history.history[\"val_loss\"]\n",
|
||||||
|
" best_epoch = val_loss_per_epoch.index(min(val_loss_per_epoch)) + 1\n",
|
||||||
|
" print(f\"Best epoch: {best_epoch}\")\n",
|
||||||
|
" return best_epoch"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"def get_best_trained_model(hp):\n",
|
||||||
|
" best_epoch = get_best_epoch(hp)\n",
|
||||||
|
" model.fit(\n",
|
||||||
|
" x_train_full, y_train_full,\n",
|
||||||
|
" batch_size=128, epochs=int(best_epoch * 1.2))\n",
|
||||||
|
" return model\n",
|
||||||
|
"\n",
|
||||||
|
"best_models = []\n",
|
||||||
|
"for hp in best_hps:\n",
|
||||||
|
" model = get_best_trained_model(hp)\n",
|
||||||
|
" model.evaluate(x_test, y_test)\n",
|
||||||
|
" best_models.append(model)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"best_models = tuner.get_best_models(top_n)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### The art of crafting the right search space"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### The future of hyperparameter tuning: automated machine learning"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Model ensembling"
|
||||||
|
]
|
||||||
|
},
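{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"A minimal ensembling sketch (not a listing from the book): average the test-time class probabilities of the `best_models` retrained above. A weighted average, with weights tuned on validation data, usually beats a uniform one."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Uses `best_models` and `x_test` from the cells above.\n",
"# Each model returns a (num_samples, 10) array of class probabilities.\n",
"all_preds = [model.predict(x_test) for model in best_models]\n",
"# Uniform average of the predicted probabilities across models.\n",
"ensemble_preds = np.mean(all_preds, axis=0)\n",
"ensemble_labels = ensemble_preds.argmax(axis=1)"
]
},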
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Scaling up model training"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Speeding up training on GPU with Mixed Precision"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Understanding floating-point precision"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import tensorflow as tf\n",
|
||||||
|
"import numpy as np\n",
|
||||||
|
"np_array = np.zeros((2, 2))\n",
|
||||||
|
"tf_tensor = tf.convert_to_tensor(np_array)\n",
|
||||||
|
"tf_tensor.dtype"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"np_array = np.zeros((2, 2))\n",
|
||||||
|
"tf_tensor = tf.convert_to_tensor(np_array, dtype=\"float32\")\n",
|
||||||
|
"tf_tensor.dtype"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Mixed-precision training in practice"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from tensorflow import keras\n",
|
||||||
|
"keras.mixed_precision.set_global_policy(\"mixed_float16\")"
|
||||||
|
]
|
||||||
|
},
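{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"A sketch of a detail worth knowing once the global policy is `mixed_float16`: intermediate layers compute in float16 (while keeping float32 weights), but it is common to force the final softmax layer back to float32 for numerical stability. The layer sizes below are illustrative."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow.keras import layers\n",
"\n",
"inputs = keras.Input(shape=(28 * 28,))\n",
"# Under the \"mixed_float16\" policy, this layer computes in float16.\n",
"x = layers.Dense(256, activation=\"relu\")(inputs)\n",
"# Override the policy for the output layer so the softmax stays in float32.\n",
"outputs = layers.Dense(10, activation=\"softmax\", dtype=\"float32\")(x)\n",
"model = keras.Model(inputs, outputs)"
]
},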
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Multi-GPU training"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Getting your hands on two or more GPUs"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Single-host, multi-device synchronous training"
|
||||||
|
]
|
||||||
|
},
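{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"A sketch of what single-host, multi-device synchronous training looks like with `tf.distribute.MirroredStrategy`. It needs a machine with two or more GPUs, so it isn't runnable on a default Colab instance; `get_compiled_model()`, `train_dataset`, and `val_dataset` are placeholders for your own model factory and `tf.data` pipelines."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"\n",
"# One model replica per available GPU; gradients are averaged across\n",
"# replicas at each step (synchronous data parallelism).\n",
"strategy = tf.distribute.MirroredStrategy()\n",
"print(f\"Number of replicas: {strategy.num_replicas_in_sync}\")\n",
"with strategy.scope():\n",
"    # Model building and compile() must happen inside the strategy scope.\n",
"    model = get_compiled_model()  # placeholder model factory\n",
"model.fit(train_dataset, epochs=10, validation_data=val_dataset)"
]
},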
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### TPU training"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Using a TPU via Google Colab"
|
||||||
|
]
|
||||||
|
},
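{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"A sketch of TPU setup in Colab (select a TPU runtime first). The cluster-resolver calls below are the standard TensorFlow 2 incantation; `get_compiled_model()` is a placeholder."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"\n",
"# Locate the TPU, connect to it, and initialize it.\n",
"tpu = tf.distribute.cluster_resolver.TPUClusterResolver()\n",
"tf.config.experimental_connect_to_cluster(tpu)\n",
"tf.tpu.experimental.initialize_tpu_system(tpu)\n",
"strategy = tf.distribute.TPUStrategy(tpu)\n",
"print(f\"Number of replicas: {strategy.num_replicas_in_sync}\")\n",
"with strategy.scope():\n",
"    model = get_compiled_model()  # placeholder model factory"
]
},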
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Leveraging step fusing to improve TPU utilization"
|
||||||
|
]
|
||||||
|
},
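{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"A sketch of step fusing via the `steps_per_execution` argument of `compile()`: each call into the compiled training function runs several gradient steps back to back, which keeps a fast device like a TPU from idling between steps. The small MNIST-style model below is illustrative."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab_type": "code"
},
"outputs": [],
"source": [
"from tensorflow.keras import layers\n",
"\n",
"model = keras.Sequential([\n",
"    layers.Dense(512, activation=\"relu\"),\n",
"    layers.Dense(10, activation=\"softmax\"),\n",
"])\n",
"model.compile(\n",
"    optimizer=\"rmsprop\",\n",
"    loss=\"sparse_categorical_crossentropy\",\n",
"    metrics=[\"accuracy\"],\n",
"    # Run 8 training steps per tf.function invocation.\n",
"    steps_per_execution=8,\n",
")"
]
},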
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Chapter summary"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"colab": {
|
||||||
|
"collapsed_sections": [],
|
||||||
|
"name": "chapter13_best-practices-for-the-real-world.i",
|
||||||
|
"private_outputs": false,
|
||||||
|
"provenance": [],
|
||||||
|
"toc_visible": true
|
||||||
|
},
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.7.0"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 0
|
||||||
|
}
|
||||||
568
chapter14_conclusions.ipynb
Normal file
@@ -0,0 +1,568 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"# Conclusions"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Key concepts in review"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Various approaches to AI"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### What makes deep learning special within the field of machine learning"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### How to think about deep learning"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Key enabling technologies"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### The universal machine-learning workflow"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Key network architectures"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Densely-connected networks"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from tensorflow import keras\n",
|
||||||
|
"from tensorflow.keras\u00a0import\u00a0layers\n",
|
||||||
|
"inputs = keras.Input(shape=(num_input_features,))\n",
|
||||||
|
"x = layers.Dense(32,\u00a0activation=\"relu\")(inputs)\n",
|
||||||
|
"x = layers.Dense(32,\u00a0activation=\"relu\")(x)\n",
|
||||||
|
"outputs = layers.Dense(1,\u00a0activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\u00a0loss=\"binary_crossentropy\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(num_input_features,))\n",
|
||||||
|
"x = layers.Dense(32,\u00a0activation=\"relu\")(inputs)\n",
|
||||||
|
"x = layers.Dense(32,\u00a0activation=\"relu\")(x)\n",
|
||||||
|
"outputs = layers.Dense(num_classes,\u00a0activation=\"softmax\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\u00a0loss=\"categorical_crossentropy\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(num_input_features,))\n",
|
||||||
|
"x = layers.Dense(32,\u00a0activation=\"relu\")(inputs)\n",
|
||||||
|
"x = layers.Dense(32,\u00a0activation=\"relu\")(x)\n",
|
||||||
|
"outputs = layers.Dense(num_classes,\u00a0activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\u00a0loss=\"binary_crossentropy\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(num_input_features,))\n",
|
||||||
|
"x = layers.Dense(32,\u00a0activation=\"relu\")(inputs)\n",
|
||||||
|
"x = layers.Dense(32,\u00a0activation=\"relu\")(x)\n",
|
||||||
|
"outputs layers.Dense(num_values)(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\u00a0loss=\"mse\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Convnets"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(height,\u00a0width,\u00a0channels))\n",
|
||||||
|
"x = layers.SeparableConv2D(32,\u00a03,\u00a0activation=\"relu\")(inputs)\n",
|
||||||
|
"x = layers.SeparableConv2D(64,\u00a03,\u00a0activation=\"relu\")(x)\n",
|
||||||
|
"x = layers.MaxPooling2D(2)(x)\n",
|
||||||
|
"x = layers.SeparableConv2D(64,\u00a03,\u00a0activation=\"relu\")(x)\n",
|
||||||
|
"x = layers.SeparableConv2D(128,\u00a03,\u00a0activation=\"relu\")(x)\n",
|
||||||
|
"x = layers.MaxPooling2D(2)(x)\n",
|
||||||
|
"x = layers.SeparableConv2D(64,\u00a03,\u00a0activation=\"relu\")(x)\n",
|
||||||
|
"x = layers.SeparableConv2D(128,\u00a03,\u00a0activation=\"relu\")(x)\n",
|
||||||
|
"x = layers.GlobalAveragePooling2D()(x)\n",
|
||||||
|
"x = layers.Dense(32,\u00a0activation=\"relu\")(x)\n",
|
||||||
|
"outputs = layers.Dense(num_classes,\u00a0activation=\"softmax\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\u00a0loss=\"categorical_crossentropy\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### RNNs"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(num_timesteps,\u00a0num_features))\n",
|
||||||
|
"x = layers.LSTM(32)(inputs)\n",
|
||||||
|
"outputs = layers.Dense(num_classes,\u00a0activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\u00a0loss=\"binary_crossentropy\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(num_timesteps,\u00a0num_features))\n",
|
||||||
|
"x = layers.LSTM(32,\u00a0return_sequences=True)(inputs)\n",
|
||||||
|
"x = layers.LSTM(32,\u00a0return_sequences=True)(x)\n",
|
||||||
|
"x = layers.LSTM(32)(x)\n",
|
||||||
|
"outputs = layers.Dense(num_classes,\u00a0activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\",\u00a0loss=\"binary_crossentropy\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Transformers"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"encoder_inputs = keras.Input(shape=(sequence_length,), dtype=\"int64\")\n",
|
||||||
|
"x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)\n",
|
||||||
|
"encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n",
|
||||||
|
"decoder_inputs = keras.Input(shape=(None,), dtype=\"int64\")\n",
|
||||||
|
"x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)\n",
|
||||||
|
"x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)\n",
|
||||||
|
"decoder_outputs = layers.Dense(vocab_size, activation=\"softmax\")(x)\n",
|
||||||
|
"transformer = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)\n",
|
||||||
|
"transformer.compile(optimizer=\"rmsprop\", loss=\"categorical_crossentropy\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 0,
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "code"
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inputs = keras.Input(shape=(sequence_length,), dtype=\"int64\")\n",
|
||||||
|
"x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)\n",
|
||||||
|
"x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n",
|
||||||
|
"x = layers.GlobalMaxPooling1D()(x)\n",
|
||||||
|
"outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n",
|
||||||
|
"model = keras.Model(inputs, outputs)\n",
|
||||||
|
"model.compile(optimizer=\"rmsprop\", loss=\"binary_crossentropy\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### The space of possibilities"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## The limitations of deep learning"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### The risk of anthropomorphizing machine-learning models"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Automatons vs. intelligent agents"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Local generalization vs. extreme generalization"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### The purpose of intelligence"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Climbing the spectrum of generalization"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Setting the course towards greater generality in AI"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### On the importance of setting the right objective: the shortcut rule"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### A new target"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Implementing intelligence: the missing ingredients"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Intelligence as sensitivity to abstract analogies"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### The two poles of abstraction"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Value-centric analogy"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Program-centric analogy"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Cognition as a combination of both kinds of abstraction"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### The missing half of the picture"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## The future of deep learning"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Models as programs"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Blending together deep learning and program synthesis"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Integrating deep learning modules and algorithmic modules into hybrid systems"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"#### Using deep learning to guide program search"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Lifelong learning and modular subroutine reuse"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### The long-term vision"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Staying up to date in a fast-moving field"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Practice on real-world problems using Kaggle"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Read about the latest developments on arXiv"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### Explore the Keras ecosystem"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"colab_type": "text"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Final words"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"colab": {
|
||||||
|
"collapsed_sections": [],
|
||||||
|
"name": "chapter14_conclusions.i",
|
||||||
|
"private_outputs": false,
|
||||||
|
"provenance": [],
|
||||||
|
"toc_visible": true
|
||||||
|
},
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.7.0"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 0
|
||||||
|
}