ckiias/lec3-3-transfer-learning.ipynb

Initializing Keras

To speed up training on a GPU, the backend should be configured for the specific OS and GPU model.

To accelerate PyTorch on Windows with a recent NVIDIA card, install the following instead of the regular PyTorch packages:

torch = { version = "^2.7.0+cu128", source = "pytorch-cuda128" }
torchaudio = { version = "^2.7.0+cu128", source = "pytorch-cuda128" }
torchvision = { version = "^0.22.0+cu128", source = "pytorch-cuda128" }

Be sure to enable the corresponding package repository:

[[tool.poetry.source]]
name = "pytorch-cuda128"
url = "https://download.pytorch.org/whl/cu128"
priority = "explicit"

On macOS you can use jax 0.5.0 (this exact version is required) + jax-metal 0.1.1.
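A minimal sketch of the matching Poetry entries for the macOS/Metal setup (the two version pins come from the note above; the Keras pin is an assumption about your pyproject.toml):

keras = "^3.9"
jax = "0.5.0"
jax-metal = "0.1.1"

With these installed, set KERAS_BACKEND = "jax" instead of "torch" in the cell below.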

In [1]:
import os

# The backend must be selected before Keras is first imported
os.environ["KERAS_BACKEND"] = "torch"
import keras

print(keras.__version__)
3.9.2
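
To confirm the backend selection and GPU visibility, a quick check can help (a sketch; assumes the CUDA build of torch installed above):

import torch

print(keras.backend.backend())    # expected: "torch"
print(torch.cuda.is_available())  # True if the cu128 build found an NVIDIA GPU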

Loading the dataset for the classification task

This example uses a portion of the Cats and Dogs Classification Dataset.

The dataset has two classes (24,998 images in total): cats (12,499 images) and dogs (12,499 images).

Link: https://www.kaggle.com/datasets/bhavikjikadara/dog-and-cat-classification-dataset

In [2]:
import kagglehub
import os

# Download the dataset from Kaggle and point `path` at the image root
path = kagglehub.dataset_download("bhavikjikadara/dog-and-cat-classification-dataset")
path = os.path.join(path, "PetImages")
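
A quick sanity check of the downloaded layout (a sketch; the Cat and Dog subfolders match the class names discovered below):

for class_name in sorted(os.listdir(path)):
    class_dir = os.path.join(path, class_name)
    if os.path.isdir(class_dir):
        print(class_name, len(os.listdir(class_dir)))  # e.g. Cat 12499, Dog 12499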

Building the training and validation sets

The deprecated ImageDataGenerator class is used to build the datasets.

The recommended replacement is image_dataset_from_directory (https://keras.io/api/data_loading/image/); a sketch of it appears after the next cell.

Using image_dataset_from_directory requires TensorFlow.

ImageDataGenerator produces two splits: training and validation (80/20).

In both splits, images are rescaled to 224×224 pixels in RGB color space.

Images are loaded from disk on the fly while the model trains and validates.

In [3]:
from keras.src.legacy.preprocessing.image import ImageDataGenerator

batch_size = 32

# Hold out 20% of the images for validation
data_loader = ImageDataGenerator(validation_split=0.2)

train = data_loader.flow_from_directory(
    directory=path,
    target_size=(224, 224),
    color_mode="rgb",
    class_mode="binary",
    batch_size=batch_size,
    shuffle=True,
    seed=9,
    subset="training",
)

valid = data_loader.flow_from_directory(
    directory=path,
    target_size=(224, 224),
    color_mode="rgb",
    class_mode="binary",
    batch_size=batch_size,
    shuffle=True,
    seed=9,
    subset="validation",
)

train.class_indices
Found 20000 images belonging to 2 classes.
Found 4998 images belonging to 2 classes.
Out[3]:
{'Cat': 0, 'Dog': 1}
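
For reference, a minimal sketch of the same 80/20 split with the recommended loader (not executed here; requires TensorFlow, and subset="both" returns both datasets at once):

from keras.utils import image_dataset_from_directory

train_ds, valid_ds = image_dataset_from_directory(
    path,
    label_mode="binary",
    color_mode="rgb",
    image_size=(224, 224),
    batch_size=batch_size,
    shuffle=True,
    seed=9,
    validation_split=0.2,
    subset="both",
)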

Transfer learning example using the pretrained VGG19 model

Loading the pretrained VGG19 model:

  • Load the weights obtained by training the model on the ImageNet dataset
  • Drop the fully connected layers (include_top=False) to adapt the model to the new task
  • The model will work with 224×224-pixel RGB images
In [4]:
from keras.api.applications.vgg19 import VGG19

vgg19 = VGG19(include_top=False, weights="imagenet", input_shape=(224, 224, 3), pooling=None)

# Freeze the convolutional base: its ImageNet weights will not be updated
vgg19.trainable = False
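
Note that VGG19's ImageNet weights were trained with specific input preprocessing (RGB→BGR channel flip plus per-channel mean subtraction). This notebook feeds raw [0, 255] RGB pixels and still reaches good accuracy, but the canonical preprocessing is available if needed (a sketch with a stand-in batch):

import numpy as np
from keras.api.applications.vgg19 import preprocess_input

batch = np.random.uniform(0, 255, size=(2, 224, 224, 3)).astype("float32")  # stand-in images
print(preprocess_input(batch).shape)  # (2, 224, 224, 3), converted to BGR and mean-centered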

Designing the network architecture on top of the pretrained model

In [5]:
from keras.api.models import Sequential
from keras.api.layers import Dropout, Flatten, Dense

tl_model = Sequential()
tl_model.add(vgg19)

# Add new layers on top (only these will be trained for the current task)
tl_model.add(Flatten(name="flattened"))
tl_model.add(Dropout(0.5, name="dropout"))
tl_model.add(Dense(1, activation="sigmoid", name="predictions"))

tl_model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ vgg19 (Functional)              │ (None, 7, 7, 512)      │    20,024,384 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flattened (Flatten)             │ (None, 25088)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 25088)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ predictions (Dense)             │ (None, 1)              │        25,089 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 20,049,473 (76.48 MB)
 Trainable params: 25,089 (98.00 KB)
 Non-trainable params: 20,024,384 (76.39 MB)
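
The 25,089 trainable parameters are exactly the new head: the Flatten output has 7·7·512 = 25,088 features, plus one bias in the sigmoid Dense layer. If more capacity were needed, a common follow-up is to unfreeze the top VGG19 block and fine-tune it with a small learning rate (a sketch, not part of this run):

# Hypothetical fine-tuning step after the new head has converged
vgg19.trainable = True
for layer in vgg19.layers:
    # keep everything except the last convolutional block frozen
    layer.trainable = layer.name.startswith("block5")

tl_model.compile(
    loss="binary_crossentropy",
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),  # small LR to preserve pretrained features
    metrics=["accuracy"],
)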

Training the deep model

Training was stopped after the second epoch, since the model quality was already acceptable.
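
Instead of interrupting the run by hand (as happens below), an EarlyStopping callback could end training automatically; a sketch under the same compile settings:

from keras.api.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor="val_accuracy", patience=1, restore_best_weights=True)
# tl_model.fit(x=train, validation_data=valid, epochs=5, callbacks=[early_stop])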

In [6]:
tl_model.compile(
    loss="binary_crossentropy",
    optimizer="adam",
    metrics=["accuracy"],
)

tl_model.fit(x=train, validation_data=valid, epochs=5)
d:\Projects\Python\mai\.venv\Lib\site-packages\keras\src\trainers\data_adapters\py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
Epoch 1/5
 15/625 ━━━━━━━━━━━━━━━━━━━━ 19:08 2s/step - accuracy: 0.6655 - loss: 3.8879
d:\Projects\Python\mai\.venv\Lib\site-packages\PIL\TiffImagePlugin.py:900: UserWarning: Truncated File Read
  warnings.warn(str(msg))
625/625 ━━━━━━━━━━━━━━━━━━━━ 1449s 2s/step - accuracy: 0.9067 - loss: 1.1503 - val_accuracy: 0.9602 - val_loss: 0.5733
Epoch 2/5
625/625 ━━━━━━━━━━━━━━━━━━━━ 1453s 2s/step - accuracy: 0.9551 - loss: 0.6339 - val_accuracy: 0.9638 - val_loss: 0.5131
Epoch 3/5
  2/625 ━━━━━━━━━━━━━━━━━━━━ 19:54 2s/step - accuracy: 0.9297 - loss: 0.6612
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In[6], line 7
----> 7 tl_model.fit(x=train, validation_data=valid, epochs=5)

[Keras/torch trainer stack frames elided]

KeyboardInterrupt: 

Evaluating model quality

Model accuracy on the validation set is about 96.4%.

In [7]:
tl_model.evaluate(valid)
157/157 ━━━━━━━━━━━━━━━━━━━━ 303s 2s/step - accuracy: 0.9664 - loss: 0.4611
Out[7]:
[0.5092828273773193, 0.9639856219291687]
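
evaluate returns the metrics in compile order, [loss, accuracy]; unpacking them makes the report explicit (a minimal sketch):

val_loss, val_acc = tl_model.evaluate(valid, verbose=0)
print(f"validation accuracy: {val_acc:.1%}")  # ≈ 96.4%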

Using the trained model

Random images from the Internet are used as examples.

In [8]:
import mahotas as mh
from matplotlib import pyplot as plt

# Read the test images from disk and display them
cat = mh.imread("data/-cat.jpg")
plt.imshow(cat)
plt.show()

dog = mh.imread("data/-dog.jpg")
plt.imshow(dog)
plt.show()
In [9]:
# Rescale both images to the 224×224 input size the model expects
resized_cat = mh.resize.resize_rgb_to(cat, (224, 224))

resized_dog = mh.resize.resize_rgb_to(dog, (224, 224))
resized_dog.shape
Out[9]:
(224, 224, 3)
In [10]:
results = [
    # Sigmoid output > 0.5 → class 1 ("Dog"), otherwise class 0 ("Cat")
    1 if tl_model.predict(item.reshape(1, 224, 224, 3).astype("float32")) > 0.5 else 0
    for item in [resized_cat, resized_dog]
]

for result in results:
    # Reverse lookup: map the predicted index back to its class name
    display(result, list(valid.class_indices.keys())[list(valid.class_indices.values()).index(result)])
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 194ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 30ms/step
0
'Cat'
1
'Dog'
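
The reverse lookup in the loop above can be written more directly by inverting class_indices once (a small sketch using only names already defined in this notebook):

index_to_class = {v: k for k, v in valid.class_indices.items()}  # {0: 'Cat', 1: 'Dog'}
for result in results:
    print(result, index_to_class[result])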