388 lines
30 KiB
Plaintext
388 lines
30 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "c2308ffe",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Инициализация Keras\n",
|
||
"\n",
|
||
"torch был заменен на jax, так как с torch рекуррентные сети не работали"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"id": "507915ea",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"3.9.2\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"import os\n",
|
||
"\n",
|
||
"os.environ[\"KERAS_BACKEND\"] = \"jax\"\n",
|
||
"import keras\n",
|
||
"\n",
|
||
"print(keras.__version__)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "8e4a9a71",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Загрузка данных для классификации с помощью глубоких сетей\n",
|
||
"\n",
|
||
"В качестве набора данных используется набор отзывов к фильмам с сайта IMDB.\n",
|
||
"\n",
|
||
"Набор включает 50 000 отзывов, половина из которых находится в обучающем наборе данных (x_train), а половина - в тестовом (x_valid). \n",
|
||
"\n",
|
||
"Метки (y_train и y_valid) имеют бинарный характер и назначены в соответствии с этими 10-балльными оценками:\n",
|
||
"- отзывы с четырьмя звездами или меньше считаются отрицательным (y = 0);\n",
|
||
"- отзывы с семью звездами или больше считаются положительными (y = 1);\n",
|
||
"- умеренные отзывы — с пятью или шестью звездами — не включались в набор данных, что упрощает задачу бинарной классификации.\n",
|
||
"\n",
|
||
"Данные уже предобработаны для простоты работы с ними.\n",
|
||
"\n",
|
||
"unique_words - в векторное пространство включается только слова, которые встречаются в корпусе не менее 10 000 раз.\n",
|
||
"\n",
|
||
"max_length - максимальная длина отзыва (если больше, то обрезается, если меньше, то дополняется \"пустыми\" словами)."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"id": "e0043e5c",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"from keras.api.datasets import imdb\n",
|
||
"import os\n",
|
||
"\n",
|
||
"unique_words = 10000\n",
|
||
"max_length = 100\n",
|
||
"\n",
|
||
"output_dir = \"tmp\"\n",
|
||
"if not os.path.exists(output_dir):\n",
|
||
" os.makedirs(output_dir)\n",
|
||
"\n",
|
||
"(X_train, y_train), (X_valid, y_valid) = imdb.load_data(num_words=unique_words)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "c58423e9",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Приведение отзывов к длине max_length (100)\n",
|
||
"\n",
|
||
"padding и truncating - дополнение и обрезка отзывов начинается с начала (учитывается специфика затухания градиента в рекуррентных сетях)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"id": "131e125a",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"from keras.api.preprocessing.sequence import pad_sequences\n",
|
||
"\n",
|
||
"X_train = pad_sequences(X_train, maxlen=max_length, padding=\"pre\", truncating=\"pre\", value=0)\n",
|
||
"X_valid = pad_sequences(X_valid, maxlen=max_length, padding=\"pre\", truncating=\"pre\", value=0)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7db364f4",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Формирование архитектуры глубокой рекуррентной двунаправленной LSTM сети\n",
|
||
"\n",
|
||
"\n",
|
||
"Первый слой (Embedding) выполняет векторизацию"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"id": "1e3fb0ec",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">Model: \"sequential\"</span>\n",
|
||
"</pre>\n"
|
||
],
|
||
"text/plain": [
|
||
"\u001b[1mModel: \"sequential\"\u001b[0m\n"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
},
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n",
|
||
"┃<span style=\"font-weight: bold\"> Layer (type) </span>┃<span style=\"font-weight: bold\"> Output Shape </span>┃<span style=\"font-weight: bold\"> Param # </span>┃\n",
|
||
"┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n",
|
||
"│ embedding (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Embedding</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">100</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">64</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">640,000</span> │\n",
|
||
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
|
||
"│ spatial_dropout1d │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">100</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">64</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │\n",
|
||
"│ (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">SpatialDropout1D</span>) │ │ │\n",
|
||
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
|
||
"│ bidirectional (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Bidirectional</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">512</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">657,408</span> │\n",
|
||
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
|
||
"│ dense (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dense</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">1</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">513</span> │\n",
|
||
"└─────────────────────────────────┴────────────────────────┴───────────────┘\n",
|
||
"</pre>\n"
|
||
],
|
||
"text/plain": [
|
||
"┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n",
|
||
"┃\u001b[1m \u001b[0m\u001b[1mLayer (type) \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mOutput Shape \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m Param #\u001b[0m\u001b[1m \u001b[0m┃\n",
|
||
"┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n",
|
||
"│ embedding (\u001b[38;5;33mEmbedding\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m100\u001b[0m, \u001b[38;5;34m64\u001b[0m) │ \u001b[38;5;34m640,000\u001b[0m │\n",
|
||
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
|
||
"│ spatial_dropout1d │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m100\u001b[0m, \u001b[38;5;34m64\u001b[0m) │ \u001b[38;5;34m0\u001b[0m │\n",
|
||
"│ (\u001b[38;5;33mSpatialDropout1D\u001b[0m) │ │ │\n",
|
||
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
|
||
"│ bidirectional (\u001b[38;5;33mBidirectional\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m512\u001b[0m) │ \u001b[38;5;34m657,408\u001b[0m │\n",
|
||
"├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
|
||
"│ dense (\u001b[38;5;33mDense\u001b[0m) │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m1\u001b[0m) │ \u001b[38;5;34m513\u001b[0m │\n",
|
||
"└─────────────────────────────────┴────────────────────────┴───────────────┘\n"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
},
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Total params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">1,297,921</span> (4.95 MB)\n",
|
||
"</pre>\n"
|
||
],
|
||
"text/plain": [
|
||
"\u001b[1m Total params: \u001b[0m\u001b[38;5;34m1,297,921\u001b[0m (4.95 MB)\n"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
},
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">1,297,921</span> (4.95 MB)\n",
|
||
"</pre>\n"
|
||
],
|
||
"text/plain": [
|
||
"\u001b[1m Trainable params: \u001b[0m\u001b[38;5;34m1,297,921\u001b[0m (4.95 MB)\n"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
},
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Non-trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (0.00 B)\n",
|
||
"</pre>\n"
|
||
],
|
||
"text/plain": [
|
||
"\u001b[1m Non-trainable params: \u001b[0m\u001b[38;5;34m0\u001b[0m (0.00 B)\n"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"from keras.api.models import Sequential\n",
|
||
"from keras.api.layers import InputLayer, Embedding, SpatialDropout1D, LSTM, Bidirectional, Dense\n",
|
||
"\n",
|
||
"blstm_model = Sequential()\n",
|
||
"blstm_model.add(InputLayer(shape=(max_length,), dtype=\"float32\"))\n",
|
||
"blstm_model.add(Embedding(unique_words, 64))\n",
|
||
"blstm_model.add(SpatialDropout1D(0.2))\n",
|
||
"blstm_model.add(Bidirectional(LSTM(256, dropout=0.2)))\n",
|
||
"blstm_model.add(Dense(1, activation=\"sigmoid\"))\n",
|
||
"\n",
|
||
"blstm_model.summary()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "3a826105",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Обучение модели\n",
|
||
"\n",
|
||
"Веса модели сохраняются в каталог tmp после каждой эпохи обучения с помощью callback-параметра\n",
|
||
"\n",
|
||
"В дальнейшем веса можно загрузить"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"id": "11236198",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Epoch 1/6\n",
|
||
"\u001b[1m196/196\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m134s\u001b[0m 682ms/step - accuracy: 0.6565 - loss: 0.6039 - val_accuracy: 0.8432 - val_loss: 0.3756\n",
|
||
"Epoch 2/6\n",
|
||
"\u001b[1m196/196\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m166s\u001b[0m 848ms/step - accuracy: 0.8841 - loss: 0.2820 - val_accuracy: 0.8425 - val_loss: 0.3577\n",
|
||
"Epoch 3/6\n",
|
||
"\u001b[1m196/196\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m176s\u001b[0m 902ms/step - accuracy: 0.9148 - loss: 0.2238 - val_accuracy: 0.8459 - val_loss: 0.3929\n",
|
||
"Epoch 4/6\n",
|
||
"\u001b[1m196/196\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m171s\u001b[0m 875ms/step - accuracy: 0.9375 - loss: 0.1744 - val_accuracy: 0.8434 - val_loss: 0.3572\n",
|
||
"Epoch 5/6\n",
|
||
"\u001b[1m196/196\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m155s\u001b[0m 790ms/step - accuracy: 0.9466 - loss: 0.1520 - val_accuracy: 0.8385 - val_loss: 0.4029\n",
|
||
"Epoch 6/6\n",
|
||
"\u001b[1m196/196\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m158s\u001b[0m 807ms/step - accuracy: 0.9584 - loss: 0.1172 - val_accuracy: 0.8337 - val_loss: 0.4419\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"<keras.src.callbacks.history.History at 0x3455e57f0>"
|
||
]
|
||
},
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"from keras.api.callbacks import ModelCheckpoint\n",
|
||
"\n",
|
||
"blstm_model.compile(\n",
|
||
" loss=\"binary_crossentropy\",\n",
|
||
" optimizer=\"adam\",\n",
|
||
" metrics=[\"accuracy\"],\n",
|
||
")\n",
|
||
"\n",
|
||
"blstm_model.fit(\n",
|
||
" X_train,\n",
|
||
" y_train,\n",
|
||
" batch_size=128,\n",
|
||
" epochs=6,\n",
|
||
" validation_data=(X_valid, y_valid),\n",
|
||
" callbacks=[ModelCheckpoint(filepath=output_dir + \"/blstm_weights.{epoch:02d}.keras\")],\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a47a8ff6",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Загрузка лучшей модели и оценка ее качества\n",
|
||
"\n",
|
||
"Качество модели - 84.6 %."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"id": "94987771",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"\u001b[1m782/782\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m86s\u001b[0m 110ms/step - accuracy: 0.8449 - loss: 0.3976\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"[0.3929494023323059, 0.8458799719810486]"
|
||
]
|
||
},
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"blstm_model.load_weights(output_dir + \"/blstm_weights.03.keras\")\n",
|
||
"blstm_model.evaluate(X_valid, y_valid)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7001f712",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Визуализация распределения вероятностей результатов модели на валидационной выборке"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"id": "8965a612",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"\u001b[1m782/782\u001b[0m \u001b[32m━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[37m\u001b[0m \u001b[1m85s\u001b[0m 108ms/step\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjAAAAGdCAYAAAAMm0nCAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAh20lEQVR4nO3de3BU5f3H8U8u7CZgNuHS3GqEiFUuogjUuCJ4yxAlWqhYZYhIFcFLsIXMgFAxKCjBiIggQkERnAYBO2KRYCANBUYIlwZSETBqAaGlG3QgWUDJ9fz+4McpK0HZNMnyhPdrZmfCOc8u331E9+3JLgmyLMsSAACAQYIDPQAAAIC/CBgAAGAcAgYAABiHgAEAAMYhYAAAgHEIGAAAYBwCBgAAGIeAAQAAxgkN9ACNpba2VocPH1ZERISCgoICPQ4AALgAlmXp+PHjio+PV3Dw+a+zNNuAOXz4sBISEgI9BgAAqIdDhw7p8ssvP+/5ZhswERERkk5vgMvlCvA0ABpM9Unpg/jTX993WAptFdh5ADQor9erhIQE+3X8fJptwJz5tpHL5SJggOakOkRq+f9fu1wEDNBM/dTbP3gTLwAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjBMa6AFM1GF8bqBH8NuBaamBHgEAgAZDwAAAEGD8j7H/+BYSAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMI5fAVNTU6PnnntOiYmJCg8PV8eOHTVlyhRZlmWvsSxLmZmZiouLU3h4uJKTk/Xll1/6PM7Ro0eVlpYml8ulqKgoDR8+XCdOnPBZ8+mnn6pPnz4KCwtTQkKCsrOz/4enCQAAmhO/Aubll1/W3Llz9cYbb2jv3r16+eWXlZ2drdmzZ9trsrOzNWvWLM2bN09bt25Vq1atlJKSolOnTtlr0tLStHv3buXn52vVqlXauHGjRo4caZ/3er3q16+f2rdvr6KiIr3yyit6/vnnNX/+/AZ4ygAAwHSh/izevHmzBgwYoNTUVElShw4d9N5772nbtm2STl99mTlzpiZOnKgBAwZIkt59913FxMToww8/1ODBg7V3717l5eVp+/bt6tWrlyRp9uzZ6t+/v6ZPn674+Hjl5OSosrJSCxculMPhUNeuXVVcXKwZM2b4hA4AALg0+XUF5uabb1ZBQYG++OILSdI//vEPffLJJ7r77rslSfv375fH41FycrJ9n8jISCUlJamwsFCSVFhYqKioKDteJCk5OVnBwcHaunWrvaZv375yOBz2mpSUFJWUlOjYsWN1zlZRUSGv1+tzAwAAzZNfV2DGjx8vr9erTp06KSQkRDU1NXrppZeUlpYmSfJ4PJKkmJgYn/vFxMTY5zwej6Kjo32HCA1VmzZtfNYkJiae8xhnzrVu3fqc2bKysvTCCy/483QAAICh/LoCs3z5cuXk5GjJkiXasWOHFi9erOnTp2vx4sWNNd8FmzBhgsrLy+3boUOHAj0SAABoJH5dgRk7dqzGjx+vwYMHS5K6deumr7/+WllZWRo2bJhiY2MlSaWlpYqLi7PvV1paqu7du0uSYmNjdeTIEZ/Hra6u1tGjR+37x8bGqrS01GfNmV+fWfNDTqdTTqfTn6cDAAAM5dcVmO+++07Bwb53CQkJUW1trSQpMTFRsbGxKigosM97vV5t3bpVbrdbkuR2u1VWVqaioiJ7zbp161RbW6ukpCR7zcaNG1VVVWWvyc/P1zXXXFPnt48AAMClxa+Auffee/XSSy8pNzdXBw4c0IoVKzRjxgz9+te/liQFBQVp9OjRevHFF7Vy5Urt2rVLDz/8sOLj4zVw4EBJUufOnXXXXXdpxIgR2rZtmzZt2qRRo0Zp8ODBio+PlyQNGTJEDodDw4cP1+7du7Vs2TK9/vrrysjIaNhnDwAAjOTXt5Bmz56t5557Tk899ZSOHDmi+Ph4Pf7448rMzLTXjBs3TidPntTIkSNVVlamW265RXl5eQoLC7PX5OTkaNSoUbrzzjsVHBysQYMGadasWfb5yMhIrV27Vunp6erZs6fatWunzMxMPkINAAAkSUHW2X+NbjPi9XoVGRmp8vJyuVyuBn3sDuNzG/TxmsKBaamBHgFoGNUnpeWXnf76gRNSaKvAzgM0AF5X/utCX7/5WUgAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOP4HTD//ve/9dBDD6lt27YKDw9Xt27d9Pe//90+b1mWMjMzFRcXp/DwcCUnJ+vLL7/0eYyjR48qLS1NLpdLUVFRGj58uE6cOOGz5tNPP1WfPn0UFhamhIQEZWdn1/MpAgCA5savgDl27Jh69+6tFi1a6OOPP9aePXv06quvqnXr1vaa7OxszZo1S/PmzdPWrVvVqlUrpaSk6NSpU/aatLQ07d69W/n5+Vq1apU2btyokSNH2ue9Xq/69eun9u3bq6ioSK+88oqef/55zZ8/vwGeMgAAMF2oP4tffvllJSQk6J133rGPJSYm2l9blqWZM2dq4sSJGjBggCTp3XffVUxMjD788EMNHjxYe/fuVV5enrZv365evXpJkmbPnq3+/ftr+vTpio+PV05OjiorK7Vw4UI5HA517dpVxcXFmjFjhk/oAACAS5NfV2BWrlypXr166Te/+Y2io6N1ww03aMGCBfb5/fv3y+PxKDk52T4WGRmppKQkFRYWSpIKCwsVFRVlx4skJScnKzg4WFu3brXX9O3bVw6Hw16TkpKikpISHTt2rM7ZKioq5PV6fW4AAKB58itg9u3bp7lz5+oXv/iF1qxZoyeffFK/+93vtHjxYkmSx+ORJMXExPjcLyYmxj7n8XgUHR3tcz40NFRt2rTxWVPXY5z9e/xQVlaWIiMj7VtCQoI/Tw0AABjEr4Cpra1Vjx49NHXqVN1www0aOXKkRowYoXnz5jXWfBdswoQJKi8vt2+HDh0K9EgAAKCR+BUwcXFx6tKli8+xzp076+DBg5Kk2NhYSVJpaanPmtLSUvtcbGysjhw54nO+urpaR48e9VlT12Oc/Xv8kNPplMvl8rkBAIDmya+A6d27t0pKSnyOffHFF2rfvr2k02/ojY2NVUFBgX3e6/Vq69atcrvdkiS3262ysjIVFRXZa9atW6fa2lolJSXZazZu3Kiqqip7TX5+vq655hqfTzwBAIBLk18BM2bMGG3ZskVTp07VV199pSVLlmj+/PlKT0+XJAUFBWn06NF68cUXtXLlSu3atUsPP/yw4uPjNXDgQEmnr9jcddddGjFihLZt26ZNmzZp1KhRGjx4sOLj4yVJQ4YMkcPh0PDhw7V7924tW7ZMr7/+ujIyMhr22QMAACP59THqX/7yl1qxYoUmTJigyZMnKzExUTNnzlRaWpq9Zty4cTp58qRGjhypsrIy3XLLLcrLy1NYWJi9JicnR6NGjdKdd96p4OBgDRo0SLNmzbLPR0ZGau3atUpPT1fPnj3Vrl07ZWZm8hFqAAAgSQqyLMsK9BCNwev1KjIyUuXl5Q3+fpgO43Mb9PGawoFpqYEeAWgY1Sel5Zed/vqBE1Joq8DOAzQAXlf+60Jfv/lZSAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOP9TwEybNk1BQUEaPXq0fezUqVNKT09X27Ztddlll2nQoEEqLS31ud/BgweVmpqqli1bKjo6WmPHjlV1dbXPmvXr16tHjx5yOp266qqrtGjRov9lVAAA0IzUO2C2b9+uP/7xj7ruuut8jo8ZM0YfffSR3n//fW3YsEGHDx/WfffdZ5+vqalRamqqKisrtXnzZi1evFiLFi1SZmamvWb//v1KTU3V7bffruLiYo0ePVqPPfaY1qxZU99xAQBAM1KvgDlx4oTS0tK0YMECtW7d2j5eXl6ut99+WzNmzNAdd9yhnj176p133tHmzZu1ZcsWSdLatWu1Z88e/elPf1L37t119913a8qUKZozZ44qKyslSfPmzVNiYqJeffVVde7cWaNGjdL999+v1157rQGeMgAAMF29AiY9PV2pqalKTk72OV5UVKSqqiqf4506ddIVV1yhwsJCSVJhYaG6deummJgYe01KSoq8Xq92795tr/nhY6ekpNiPUZeKigp5vV6fGwAAaJ5C/b3D0qVLtWPHDm3fvv2ccx6PRw6HQ1FRUT7HY2Ji5PF47DVnx8uZ82fO/dgar9er77//XuHh4ef83llZWXrhhRf8fToAAMBAfl2BOXTokH7/+98rJydHYWFhjTVTvUyYMEHl5eX27dChQ4EeCQAANBK/AqaoqEhHjhxRjx49FBoaqtDQUG3YsEGzZs1SaGioYmJiVFlZqbKyMp/7lZaWKjY2VpIUGxt7zqeSzvz6p9a4XK46r75IktPplMvl8rkBAIDmya+AufPOO7Vr1y4VFxfbt169eiktLc3+ukWLFiooKLDvU1JSooMHD8rtdkuS3G63du3apSNHjthr8vPz5XK51KVLF3vN2Y9xZs2ZxwAAAJc2v94DExERoWuvvdbnWKtWrdS2bVv7+PDhw5WRkaE2bdrI5XLp6aefltvt1k033SRJ6tevn7p06aKhQ4cqOztbHo9HEydOVHp6upxOpyTpiSee0BtvvKFx48bp0Ucf1bp167R8+XLl5uY2xHMGAACG8/tNvD/ltddeU3BwsAYNGqSKigqlpKTozTfftM+HhIRo1apVevLJJ+V2u9WqVSsNGzZMkydPttckJiYqNzdXY8aM0euvv67LL79cb731llJSUhp6XAAAYKAgy7KsQA/RGLxeryIjI1VeXt7g74fpMN68K0EHpqUGegSgYVSflJZfdvrrB05Ioa0COw/QAHhd+a8Lff3mZyEBAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIzjV8BkZWXpl7/8pSIiIhQdHa2BAweqpKTEZ82pU6eUnp6utm3b6rLLLtOgQYNUWlrqs+bgwYNKTU1Vy5YtFR0drbFjx6q6utpnzfr169WjRw85nU5dddVVWrRoUf2eIQAAaHb8CpgNGzYoPT1dW7ZsUX5+vqqqqtSvXz+dPHnSXjNmzBh99NFHev/997VhwwYdPnxY9913n32+pqZGqampqqys1ObNm7V48WItWrRImZmZ9pr9+/crNTVVt99+u4qLizV69Gg99thjWrNmTQM8ZQAAYLogy7Ks+t75m2++UXR0tDZs2KC+ffuqvLxcP/vZz7RkyRLdf//9kqTPP/9cnTt3VmFhoW666SZ9/PHHuueee3T48GHFxMRIkubNm6dnnnlG33zzjRwOh5555hnl5ubqs88+s3+vwYMHq6ysTHl5eRc0m9frVWRkpMrLy+Vyuer7FOvUYXxugz5eUzgwLTXQIwANo/qktPyy018/cEIKbRXYeYAGwOvKf13o6/f/9B6Y8vJySVKbNm0kSUVFRaqqqlJycrK9plOnTrriiitUWFgoSSosLFS3bt3seJGklJQUeb1e7d69215z9mOcWXPmMepSUVEhr9frcwMAAM1TvQOmtrZWo0ePVu/evXXttddKkjwejxwOh6KionzWxsTEyOPx2GvOjpcz58+c+7E1Xq9X33//fZ3zZGVlKTIy0r4lJCTU96kBAICLXL0DJj09XZ999pmWLl3akPPU24QJE1ReXm7fDh06FOiRAABAIwmtz51GjRqlVatWaePGjbr88svt47GxsaqsrFRZWZnPVZjS0lLFxsbaa7Zt2+bzeGc+pXT2mh9+cqm0tFQul0vh4eF1zuR0OuV0OuvzdAAAgGH8ugJjWZZGjRqlFStWaN26dUpMTPQ537NnT7Vo0UIFBQX2sZKSEh08eFBut1uS5Ha7tWvXLh05csRek5+fL5fLpS5duthrzn6MM2vOPAYAALi0+XUFJj09XUuWLNFf/vIXRURE2O9ZiYyMVHh4uCIjIzV8+HBlZGSoTZs2crlcevrpp+V2u3XTTTdJkvr166cuXbpo6NChys7Olsfj0cSJE5Wenm5fQXniiSf0xhtvaNy4cXr00Ue1bt06LV++XLm55r1LGwAANDy/rsDMnTtX5eXluu222xQXF2ffli1bZq957bXXdM8992jQoEHq27evYmNj9cEHH9jnQ0JCtGrVKoWEhMjtduuhhx7Sww8/rMmTJ9trEhMTlZubq/z8fF1//fV69dVX9dZbbyklJaUBnjIAADCdX1dgLuSvjAkLC9OcOXM0Z86c865p3769Vq9e/aOPc9ttt2nnzp3+jAcAAC4R/CwkAABgHAIGAAAYh4ABAADGIWAAAIBxCBgAAGAcAgYAABiHgAEAAMYhYAAAgHEIGAAAYBwCBgAAGIeAAQAAxiFgAACAcQgYAABgHAIGAAAYh4ABAADGIWAAAIBxCBgAAGAcAgYAABiHgAEAAMYhYAAAgHFCAz0AAAANqcP43ECPgCbAFRgAAGAcAgYAABiHgAEAAMbhPTCXCFO/J3xgWmqgRwAAXIS4AgMAAIxDwAAAAOMQMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADj8KMEAADnZeqPIUHzxxUYAABgHAIGAAAYh4ABAADGIWAAAIBxCBgAAGAcAgYAABiHj1HjombiRzgPTEsN9AgA0OxxBQYAABiHgAEAAMYhYAAAgHF4DwzQwHjfDs7HxD8bwMWKgAFgrM7P5el7KyzQYwAIAL6FBAAAjMMVGABGfWsjPOiU9nYL9BQAAo0rMAAAwDgEDAAAMA4BAwAAjEPAAAAA4xAwAADAOAQMAAAwDgEDAACMQ8AAAADjEDAAAMA4BAwAADAOAQMAAIxDwAAAAOMQMAAAwDgEDAAAMM5FHTBz5sxRhw4dFBYWpqSkJG3bti3QIwEAgIvARRswy5YtU0ZGhiZNmqQdO3bo+uuvV0pKio4cORLo0QAAQIBdtAEzY8YMjRgxQo888oi6dOmiefPmqWXLllq4cGGgRwMAAAEWGugB6lJZWamioiJNmDDBPhYcHKzk5GQVFhbWeZ+KigpVVFTYvy4vL5ckeb3eBp+vtuK7Bn9MABemJuiUvP//r2BNxXeqtWoDOxBwiWqM19ezH9eyrB9dd1EGzLfffquamhrFxMT4HI+JidHnn39e532ysrL0wgsvnHM8ISGhUWYEEDiR9lcPB3AK4NIWObNxH//48eOKjIw87/mLMmDqY8KECcrIyLB/XVtbq6NHj6pt27YKCgpqsN/H6/UqISFBhw4dksvlarDHxbnY66bBPjcN9rlpsM9NozH32bIsHT9+XPHx8T+67qIMmHbt2ikkJESlpaU+x0tLSxUbG1vnfZxOp5xOp8+xqKioxhpRLpeLfzmaCHvdNNjnpsE+Nw32uWk01j7/2JWXMy7KN/E6HA717NlTBQUF9rHa2loVFBTI7XYHcDIAAHAxuCivwEhSRkaGhg0bpl69eunGG2/UzJkzdfLkST3yyCOBHg0AAATYRRswDz74oL755htlZmbK4/Goe/fuysvLO+eNvU3N6XRq0qRJ53y7Cg2PvW4a7HPTYJ+bBvvcNC6GfQ6yfupzSgAAABeZi/I9MAAAAD+GgAEAAMYhYAAAgHEIGAAAYBwCpg5z5sxRhw4dFBYWpqSkJG3btu1H17///vvq1KmTwsLC1K1bN61evbqJJjWfP3u9YMEC9enTR61bt1br1q2VnJz8k/9scJq/f6bPWLp0qYKCgjRw4MDGHbCZ8Hefy8rKlJ6erri4ODmdTl199dX89+MC+LvPM2fO1DXXXKPw8HAlJCRozJgxOnXqVBNNa6aNGzfq3nvvVXx8vIKCgvThhx/+5H3Wr1+vHj16yOl06qqrrtKiRYsad0gLPpYuXWo5HA5r4cKF1u7du60RI0ZYUVFRVmlpaZ3rN23aZIWEhFjZ2dnWnj17rIkTJ1otWrSwdu3a1cSTm8ffvR4yZIg1Z84ca+fOndbevXut3/72t1ZkZKT1r3/9q4knN4u/+3zG/v37rZ///OdWnz59rAEDBjTNsAbzd58rKiqsXr16Wf3797c++eQTa//+/db69eut4uLiJp7cLP7uc05OjuV0Oq2cnBxr//791po1a6y4uDhrzJgxTTy5WVavXm09++yz1gcffGBJslasWPGj6/ft22e1bNnSysjIsPbs2WPNnj3bCgkJsfLy8hptRgLmB2688UYrPT3d/nVNTY0VHx9vZWVl1bn+gQcesFJTU32OJSUlWY8//nijztkc+LvXP1RdXW1FRERYixcvbqwRm4X67HN1dbV18803W2+99ZY1bNgwAuYC+LvPc+fOta688kqrsrKyqUZsFvzd5/T0dOuOO+7wOZaRkWH17t27UedsTi4kYMaNG2d17drV59iDDz5opaSkNNpcfAvpLJWVlSoqKlJycrJ9LDg4WMnJySosLKzzPoWFhT7rJSklJeW863Faffb6h7777jtVVVWpTZs2jTWm8eq7z5MnT1Z0dLSGDx/eFGMarz77vHLlSrndbqWnpysmJkbXXnutpk6dqpqamqYa2zj12eebb75ZRUVF9reZ9u3bp9WrV6t///5NMvOlIhCvhRft38QbCN9++61qamrO+dt+Y2Ji9Pnnn9d5H4/HU+d6j8fTaHM2B/XZ6x965plnFB8ff86/NPiv+uzzJ598orffflvFxcVNMGHzUJ993rdvn9atW6e0tDStXr1aX331lZ566ilVVVVp0qRJTTG2ceqzz0OGDNG3336rW265RZZlqbq6Wk888YT+8Ic/NMXIl4zzvRZ6vV59//33Cg8Pb/DfkyswMNK0adO0dOlSrVixQmFhYYEep9k4fvy4hg4dqgULFqhdu3aBHqdZq62tVXR0tObPn6+ePXvqwQcf1LPPPqt58+YFerRmZf369Zo6darefPNN7dixQx988IFyc3M1ZcqUQI+G/xFXYM7Srl07hYSEqLS01Od4aWmpYmNj67xPbGysX+txWn32+ozp06dr2rRp+utf/6rrrruuMcc0nr/7/M9//lMHDhzQvffeax+rra2VJIWGhqqkpEQdO3Zs3KENVJ8/z3FxcWrRooVCQkLsY507d5bH41FlZaUcDkejzmyi+uzzc889p6FDh+qxxx6TJHXr1k0nT57UyJEj9eyzzyo4mP+Pbwjney10uVyNcvVF4gqMD4fDoZ49e6qgoMA+Vltbq4KCArnd7jrv43a7fdZLUn5+/nnX47T67LUkZWdna8qUKcrLy1OvXr2aYlSj+bvPnTp10q5du1RcXGzffvWrX+n2229XcXGxEhISmnJ8Y9Tnz3Pv3r311Vdf2YEoSV988YXi4uKIl/Oozz5/991350TKmWi0+FGADSYgr4WN9vZgQy1dutRyOp3WokWLrD179lgjR460oqKiLI/HY1mWZQ0dOtQaP368vX7Tpk1WaGioNX36dGvv3r3WpEmT+Bj1BfJ3r6dNm2Y5HA7rz3/+s/Wf//zHvh0/fjxQT8EI/u7zD/EppAvj7z4fPHjQioiIsEaNGmWVlJRYq1atsqKjo60XX3wxUE/BCP7u86RJk6yIiAjrvffes/bt22etXbvW6tixo/XAAw8E6ikY4fjx49bOnTutnTt3WpKsGTNmWDt37rS+/vpry7Isa/z48dbQoUPt9Wc+Rj127Fhr79691pw5c/gYdSDMnj3buuKKKyyHw2HdeOON1pYtW+xzt956qzVs2DCf9cuXL7euvvpqy+FwWF27drVyc3ObeGJz+bPX7du3tySdc5s0aVLTD24Yf/9Mn42AuXD+7vPmzZutpKQky+l0WldeeaX10ksvWdXV1U08tXn82eeqqirr+eeftzp27GiFhYVZCQkJ1lNPPWUdO3as6Qc3yN/+9rc6/3t7Zm+HDRtm3Xrrrefcp3v37pbD4bCuvPJK65133mnUGYMsi2toAADALLwHBgAAGIeAAQAAxiFgAACAcQgYAABgHAIGAAAYh4ABAADGIWAAAIBxCBgAAGAcAgYAABiHgAEAAMYhYAAAgHEIGAAAYJz/A/Aol/9oKMUaAAAAAElFTkSuQmCC",
|
||
"text/plain": [
|
||
"<Figure size 640x480 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"import matplotlib.pyplot as plt\n",
|
||
"\n",
|
||
"plt.hist(blstm_model.predict(X_valid))\n",
|
||
"_ = plt.axvline(x=0.5, color=\"orange\")"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": ".venv (3.12.10)",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.12.10"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 5
|
||
}
|