{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Training Panoptic Head module\n", "\n", "This tutorial explains how to use [LitPanopticDetr] module to train the [PanopticHead] architecture from scratch, using [COCO2017 panoptic annotations] and [COCO2017 detection dataset] as inputs. With that, the new architecture is able to detect boxes and masks for object tasks.\n", "\n", "
\n", " \n", "**Goals**\n", " \n", "1. Declaration of [LitPanopticDetr] and [CocoPanoptic2Detr] modules\n", "2. Run training\n", "3. Load trained weights and make inference with pre-trained weights\n", "\n", "
\n", "\n", "
\n", " \n", "**Warning**\n", " \n", "The following guide needs to download [COCO2017 panoptic annotations] and [COCO2017 detection dataset] previously. The dataset module assumes that the information is stored in the following way::\n", "\n", " coco \n", " ├── train2017 \n", " | ├── img_train_0.jpg \n", " | ├── img_train_1.jpg \n", " | ├── ... \n", " | └── img_train_L.jpg \n", " ├── valid2017 \n", " | ├── img_val_0.jpg \n", " | ├── img_val_1.jpg \n", " | ├── ... \n", " | └── img_val_M.jpg \n", " └── annotations \n", " ├── panoptic_train2017.json \n", " ├── panoptic_val2017.json \n", " ├── panoptic_train2017 \n", " | ├── img_ann_train_0.jpg \n", " | ├── img_ann_train_1.jpg \n", " | ├── ... \n", " | └── img_ann_train_L.jpg \n", " └── panoptic_val2017 \n", " ├── img_ann_val_0.jpg \n", " ├── img_ann_val_1.jpg \n", " ├── ... \n", " └── img_ann_val_M.jpg \n", "\n", "See https://cocodataset.org/#panoptic-2018 for more information about panoptic tasks.\n", "
\n", "\n", "[COCO2017 panoptic annotations]: http://images.cocodataset.org/annotations/panoptic_annotations_trainval2017.zip\n", "[COCO2017 detection dataset]: https://cocodataset.org/#detection-2017\n", "[LitPanopticDetr]: ../alonet/panoptic_training.rst#alonet.detr_panoptic.train.LitPanopticDetr\n", "[PanopticHead]: ../alonet/panoptic_models.rst\n", "[CocoPanoptic2Detr]: ../alonet/detr_connectors.rst#cocopanoptic2detr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. LitPanopticDetr and CocoPanoptic2Detr\n", "\n", "[Aloception] is developed under the [Pytorch Lightning] framework, and provides different modules that facilitate the use of datasets and training models. \n", "\n", "
\n", "\n", "**See also**\n", " \n", "All information is availabled at:\n", " \n", " * [Pytorch Lightning Module] \n", " * [Pytorch Lightning Data Module]\n", "\n", "
\n", "\n", "[LitPanopticDetr] is a module based on [LitDetr]. For this reason, the ways to instantiate the module are the same. \n", "\n", "
\n", "\n", "**See also**\n", " \n", "Previous knowledged about [how to train a Detr model](training_detr.ipynb)\n", "\n", "
\n", "\n", "On the other hand, [CocoPanoptic2Detr] follows the same logic than [CocoDetection2Detr]. Therefore, the declaration of the modules could be:\n", "\n", "[Pytorch Lightning]: https://www.pytorchlightning.ai/\n", "[Pytorch Lightning Module]: https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_module.html\n", "[Pytorch Lightning Data Module]: https://pytorch-lightning.readthedocs.io/en/latest/extensions/datamodules.html\n", "[Aloception]: ../index.rst\n", "[LitPanopticDetr]: ../alonet/panoptic_training.rst#alonet.detr_panoptic.train.LitPanopticDetr\n", "[LitDetr]: ../alonet/detr_training.rst#alonet.detr.train.LitDetr\n", "[CocoPanoptic2Detr]: ../alonet/detr_connectors.rst#cocopanoptic2detr\n", "[CocoDetection2Detr]: ../alonet/detr_connectors.rst#cocodetection2detr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from alonet.detr_panoptic import LitPanopticDetr\n", "from alonet.detr import CocoPanoptic2Detr\n", "\n", "lit_panoptic = LitPanopticDetr()\n", "coco_loader = CocoPanoptic2Detr()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**Important**\n", " \n", "1. By default, [LitPanopticDetr] load the [DETR50] pretrained weights\n", "2. [LitPanopticDetr] does not have `num_classes` attribute, because [PanopticHead] is a module that match with the output of a model based on [DETR], using the number of classes defined by it. Then, there are two ways to change the number of classes.\n", "\n", "
\n", "\n", "[LitPanopticDetr]: ../alonet/panoptic_training.rst#alonet.detr_panoptic.train.LitPanopticDetr\n", "[DETR50]: ../alonet/detr_models.rst#module-alonet.detr.detr_r50\n", "[DETR]: ../alonet/detr_models.rst#module-alonet.detr.detr\n", "[PanopticHead]: ../alonet/panoptic_models.rst" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. Use a finetune mode on [LitPanopticDetr] declaration:\n", "\n", "[LitPanopticDetr]: ../alonet/panoptic_training.rst#alonet.detr_panoptic.train.LitPanopticDetr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from alonet.detr import DetrR50Finetune\n", "from alonet.detr_panoptic import PanopticHead\n", "\n", "# Define Detr finetune model\n", "my_detr_model = DetrR50Finetune(num_classes = 2)\n", "# Uses it to create a new panoptic head model\n", "my_model = PanopticHead(DETR_module = my_detr_model)\n", "# Make the pytorch lightning module\n", "lit_panoptic = LitPanopticDetr(model_name=\"finetune\", model=my_model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Implement directly the [DetrR50PanopticFinetune] model:\n", "\n", "[DetrR50PanopticFinetune]: ../alonet/panoptic_models.rst#module-alonet.detr_panoptic.detr_r50_panoptic_finetune" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from alonet.detr_panoptic import DetrR50PanopticFinetune\n", "\n", "# Define Detr+PanopticHead with custom number of classes\n", "my_model = DetrR50PanopticFinetune(num_classes = 2)\n", "# Make the pytorch lightning module\n", "lit_panoptic = LitPanopticDetr(model_name=\"finetune\", model=my_model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**See also**\n", "\n", "[detr finetune](finetuning_detr.ipynb) and [deformable detr finetune](finetuning_deformable_detr.ipynb) tutorials.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Train process\n", "\n", "
\n", "\n", "**See also**\n", "\n", "The training process is based on the [Pytorch Lightning Trainer Module]. For more information, please consult their online documentation.\n", "
\n", "\n", "In order to make an example, let's take the [COCO detection 2017 dataset] as a training base. The common pipeline is described below:\n", "\n", "[COCO detection 2017 dataset]: https://cocodataset.org/#detection-2017\n", "[Pytorch Lightning Trainer Module]: https://pytorch-lightning.readthedocs.io/en/latest/common/trainer.html" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from argparse import ArgumentParser\n", "\n", "import alonet\n", "from alonet.detr_panoptic import LitPanopticDetr\n", "from alonet.detr import CocoPanoptic2Detr\n", "\n", "import torch\n", "\n", "device = torch.device(\"cuda:0\") if torch.cuda.is_available() else torch.device(\"cpu\")\n", "\n", "# Parameters definition\n", "# Build parser (concatenates arguments to modify the entire project)\n", "parser = ArgumentParser(conflict_handler=\"resolve\")\n", "parser = CocoPanoptic2Detr.add_argparse_args(parser)\n", "parser = LitPanopticDetr.add_argparse_args(parser)\n", "parser = alonet.common.add_argparse_args(parser) # Add common arguments in train process\n", "args = parser.parse_args([])\n", "\n", "# Dataset use to train\n", "args.batch_size = 1 # The training has a high computational memory cost. Recommended use this\n", "coco_loader = CocoPanoptic2Detr(args)\n", "lit_panoptic = LitPanopticDetr(args)\n", "\n", "# Train process\n", "# args.save = True # Uncomment this line to store trained weights\n", "lit_panoptic.run_train(\n", " data_loader=coco_loader, \n", " args=args, \n", " project=\"panoptic\", \n", " expe_name=\"coco\", \n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", "**Attention**\n", "\n", "This code has a high computational cost and demands several hours of training, given its initialization from scratch. It is recommended to skip to the next section to see the results of the trained network.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Make inferences\n", "\n", "Once the training is finished, we can load the trained weights knowing the project and run id (`~/.aloception/project_run_id/run_id` path). For this, a function of the common module of aloception could be used:\n", "\n", "```python\n", "from argparse import Namespace\n", "from alonet.common import load_training\n", "\n", "args = Namespace(project_run_id = \"project_run_id\", run_id = \"run_id\")\n", "lit_panoptic = load_training(LitPanopticDetr, args = args)\n", "```\n", "\n", "Moreover, [LitPanopticDetr] allows download and load pre-trained weights for use. This is achieved by using the `weights` attribute:\n", "\n", "[LitPanopticDetr]: ../alonet/panoptic_training.rst#alonet.detr_panoptic.train.LitPanopticDetr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lit_detr = LitPanopticDetr(weights = \"detr-r50-panoptic\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we have a pre-trained model ready to make some detections." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pylab as plt\n", "plt.rcParams['figure.dpi'] = 120\n", "\n", "from alonet.detr_panoptic import LitPanopticDetr\n", "from alonet.detr import CocoPanoptic2Detr\n", "\n", "import torch\n", "\n", "device = torch.device(\"cuda:0\") if torch.cuda.is_available() else torch.device(\"cpu\")\n", "\n", "# Dataset use to train\n", "coco_loader = CocoPanoptic2Detr(batch_size = 1)\n", "lit_panoptic = LitPanopticDetr(weights = \"detr-r50-panoptic\")\n", "lit_panoptic.model = lit_panoptic.model.eval().to(device)\n", "\n", "# Check a random result\n", "frame = next(iter(coco_loader.val_dataloader()))\n", "frame = frame[0].batch_list(frame).to(device)\n", "pred_boxes, pred_masks = lit_panoptic.inference(lit_panoptic(frame))\n", "pred_boxes, pred_masks = pred_boxes[0], pred_masks[0]\n", "gt_boxes = frame[0].boxes2d\n", "gt_masks = frame[0].segmentation\n", "\n", "frame.get_view(\n", " [\n", " gt_boxes.get_view(frame[0], title=\"Ground truth boxes\"),\n", " pred_boxes.get_view(frame[0], title=\"Predicted boxes\"),\n", " gt_masks.get_view(frame[0], title=\"Ground truth masks\"),\n", " pred_masks.get_view(frame[0], title=\"Predicted masks\"),\n", " ]\n", ").render()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", "**What is next?**\n", "\n", "Know about a complex model based on *deformable attention module* in **[Training Deformable]** tutorial.\n", "
\n", "\n", "[Finetuning DETR]: finetuning_detr.rst\n", "[Training Deformable]: training_deformable_detr.rst" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.11" } }, "nbformat": 4, "nbformat_minor": 4 }