{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Training Detr\n", "\n", "This tutorial explains how to use [LitDetr] module to train [DETR50 architecture] from scratch, using [COCO2017 detection dataset] as input.\n", "\n", "
\n", " \n", "**Goals**\n", " \n", "1. Learn the different ways to instantiate the [LitDetr] class\n", "2. Train [DETR50 architecture]\n", "3. Load trained weights and make inference with pre-trained weights\n", "\n", "
\n", "\n", "[DETR50 architecture]: https://arxiv.org/abs/2005.12872\n", "[COCO2017 detection dataset]: https://cocodataset.org/#detection-2017\n", "[LitDetr]: ../alonet/detr_training.rst#alonet.detr.train.LitDetr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. LitDetr argument levels\n", "\n", "[Aloception] is developed under the [Pytorch Lightning] framework, and provides different modules that facilitate the use of datasets and training models. \n", "\n", "
\n", "\n", "**See also**\n", " \n", "All information is availabled at:\n", " \n", " * [End-to-End Object Detection with Transformers (DETR)] \n", " * [Pytorch Lightning Module] \n", "\n", "
\n", "\n", "There are multiple ways to instantiate the module, starting with the most common one: using the default parameters\n", "\n", "[Pytorch Lightning]: https://www.pytorchlightning.ai/\n", "[Pytorch Lightning Module]: https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_module.html\n", "[End-to-End Object Detection with Transformers (DETR)]: https://arxiv.org/abs/2005.12872\n", "[Aloception]: ../index.rst" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from alonet.detr import LitDetr\n", "\n", "litdetr = LitDetr()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Like all modules in [Aloception] based on [Pytorch Lightning], [LitDetr] has a static method that concatenates its default parameters to other modules.\n", "\n", "[Pytorch Lightning]: https://www.pytorchlightning.ai/\n", "[Aloception]: ../index.rst\n", "[LitDetr]: ../alonet/detr_training.rst#alonet.detr.train.LitDetr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from argparse import ArgumentParser\n", "\n", "parser = ArgumentParser()\n", "parser = litdetr.add_argparse_args(parser)\n", "parser.parse_args([])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, if we want to change a specific parameter, it should be changed in class definition" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from argparse import Namespace\n", "\n", "def params2Namespace(litdetr):\n", " return Namespace(\n", " accumulate_grad_batches=litdetr.accumulate_grad_batches, \n", " gradient_clip_val=litdetr.gradient_clip_val, \n", " model_name=litdetr.model_name, \n", " weights=litdetr.weights\n", " )\n", "\n", "litdetr = LitDetr(gradient_clip_val=0.6)\n", "params2Namespace(litdetr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These parameters could be easily modified in console if we provide them to the module" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "args = parser.parse_args([]) # Remove [] to run in script\n", "litdetr = LitDetr(args)\n", "params2Namespace(litdetr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Also, we could use both examples to fix some parameters and use the rest as the values entered via the command line" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from alonet.detr import LitDetr, DetrR50Finetune\n", "\n", "my_model = DetrR50Finetune(num_classes = 2)\n", "litdetr = LitDetr(args, model_name=\"finetune\", model=my_model)\n", "params2Namespace(litdetr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**Attention**\n", " \n", "All the parameters described explicitly will replace the ones in the **args** variable.\n", "
\n", "\n", "
\n", "\n", "**See also**\n", "\n", "Since [LitDetr] is a pytorch lig based module, all functionalities could be implemented by inheriting [LitDetr] as a parent class. See the information in [Pytorch Lightning Module].\n", "
\n", "\n", "[LitDetr]: ../alonet/detr_training.rst#alonet.detr.train.LitDetr\n", "[Pytorch Lightning Module]: https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_module.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Train process\n", "\n", "
\n", "\n", "**See also**\n", "\n", "The training process is based on the [Pytorch Lightning Trainer Module]. For more information, please consult their online documentation.\n", "
\n", "\n", "In order to make an example, let's take the [COCO detection 2017 dataset] as a training base. The common pipeline is described below:\n", "\n", "[COCO detection 2017 dataset]: https://cocodataset.org/#detection-2017\n", "[Pytorch Lightning Trainer Module]: https://pytorch-lightning.readthedocs.io/en/latest/common/trainer.html" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from argparse import ArgumentParser\n", "\n", "import alonet\n", "from alonet.detr import CocoDetection2Detr, LitDetr\n", "\n", "import torch\n", "\n", "device = torch.device(\"cuda:0\") if torch.cuda.is_available() else torch.device(\"cpu\")\n", "\n", "# Parameters definition\n", "# Build parser (concatenates arguments to modify the entire project)\n", "parser = ArgumentParser(conflict_handler=\"resolve\")\n", "parser = CocoDetection2Detr.add_argparse_args(parser)\n", "parser = LitDetr.add_argparse_args(parser)\n", "parser = alonet.common.add_argparse_args(parser) # Add common arguments in train process\n", "args = parser.parse_args([])\n", "\n", "# Dataset use to train\n", "coco_loader = CocoDetection2Detr(args)\n", "lit_detr = LitDetr(args)\n", "\n", "# Train process\n", "# args.save = True # Uncomment this line to store trained weights\n", "lit_detr.run_train(\n", " data_loader=coco_loader, \n", " args=args, \n", " project=\"detr\", \n", " expe_name=\"coco_detr\", \n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", "**Attention**\n", "\n", "This code has a high computational cost and demands several hours of training, given its initialization from scratch. It is recommended to skip to the next section to see the results of the trained network.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Make inferences\n", "\n", "Once the training is finished, we can load the trained weights knowing the project and run id (`~/.aloception/project_run_id/run_id` path). For this, a function of the common module of aloception could be used:\n", "\n", "```python\n", "from argparse import Namespace\n", "from alonet.common import load_training\n", "\n", "args = Namespace(project_run_id = \"project_run_id\", run_id = \"run_id\")\n", "lit_detr = load_training(LitDetr, args = args)\n", "```\n", "\n", "Moreover, [LitDetr] allows download and load pre-trained weights for use. This is achieved by using the `weights` attribute:\n", "\n", "[LitDetr]: ../alonet/detr_training.rst#alonet.detr.train.LitDetr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lit_detr = LitDetr(weights = \"detr-r50\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we have a pre-trained model ready to make some detections." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "from alonet.detr import CocoDetection2Detr, LitDetr\n", "\n", "import torch\n", "\n", "device = torch.device(\"cuda:0\") if torch.cuda.is_available() else torch.device(\"cpu\")\n", "\n", "# Dataset use to train\n", "coco_loader = CocoDetection2Detr()\n", "lit_detr = LitDetr(weights = \"detr-r50\")\n", "lit_detr.model = lit_detr.model.eval().to(device)\n", "\n", "# Check a random result\n", "frame = next(iter(coco_loader.val_dataloader()))\n", "frame = frame[0].batch_list(frame).to(device)\n", "pred_boxes = lit_detr.inference(lit_detr(frame))[0] # Inference from forward result\n", "gt_boxes = frame[0].boxes2d\n", "\n", "frame.get_view(\n", " [\n", " gt_boxes.get_view(frame[0], title=\"Ground truth boxes\"),\n", " pred_boxes.get_view(frame[0], title=\"Predicted boxes\"),\n", " ], size = (1920,1080)\n", ").render()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", "**What is next?**\n", "\n", "Learn how to train a custom architecture in **[Finetuning DETR]** tutorial. Also, know about a complex model based on *deformable attention module* in **[Training Deformable]** tutorial.\n", "
\n", "\n", "[Finetuning DETR]: finetuning_detr.rst\n", "[Training Deformable]: training_deformable_detr.rst" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.11" } }, "nbformat": 4, "nbformat_minor": 4 }