Alonet: Training your models

The alonet package relies on the aloscene & alodataset packages to provide training, inference, and evaluation scripts for promising computer vision architectures. DETR, Deformable DETR, and RAFT are among the first models included in alonet.

Our training scripts are usually split into three parts:

  • The dataset (provided by alodataset)

  • The data modules

  • The training pipeline

The provided training pipelines use PyTorch Lightning. Beyond the complementary use of aloscene & alodataset, we provide some helper methods to quickly restore and load previous trainings.
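For example, a previous training can be restored from its project and run id. The sketch below assumes the load_training helper from alonet.common; the exact argument names used here (project_run_id, run_id) may differ across versions:

from argparse import Namespace
import alonet

# Restore a model from a previous experiment
# (hypothetical project and run ids; replace with your own)
args = Namespace(project_run_id="detr", run_id="my_experiment")
detr = alonet.common.load_training(alonet.detr.LitDetr, args=args)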

Datasets

A dataset returns augmented frame tensors built with the aloscene package. All datasets provided by alodataset expose a train_loader() method that will later be used within a data module.

Here is an example of creating a training & validation loader using the COCO dataset.

[1]:
import alodataset

# Using sample
train_loader = alodataset.CocoBaseDataset(sample=True).train_loader(batch_size=2)

# Using the full dataset
# Training loader
train_loader = alodataset.CocoBaseDataset(
    img_folder = "train2017",
    ann_file = "annotations/instances_train2017.json"
).train_loader(batch_size=2)

# Validation loader
val_loader = alodataset.CocoBaseDataset(
    img_folder = "val2017",
    ann_file = "annotations/instances_val2017.json"
).train_loader(batch_size=2)
loading annotations into memory...
Done (t=10.74s)
creating index...
index created!
loading annotations into memory...
Done (t=0.29s)
creating index...
index created!
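Each batch yielded by the loader is a list of augmented Frame tensors. As a quick sketch, the list can be merged into a single batched frame and rendered with aloscene:

import aloscene

# Fetch one batch from the loader: a list of aloscene Frame tensors
frames = next(iter(train_loader))
# Merge the list into a single batched Frame and display it
frames = aloscene.Frame.batch_list(frames)
frames.get_view().render()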

Data modules

Data modules are a concept introduced by PyTorch Lightning. In aloception, we use data modules to adapt a dataset to a particular training pipeline. This adaptation can include changes to the expected Frame structure or specific augmentations suited to the target training pipeline.

Here is an example of a data module:

[2]:
from typing import Optional
from alodataset import transforms as T
import pytorch_lightning as pl
import alodataset

class CocoDetection2Detr(pl.LightningDataModule):
    def __init__(self, batch_size=2):
        super().__init__()
        self.batch_size = batch_size

    def train_transform(self, frame, same_on_sequence: bool = True, same_on_frames: bool = False):
        scales = [480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800]
        frame = T.Compose([
                T.RandomHorizontalFlip(),
                T.RandomResizeWithAspectRatio(scales, max_size=1333),
            ]
        )(frame)
        return frame.norm_resnet()

    def val_transform(self, frame, same_on_sequence: bool = True, same_on_frames: bool = False):
        frame = T.RandomResizeWithAspectRatio(
            [800], max_size=1333, same_on_sequence=same_on_sequence, same_on_frames=same_on_frames
        )(frame)
        return frame.norm_resnet()

    def setup(self, stage: Optional[str] = None) -> None:
        if stage == "fit" or stage is None:
            # Set up the train/val datasets, applying the transforms defined above
            self.coco_train = alodataset.CocoBaseDataset(
                img_folder = "train2017",
                ann_file = "annotations/instances_train2017.json",
                transform_fn = self.train_transform
            )
            self.coco_val = alodataset.CocoBaseDataset(
                img_folder = "val2017",
                ann_file = "annotations/instances_val2017.json",
                transform_fn = self.val_transform
            )

    def train_dataloader(self):
        if not hasattr(self, "coco_train"): self.setup()
        return self.coco_train.train_loader(batch_size=self.batch_size)

    def val_dataloader(self, sampler=None):
        if not hasattr(self, "coco_val"): self.setup()
        # Forward the optional sampler to the loader
        return self.coco_val.train_loader(batch_size=self.batch_size, sampler=sampler)
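As a quick sanity check, the data module can be used on its own to produce transformed batches (assuming the COCO dataset is set up as in the previous section):

# Build the data module and fetch one training batch
coco_loader = CocoDetection2Detr(batch_size=2)
frames = next(iter(coco_loader.train_dataloader()))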

To learn more about data modules, please refer to the PyTorch Lightning documentation: https://pytorch-lightning.readthedocs.io/en/latest/extensions/datamodules.html

Training pipeline

Our training pipelines are built as PyTorch Lightning modules.

Therefore, training is simply a matter of connecting a data module to a training pipeline.

[ ]:
import alonet

# Init the training pipeline
detr = alonet.detr.LitDetr()
# Instantiate the data module defined above
coco_loader = CocoDetection2Detr(batch_size=2)
# Run the training using the two components
detr.run_train(data_loader=coco_loader, project="detr", expe_name="test_experiment")