Alonet: Training your models
The alonet package relies on the aloscene & alodataset packages to provide training, inference, and evaluation scripts for promising computer vision architectures. DETR, Deformable-DETR and RAFT are among the first models included in alonet.
Our training scripts are usually split into three parts:
The dataset (provided by alodataset)
The data modules
The training pipeline
The provided training pipelines use PyTorch Lightning. Beyond the complementary use of aloscene & alodataset, we provide some helper methods to quickly restore and load previous trainings.
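For instance, a run saved under a project and experiment name can be restored later. Here is a minimal sketch, assuming a load_training helper in alonet.common that reads run_id and project_run_id from an args namespace; the exact helper name and expected fields may differ in your version:
[ ]:
from argparse import Namespace
import alonet

# Hypothetical sketch: restore the latest checkpoint of a previous run.
# load_training and the args fields (run_id, project_run_id) are
# assumptions; check alonet.common for the exact helper in your version.
args = Namespace(run_id="test_experiment", project_run_id="detr")
lit_detr = alonet.common.load_training(alonet.detr.LitDetr, args=args)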
Datasets
A dataset returns augmented frame tensors built with the aloscene package. All datasets provided by alodataset expose a train_loader() method that will later be used within a data module.
Here is an example of creating a training & validation loader using the COCO dataset.
[1]:
import alodataset
# Using sample
train_loader = alodataset.CocoBaseDataset(sample=True).train_loader(batch_size=2)
# Using the full dataset
# Training loader
train_loader = alodataset.CocoBaseDataset(
    img_folder="train2017",
    ann_file="annotations/instances_train2017.json"
).train_loader(batch_size=2)
# Validation loader
val_loader = alodataset.CocoBaseDataset(
    img_folder="val2017",
    ann_file="annotations/instances_val2017.json"
).train_loader(batch_size=2)
loading annotations into memory...
Done (t=10.74s)
creating index...
index created!
loading annotations into memory...
Done (t=0.29s)
creating index...
index created!
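Each batch yielded by these loaders is a list of augmented Frame tensors, which can be merged back into one batched Frame. A minimal sketch, assuming the aloscene.Frame.batch_list helper used throughout the aloception examples:
[ ]:
import aloscene

# Fetch one batch: a list of augmented Frame tensors
frames = next(iter(train_loader))
# Merge the list into a single batched (and padded if needed) Frame
frames = aloscene.Frame.batch_list(frames)
print(frames.shape)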
Data modules
Data modules are a concept introduced in PyTorch Lightning. In aloception, we use data modules to adapt a dataset to fit a particular training pipeline. This adaptation can include some changes to the expected Frame structure or specific augmentations suited to the target training pipeline.
Here is an example of a data module:
[2]:
from typing import Optional
from alodataset import transforms as T
import pytorch_lightning as pl
import alodataset


class CocoDetection2Detr(pl.LightningDataModule):

    def __init__(self, batch_size=2):
        super().__init__()
        self.batch_size = batch_size

    def train_transform(self, frame, same_on_sequence: bool = True, same_on_frames: bool = False):
        # Random flip & multi-scale resize, followed by ResNet normalization
        scales = [480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800]
        frame = T.Compose([
            T.RandomHorizontalFlip(),
            T.RandomResizeWithAspectRatio(
                scales, max_size=1333, same_on_sequence=same_on_sequence, same_on_frames=same_on_frames
            ),
        ])(frame)
        return frame.norm_resnet()

    def val_transform(self, frame, same_on_sequence: bool = True, same_on_frames: bool = False):
        # Deterministic single-scale resize, followed by ResNet normalization
        frame = T.RandomResizeWithAspectRatio(
            [800], max_size=1333, same_on_sequence=same_on_sequence, same_on_frames=same_on_frames
        )(frame)
        return frame.norm_resnet()

    def setup(self, stage: Optional[str] = None) -> None:
        if stage == "fit" or stage is None:
            # Setup the train/val datasets
            self.coco_train = alodataset.CocoBaseDataset(
                img_folder="train2017",
                ann_file="annotations/instances_train2017.json"
            )
            self.coco_val = alodataset.CocoBaseDataset(
                img_folder="val2017",
                ann_file="annotations/instances_val2017.json"
            )

    def train_dataloader(self):
        if not hasattr(self, "coco_train"):
            self.setup()
        return self.coco_train.train_loader(batch_size=self.batch_size)

    def val_dataloader(self):
        if not hasattr(self, "coco_val"):
            self.setup()
        return self.coco_val.train_loader(batch_size=self.batch_size)
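As a quick usage check, the data module defined above can be instantiated on its own and queried for its loaders:
[ ]:
# Sanity check: instantiate the data module and fetch one validation batch
coco_loader = CocoDetection2Detr(batch_size=2)
frames = next(iter(coco_loader.val_dataloader()))
print(len(frames))  # batch_size frames per batch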
To learn more about data modules, please refer to the PyTorch Lightning documentation: https://pytorch-lightning.readthedocs.io/en/latest/extensions/datamodules.html
Training pipeline
Our training pipelines are built using Lightning modules. Therefore, training is all about connecting a data module with a training pipeline.
[ ]:
import alonet
# Init the training pipeline
detr = alonet.detr.LitDetr()
# Init the data module
coco_loader = CocoDetection2Detr(batch_size=2)
# Run the training using the two components
detr.run_train(data_loader=coco_loader, project="detr", expe_name="test_experiment")
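Alternatively, instead of training from scratch, the same pipeline can be instantiated from pretrained weights. A minimal sketch, assuming the weights keyword and the "detr-r50" identifier exposed by alonet (check your version for the available names):
[ ]:
# Load LitDetr with pretrained COCO weights instead of training from
# scratch ("detr-r50" is assumed to be an available weights identifier)
detr = alonet.detr.LitDetr(weights="detr-r50")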