Models

Panoptic Head is a PyTorch module that attaches to the output of a DETR-based model. It predicts segmentation features, represented as one binary mask for each object predicted by the DETR model.

Panoptic Head model

Block diagram of panoptic head model, taken from End-to-End Object Detection with Transformers paper

See also

See the Mask object for the data representation of the predictions.

Basic usage

Because detr_panoptic implements the Panoptic Head for DETR-based models, a DETR module must first be instantiated and then passed to PanopticHead as the DETR_module parameter:

from alonet.detr import DetrR50
from alonet.detr_panoptic import PanopticHead

detr_model = DetrR50()
model = PanopticHead(DETR_module=detr_model)

To finetune from the model pretrained on the COCO dataset, a DetrFinetune model must be used:

from alonet.detr import DetrR50Finetune
from alonet.detr_panoptic import PanopticHead

detr_model = DetrR50Finetune(num_classes=250)
model = PanopticHead(DETR_module=detr_model)

To run an inference:

from aloscene import Frame

device = model.device  # assumes `model` is already defined as above

# read the image (IMAGE_PATH is the path to an image file) and apply ResNet normalization
frame = Frame(IMAGE_PATH).norm_resnet()
# create a batch from a list of images
frames = Frame.batch_list([frame])
frames = frames.to(device)

# forward pass
m_outputs = model(frames)
# get boxes and masks as aloscene.BoundingBoxes2D and aloscene.Mask from the forward outputs
pred_boxes, pred_masks = model.inference(m_outputs)

# display the predicted boxes and masks
frame.append_boxes2d(pred_boxes[0], "pred_boxes")
frame.append_segmentation(pred_masks[0], "pred_masks")
frame.get_view([frame.boxes2d, frame.segmentation]).render()

Important

The PanopticHead network predicts a segmentation mask for each box predicted by the DETR-based model. For this reason, the inference function returns an additional output: pred_masks.

Panoptic head API

Panoptic module to use in object detection/segmentation tasks.

class alonet.detr_panoptic.detr_panoptic.PanopticHead(DETR_module, freeze_detr=True, aux_loss=None, device=device(type='cpu'), weights=None, strict_load_weights=True)

Bases: torch.nn.modules.module.Module

PyTorch head module that predicts segmentation masks from a preceding box detection task.

Parameters
DETR_module: alonet.detr.detr

Object detection module based on DETR architecture

freeze_detr: bool, optional

Freeze the DETR_module weights during training, by default True

aux_loss: bool, optional

Return auxiliary outputs in the forward step (if required); by default, the value of the DETR_module.aux_loss attribute is used

device: torch.device, optional

Device (CPU or GPU) on which the module is placed, by default torch.device("cpu")

weights: str, optional

Load weights from a project name, by default None

strict_load_weights: bool

Load the weights (if any are given) with strict=True (the default).
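The effect of freeze_detr can be pictured with plain PyTorch. The sketch below is not the alonet implementation: it assumes that freezing amounts to disabling gradients on the detection module's parameters, and it uses two small nn.Linear layers as hypothetical stand-ins for the DETR module and the mask head.

```python
import torch
from torch import nn


def freeze(module: nn.Module) -> None:
    """Disable gradient updates for every parameter of `module`."""
    for param in module.parameters():
        param.requires_grad_(False)


# Hypothetical stand-ins for a detection module and a mask head.
detector = nn.Linear(8, 8)
mask_head = nn.Linear(8, 2)

freeze(detector)

# Only parameters that still require gradients are handed to the optimizer.
all_params = list(detector.parameters()) + list(mask_head.parameters())
trainable = [p for p in all_params if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)
```

With the detector frozen, only the weight and bias of the stand-in head remain trainable, which is the situation freeze_detr=True is meant to produce for the mask head.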

INPUT_MEAN_STD = ((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
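As an illustration of what these constants mean, the sketch below applies per-channel mean/std normalization to a single RGB pixel. It assumes pixel values already scaled to [0, 1]; normalize_pixel and denormalize_pixel are hypothetical helpers written for this example, not part of alonet or aloscene.

```python
INPUT_MEAN_STD = ((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))


def normalize_pixel(rgb, mean_std=INPUT_MEAN_STD):
    """Normalize one RGB pixel (values in [0, 1]) channel by channel."""
    mean, std = mean_std
    return tuple((value - m) / s for value, m, s in zip(rgb, mean, std))


def denormalize_pixel(rgb, mean_std=INPUT_MEAN_STD):
    """Invert normalize_pixel to recover the original [0, 1] values."""
    mean, std = mean_std
    return tuple(value * s + m for value, m, s in zip(rgb, mean, std))


# A pixel equal to the channel means maps to zero in every channel.
centered = normalize_pixel((0.485, 0.456, 0.406))

# Normalizing and then denormalizing recovers the input (up to rounding).
restored = denormalize_pixel(normalize_pixel((0.2, 0.5, 0.9)))
```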
forward(frames, get_filter_fn=None, **kwargs)

PanopticHead forward pass, which adds the new masks feature to the previous box predictions.

Parameters
frames: Frames

Input frames to the network

get_filter_fn: Callable

Function that returns two values: the dec_outputs tensor filtered by a boolean mask per batch. The function is expected to receive at least the frames and m_outputs parameters as input. By default, the function used for this purpose is get_outs_filter() from the base model.

Returns
dict

It outputs a dict with the following elements:

  • pred_logits: The classification logits (including no-object) for all queries. Shape = [batch_size x num_queries x (num_classes + 1)]

  • pred_boxes: The normalized box coordinates for all queries, represented as (center_x, center_y, width, height). These values are normalized in [0, 1], relative to the size of each individual image (disregarding possible padding). See PostProcess for information on how to retrieve the unnormalized bounding box.

  • pred_masks: Binary masks, one for each predicted box. Shape = [batch_size x num_queries x H // 4 x W // 4]

  • bb_outputs: Backbone outputs, required by this forward pass

  • enc_outputs: Transformer encoder outputs, required by this forward pass

  • dec_outputs: Transformer decoder outputs, required by this forward pass

  • pred_masks_info: Parameters to use in inference procedure

  • aux_outputs: Optional, only returned when auxiliary losses are activated. It is a list of dictionaries containing the two above keys for each decoder layer.
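The shapes listed above can be checked with a small helper. This is only a sketch: expected_output_shapes is a hypothetical function written for this page, H and W are assumed to be the height and width of the batched input frames, and the last dimension of pred_boxes is assumed to be 4 because each box is described by four values.

```python
def expected_output_shapes(batch_size, num_queries, num_classes, height, width):
    """Return the documented shape of each prediction tensor as a tuple."""
    return {
        # one logit per class, plus one for the no-object class
        "pred_logits": (batch_size, num_queries, num_classes + 1),
        # four normalized coordinates per query
        "pred_boxes": (batch_size, num_queries, 4),
        # masks are predicted at a quarter of the input resolution
        "pred_masks": (batch_size, num_queries, height // 4, width // 4),
    }


shapes = expected_output_shapes(
    batch_size=2, num_queries=100, num_classes=250, height=800, width=1200
)
```

Comparing such tuples against the shapes of the tensors returned by forward is a cheap sanity check after changing num_classes or the input resolution.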

inference(forward_out, maskth=0.5, filters=None, **kwargs)

Given the model forward outputs, this method returns a set of BoundingBoxes2D and Mask objects, with the corresponding Labels for each detected object.

Parameters
forward_out: dict

Dict with the model forward outputs

maskth: float, optional

Threshold value to binarize the masks, by default 0.5

filters: list, optional

List of filters used to select the queries that predict an object, by default None

Returns
BoundingBoxes2D

Boxes from DETR model

Mask

Binary masks from PanopticHead, one for each box.
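The two ideas behind inference, selecting the queries that predict an object and binarizing their masks with maskth, can be sketched in plain Python. This is not the alonet code. It assumes that the no-object class is the last logit, that a query is kept when its best class is a real class with probability above a hypothetical score_threshold, that mask scores are probabilities in [0, 1], and that the comparison with maskth is strict. Nested lists stand in for tensors.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def query_filter(pred_logits, score_threshold=0.5):
    """One boolean per query: True when the query confidently predicts an object."""
    keep = []
    for logits in pred_logits:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        no_object = len(probs) - 1  # assumption: no-object is the last class
        keep.append(best != no_object and probs[best] > score_threshold)
    return keep


def binarize(mask_scores, maskth=0.5):
    """Turn a 2D grid of scores into a 0/1 mask."""
    return [[1 if value > maskth else 0 for value in row] for row in mask_scores]


# Toy example: 3 queries, 2 classes + no-object.
pred_logits = [[4.0, 0.0, 0.0], [0.0, 0.0, 5.0], [0.2, 0.1, 0.0]]
keep = query_filter(pred_logits)

mask = binarize([[0.9, 0.2], [0.5, 0.7]])
```

In the toy example only the first query is kept: the second one predicts no-object and the third is not confident enough.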

training: bool
alonet.detr_panoptic.detr_panoptic.main(image_path)

DetrR50 Panoptic Finetune

Module to create a custom PanopticHead model with DetrR50 as the base model. It allows loading chosen pretrained weights and changing the number of outputs in the class_embed layer, in order to train on custom classes.

class alonet.detr_panoptic.detr_r50_panoptic_finetune.DetrR50PanopticFinetune(num_classes, background_class=None, base_model=None, base_weights='detr-r50-panoptic', freeze_detr=False, weights=None, *args, **kwargs)

Bases: alonet.detr_panoptic.detr_panoptic.PanopticHead

Premade helper class to finetune DetrR50 and use a pretrained PanopticHead.

Parameters
num_classes: int

Number of classes in the class_embed output layer

background_class: int, optional

Background class, by default None

base_model: torch.nn, optional

Base model to couple with PanopticHead, by default DetrR50

base_weights: str, optional

Load weights from original DetrR50 + PanopticHead, by default “detr-r50-panoptic”

freeze_detr: bool, optional

Freeze DetrR50 weights, by default False

weights: str, optional

Weights for the finetuned model, by default None

Raises
ValueError

weights must be a ‘.pth’ or ‘.ckpt’ file
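A check of this kind can be sketched as follows. check_weights_path is a hypothetical helper written for this page; the actual validation performed by DetrR50PanopticFinetune may differ in details such as case sensitivity or the exact error message.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".pth", ".ckpt"}


def check_weights_path(weights):
    """Return the path if it has an accepted extension, otherwise raise ValueError."""
    path = Path(weights)
    if path.suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"weights must be a '.pth' or '.ckpt' file, got {weights!r}")
    return path


checked = check_weights_path("runs/best.ckpt")
```

Validating the extension before building the model lets a wrong path fail early instead of after the base weights have been loaded.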

training: bool