Models
Panoptic Head is a PyTorch module that implements a network connected to the output of a Detr-based model. This module predicts segmentation features, represented by a binary mask for each object predicted by the Detr model.

Block diagram of the panoptic head model, taken from the End-to-End Object Detection with Transformers paper
See also
The Mask object, which describes the data representation of predictions.
Basic usage
Because detr_panoptic implements the Panoptic Head for Detr-based models, the Detr module must first be instantiated and then passed to PanopticHead as the DETR_module parameter:
    from alonet.detr import DetrR50
    from alonet.detr_panoptic import PanopticHead

    detr_model = DetrR50()
    model = PanopticHead(DETR_module=detr_model)
To finetune from the model pretrained on the COCO dataset, a DetrFinetune model must be used:
    from alonet.detr import DetrR50Finetune
    from alonet.detr_panoptic import PanopticHead

    detr_model = DetrR50Finetune(num_classes=250)
    model = PanopticHead(DETR_module=detr_model)
To run an inference:
    from aloscene import Frame

    device = model.device  # assumes `model` is already defined as above

    # read the image and preprocess it with ResNet normalization
    # (IMAGE_PATH is a placeholder for the path to your image)
    frame = Frame(IMAGE_PATH).norm_resnet()

    # create a batch from a list of images
    frames = Frame.batch_list([frame])
    frames = frames.to(device)

    # forward pass
    m_outputs = model(frames)

    # get boxes and masks as aloscene.BoundingBoxes2D and aloscene.Mask from the forward outputs
    pred_boxes, pred_masks = model.inference(m_outputs)

    # display the predicted boxes and masks
    frame.append_boxes2d(pred_boxes[0], "pred_boxes")
    frame.append_segmentation(pred_masks[0], "pred_masks")
    frame.get_view([frame.boxes2d, frame.segmentation]).render()
Important
The PanopticHead network predicts a segmentation mask for each box predicted by the Detr-based model. For this reason, the inference function returns an additional output: pred_masks.
Panoptic head API
Panoptic module to use in object detection/segmentation tasks.
- class alonet.detr_panoptic.detr_panoptic.PanopticHead(DETR_module, freeze_detr=True, aux_loss=None, device=device(type='cpu'), weights=None, strict_load_weights=True)
Bases: torch.nn.modules.module.Module
PyTorch head module that predicts segmentation masks from a previous box detection task.
- Parameters
- DETR_module : alonet.detr.detr
Object detection module based on the DETR architecture
- freeze_detr : bool, optional
Freeze DETR_module weights during training, by default True
- aux_loss : bool, optional
Return aux outputs in the forward step (if required), by default use the DETR_module.aux_loss attribute value
- device : torch.device, optional
Configure the module on CPU or GPU, by default torch.device("cpu")
- weights : str, optional
Load weights by project name, by default None
- strict_load_weights : bool
Load the weights (if any given) with strict = True (by default).
- INPUT_MEAN_STD = ((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
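These values match the per-channel mean and standard deviation commonly used with ImageNet-pretrained ResNet backbones. As a rough, dependency-free sketch of what such a normalization does to a single RGB pixel, assuming pixel values already scaled to [0, 1] (the helper name `normalize_pixel` is hypothetical and not part of alonet):

```python
# Per-channel (mean, std), copied from PanopticHead.INPUT_MEAN_STD
INPUT_MEAN_STD = ((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))


def normalize_pixel(rgb, mean_std=INPUT_MEAN_STD):
    """Hypothetical helper: normalize one RGB pixel with values in [0, 1]."""
    mean, std = mean_std
    # subtract the channel mean, then divide by the channel std
    return tuple((value - m) / s for value, m, s in zip(rgb, mean, std))


# A pixel equal to the channel means maps to zero in every channel
print(normalize_pixel((0.485, 0.456, 0.406)))
```

In practice the library applies the normalization to whole frames (see `norm_resnet()` in the usage example); the sketch only illustrates the arithmetic.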
- forward(frames, get_filter_fn=None, **kwargs)
PanopticHead forward pass, which adds the new mask features to the previous box predictions.
- Parameters
- frames : Frames
Input frame to the network
- get_filter_fn : Callable
Function that returns two parameters: the dec_outputs tensor filtered by a boolean mask per batch. The function is expected to receive at least the frames and m_outputs parameters as input. By default, the function used for this purpose is get_outs_filter() from the base model.
- Returns
- dict
It outputs a dict with the following elements:
- pred_logits: The classification logits (including no-object) for all queries. Shape = [batch_size x num_queries x (num_classes + 1)]
- pred_boxes: The normalized boxes coordinates for all queries, represented as (center_x, center_y, height, width). These values are normalized in [0, 1], relative to the size of each individual image (disregarding possible padding). See PostProcess for information on how to retrieve the unnormalized bounding box.
- pred_masks: Binary masks, one assigned to each predicted box. Shape = [batch_size x num_queries x H // 4 x W // 4]
- bb_outputs: Backbone outputs, required in this forward
- enc_outputs: Transformer encoder outputs, required in this forward
- dec_outputs: Transformer decoder outputs, required in this forward
- pred_masks_info: Parameters to use in the inference procedure
- aux_outputs: Optional, only returned when auxiliary losses are activated. It is a list of dictionaries containing the two above keys for each decoder layer.
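The shapes listed above can be checked with a small, dependency-free sketch. It only restates the documented shapes. The helper name `expected_output_shapes` and the example values (100 queries, an 800 x 1200 input) are illustrative assumptions, and H and W are assumed here to be the height and width of the input frames.

```python
def expected_output_shapes(batch_size, num_queries, num_classes, height, width):
    """Hypothetical helper: shapes documented for two PanopticHead.forward outputs."""
    return {
        # one logit per class, plus one extra entry for "no-object"
        "pred_logits": (batch_size, num_queries, num_classes + 1),
        # masks are documented at a quarter of the input resolution
        "pred_masks": (batch_size, num_queries, height // 4, width // 4),
    }


shapes = expected_output_shapes(
    batch_size=2, num_queries=100, num_classes=250, height=800, width=1200
)
print(shapes["pred_logits"])  # (2, 100, 251)
print(shapes["pred_masks"])   # (2, 100, 200, 300)
```

Because of the integer division, input sizes that are not multiples of 4 are rounded down in the mask resolution.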
- inference(forward_out, maskth=0.5, filters=None, **kwargs)
Given the model forward outputs, this method returns a set of BoundingBoxes2D and Mask, with the corresponding Labels for each detected object.
- Parameters
- forward_out : dict
Dict with the model forward outputs
- maskth : float, optional
Threshold value to binarize the masks, by default 0.5
- filters : list, optional
List of filters to select the queries predicting an object, by default None
- Returns
- BoundingBoxes2D
Boxes from the DETR model
- Mask
Binary masks from PanopticHead, one for each box.
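As a rough illustration of what binarizing with maskth means, the sketch below thresholds a small grid of scores in plain Python. It assumes the scores are already probabilities in [0, 1] and uses a strict `>` comparison. Both are assumptions, and `binarize_mask` is a hypothetical helper, not the library's implementation.

```python
def binarize_mask(mask_probs, maskth=0.5):
    """Hypothetical helper: turn a 2D grid of scores in [0, 1] into 0/1 values."""
    # a cell belongs to the object when its score exceeds the threshold
    return [[1 if p > maskth else 0 for p in row] for row in mask_probs]


mask_probs = [
    [0.10, 0.70, 0.90],
    [0.40, 0.55, 0.20],
]
print(binarize_mask(mask_probs))              # [[0, 1, 1], [0, 1, 0]]
print(binarize_mask(mask_probs, maskth=0.6))  # [[0, 1, 1], [0, 0, 0]]
```

Raising the threshold keeps only the cells the model is most confident about, which usually gives smaller, tighter masks.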
- training: bool
- alonet.detr_panoptic.detr_panoptic.main(image_path)
DetrR50 Panoptic Finetune
Module to create a custom PanopticHead model using DetrR50 as the base model. It allows loading chosen pretrained weights and changing the number of outputs in the class_embed layer, in order to train on custom classes.
- class alonet.detr_panoptic.detr_r50_panoptic_finetune.DetrR50PanopticFinetune(num_classes, background_class=None, base_model=None, base_weights='detr-r50-panoptic', freeze_detr=False, weights=None, *args, **kwargs)
Bases: alonet.detr_panoptic.detr_panoptic.PanopticHead
Pre-made helper class to finetune DetrR50 and use a pretrained PanopticHead.
- Parameters
- num_classes : int
Number of classes in the class_embed output layer
- background_class : int, optional
Background class, by default None
- base_model : torch.nn, optional
Base model to couple with PanopticHead, by default DetrR50
- base_weights : str, optional
Load weights from the original DetrR50 + PanopticHead, by default “detr-r50-panoptic”
- freeze_detr : bool, optional
Freeze DetrR50 weights, by default False
- weights : str, optional
Weights for the finetuned model, by default None
- Raises
- ValueError
weights must be a ‘.pth’ or ‘.ckpt’ file
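The documented error condition can be mirrored by a small check run before constructing the model. This is only a sketch: `check_weights_file` is a hypothetical helper, and how the library itself validates the path is not stated here.

```python
import os


def check_weights_file(weights):
    """Hypothetical helper mirroring the documented ValueError condition."""
    if weights is None:
        return None  # default value: no finetuned weights to load
    extension = os.path.splitext(weights)[1]
    if extension not in (".pth", ".ckpt"):
        raise ValueError(f"weights must be a '.pth' or '.ckpt' file, got {weights!r}")
    return weights


print(check_weights_file("run/last.ckpt"))  # accepted, returned unchanged
try:
    check_weights_file("model.bin")
except ValueError as error:
    print(error)
```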
- training: bool