Waymo Dataset

This module reads Waymo Open Dataset images, camera calibrations and 2D/3D labels.

Dataset preparation

The Waymo Open Dataset is originally distributed as compressed tar files containing TFRecord files. Once downloaded, the dataset must be decompressed and converted from TFRecord format to jpeg files and pickle files. This process is called dataset preparation and is handled by the method self.prepare().

This preparation requires the additional packages listed in alodataset/prepare/waymo-requirements.txt.

Steps to prepare Waymo Open Dataset:

  1. Download the Waymo Open Dataset and decompress it into a directory called waymo. The full path can be YOUR_PATH/waymo. This directory should have the following structure:

    YOUR_PATH/waymo/
    |__testing
    |  |__ *.tfrecord
    |__training
    |  |__ *.tfrecord
    |__validation
       |__ *.tfrecord
    
  2. Add a key-value pair in YOUR_HOME/.aloception/alodataset_config.json:

    "waymo" : "YOUR_PATH/waymo"
    
  3. From the aloception root, run:

    python alodataset/prepare/waymo_prepare.py
    

    This script converts the TFRecord files into a new directory YOUR_PATH/waymo_prepared and replaces the path in YOUR_HOME/.aloception/alodataset_config.json with this new one. The conversion can take hours to complete depending on the system hardware. If it is stopped/killed in the middle of preparation, it can always be resumed by running the script again.
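For reference, after a completed preparation the config entry from step 2 points at the prepared directory. This is an illustrative fragment (YOUR_PATH is a placeholder; the file may also contain entries for other datasets):

```json
{
  "waymo": "YOUR_PATH/waymo_prepared"
}
```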

The new prepared directory will look as follows:

YOUR_PATH/waymo_prepared/
|__testing
|__training
|__validation

In each subdirectory, we have:

|__calib.pkl
|__camera_label.pkl
|__lidar_label.pkl
|__pose.pkl
|__image0
|  |__*.jpeg
|__image1
|  |__*.jpeg
|__image2
|  |__*.jpeg
|__image3
|  |__*.jpeg
|__image4
|  |__*.jpeg
|__velodyne

  • calib.pkl is a pickle file containing camera calibrations.

  • camera_label.pkl is a pickle file containing 2D labels: boxes, classes, track ids, camera ids.

  • lidar_label.pkl is a pickle file containing 3D labels: 3D boxes, 3D boxes projected on the image, classes, track ids, camera ids, speed, acceleration.

  • pose.pkl is a pickle file containing vehicle pose.

  • imageX folders contain images from camera X in jpeg format.

  • velodyne directory should be empty.
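As a rough sketch of how such a pickle file can be inspected, the snippet below writes and reads back a dummy camera_label.pkl. The nesting used here (segment name → sequence id → camera id → list of label dicts with keys "bbox", "track_id", "class", "camera_id") is an assumption modeled on the structures documented for read_sequences() below, not a guaranteed layout:

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical layout: segment name -> sequence id -> camera id -> list of label dicts.
dummy_labels = {
    "segment-xxxx": {
        0: {
            "1": [
                {"bbox": [0.0, 0.0, 10.0, 10.0], "track_id": "abc", "class": 1, "camera_id": 1}
            ]
        }
    }
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "camera_label.pkl"
    # Write the dummy file, then read it back the way one would inspect the real one.
    with open(path, "wb") as f:
        pickle.dump(dummy_labels, f)
    with open(path, "rb") as f:
        labels = pickle.load(f)

first_segment = next(iter(labels))
print(first_segment, labels[first_segment][0]["1"][0]["class"])
```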

Basic usage

from alodataset import Split, WaymoDataset
from aloscene import Frame

waymo_dataset = WaymoDataset(
   # Split.VAL for data in `validation` directory,
   # use Split.TRAIN/Split.TEST for `training`/`testing` directory
   split=Split.VAL,
   labels=["gt_boxes_2d", "gt_boxes_3d"],
   sequence_size=3)

# prepare the waymo dataset from the tfrecord files
# if the dataset is already prepared, this simply checks the prepared dataset
# this line is optional if the dataset is fully prepared
waymo_dataset.prepare()

for frames in waymo_dataset.train_loader(batch_size=2):
   # frames is a list (of length batch_size) of dicts
   # each dict maps a camera name
   # to a Frame of shape (t, c, h, w), with t = sequence_size

   # convert a list of front camera's frames into a batch
   front_frames = Frame.batch_list([frame["front"] for frame in frames])
   print(front_frames.shape) # (b, t, c, h, w), in this case b=batch=2, t=sequence_size=3
   # access to labels
   print(front_frames.boxes2d)
   print(front_frames.boxes3d)
   print(front_frames.cam_intrinsic)
   print(front_frames.cam_extrinsic)

API

class alodataset.waymo_dataset.WaymoDataset(segments=None, cameras=None, random_step=None, labels=[], **kwargs)

Bases: Generic[torch.utils.data.dataset.T_co]

CAMERAS = ['front', 'front_left', 'front_right', 'side_left', 'side_right']
CLASSES = ['UNKNOWN', 'VEHICLE', 'PEDESTRIAN', 'SIGN', 'CYCLIST']
LABELS = ['gt_boxes_2d', 'gt_boxes_3d', 'camera_parameters']
SPLIT_FOLDERS = {Split.VAL: 'validation', Split.TRAIN: 'training', Split.TEST: 'testing'}
__init__(segments=None, cameras=None, random_step=None, labels=[], **kwargs)

WaymoDataset

Parameters
segments: list or None

List waymo segments to load. If None, all segments will be loaded.

cameras: list or None

List of cameras to use. If None, data from all cameras will be loaded. The list can contain any of ["front", "front_left", "front_right", "side_left", "side_right", "all"]. If "all" is selected, data from all cameras will be merged independently of the source.

random_step: int

None by default. Otherwise, frame t+1 is sampled with a random step in each sequence.

labels: list of strings

List can contain any of ["gt_boxes_2d", "gt_boxes_3d", "camera_parameters"]

get_frame_boxes2d(frame, camera, segment, sequence_id, idstring2int)

Parse preloaded dict and return BoundingBoxes2D

Parameters
frame : Frame
camera : str

Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]

segment : str

Segment name

sequence_id : int

Sequence id in segment

idstring2int : dict
  • key: str, track id string

  • value: int, unique value represents a track id string

idstring2int can be empty; in that case a mapping from id string to int will be created on the fly.

Returns
BoundingBoxes2D

Shape (n, 4). Each box is associated with a label and a track id

Return type
BoundingBoxes2D
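The on-the-fly idstring2int mapping described above can be sketched as a grow-on-demand dictionary: unseen track-id strings get the next free integer, known strings reuse theirs. track_id_to_int is a hypothetical helper for illustration, not part of the WaymoDataset API:

```python
def track_id_to_int(idstring2int: dict, track_id: str) -> int:
    """Map a track-id string to a stable small integer (illustrative sketch)."""
    if track_id not in idstring2int:
        # Assign the next integer to a track id seen for the first time.
        idstring2int[track_id] = len(idstring2int)
    return idstring2int[track_id]

mapping = {}  # starts empty, exactly as idstring2int may
ids = [track_id_to_int(mapping, t) for t in ["a", "b", "a", "c"]]
print(ids)  # [0, 1, 0, 2] -- the repeated "a" reuses its integer
```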

get_frame_boxes3d(frame, camera, segment, sequence_id)

Parse preloaded dict and return BoundingBoxes3D

Parameters
frame : Frame
camera : str

Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]

segment : str

Segment name

sequence_id : int

Sequence id in segment

Returns
BoundingBoxes3D

boxes 3d, shape (n, 7)

BoundingBoxes2D

boxes 3d projected on image, shape (n, 4)

Return type
BoundingBoxes3D

get_frame_camera_parameters(frame, camera, segment, sequence_id)

Parse preloaded dict and return a tuple of CameraIntrinsic and CameraExtrinsic

Parameters
frame : Frame
camera : str

Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]

segment : str

Segment name

sequence_id : int

Sequence id in segment

Returns
Tuple[CameraIntrinsic, CameraExtrinsic]
  • CameraIntrinsic: shape (3, 4)

  • CameraExtrinsic: shape (4, 4)

Return type
Tuple[CameraIntrinsic, CameraExtrinsic]

get_frames(camera, segment, sequence)

Get a tensor Frame given a camera, a segment name and the list of sequence ids

Parameters
camera : str

Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]

segment : str

Segment name

sequence : list of int

List of sequence id

Returns
aloscene.Frame

Frame containing ground truth 2D/3D boxes with labels and the camera intrinsic/extrinsic matrices, depending on self.labels set in self.__init__

Return type
Frame

get_segments()

Read dataset directory based on self.dataset_dir and self.split. Return a list of segment names found in that directory.

Returns
List[str]

List of segment names

Return type
List[str]

getitem(idx)

Given an item id, get a dict whose keys are camera names and whose values are Frames with ground truth

Parameters
idx : int

Item id, see self.read_sequences() for more details.

Returns
Dict[str, aloscene.Frame]
  • key: str, camera name

  • value: aloscene.Frame containing 2D/3D labels and camera intrinsic/extrinsic, based on self.labels set by self.__init__

Return type
Dict[str, Frame]

static np_convert_waymo_to_aloception_coordinate_system(np_boxes3d)

Transform boxes3d from waymo coordinates to aloception coordinates

Waymo coordinates:
  • X forward

  • Y left

  • Z upward

Aloception coordinates:
  • X right

  • Y downward

  • Z forward

Parameters
np_boxes3d : np.ndarray

3D boxes in Waymo coordinates, shape (n, 7)

Returns
np.ndarray

3D boxes in Aloception coordinates, shape (n, 7)

Return type
np.ndarray
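Given the two axis conventions above, the change of basis for a point is x_alo = -y_waymo, y_alo = -z_waymo, z_alo = x_waymo. The sketch below applies this to the box centers only, assuming the first three of the seven box values are the center coordinates; how the real method handles the size and heading columns is not shown here:

```python
import numpy as np

def waymo_center_to_aloception(np_boxes3d: np.ndarray) -> np.ndarray:
    """Axis change for box centers only (columns 0..2), illustrative sketch.

    Waymo: X forward, Y left, Z up.  Aloception: X right, Y down, Z forward.
    Size and heading columns (3..6) are passed through untouched, which is
    a simplification of the real conversion.
    """
    out = np_boxes3d.copy()
    out[:, 0] = -np_boxes3d[:, 1]  # right = -left
    out[:, 1] = -np_boxes3d[:, 2]  # down  = -up
    out[:, 2] = np_boxes3d[:, 0]   # forward stays forward
    return out

boxes = np.array([[1.0, 2.0, 3.0, 4.0, 1.5, 1.5, 0.0]])  # one box, 7 values
print(waymo_center_to_aloception(boxes)[0, :3])  # [-2. -3.  1.]
```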

prepare(num_processes=2)

Prepare the Waymo Open Dataset from tfrecord files. The preparation can be resumed if it was stopped abruptly.

The expected tree:

waymo_data_dir
|__validation
|__training
|__testing

Each subdirectory must contain the tfrecord files extracted from the Waymo Open Dataset tar files.

TFRecord files are read recursively in self.dataset_dir, and the pickle files and images are written to the self.dataset_dir + "_prepared" directory. Once the dataset is fully prepared, the path in YOUR_HOME/.aloception/alodataset_config.json is replaced by the new prepared one. If self.dataset_dir already ends with "_prepared", this function does nothing.

Please install the packages listed in alodataset/prepare/waymo-requirements.txt.

Notes

If the dataset is already prepared, this method simply checks that all files are prepared and stored in the prepared folder. If the original directory is no longer on disk, the method uses the prepared directory as is and the preparation step is skipped.

read_sequences()

Populate all dictionaries used internally by this class to manage sequences, segments, etc.

Dictionaries:
  • self.items:
    • key: int, item id

    • value: dict of
      • “sequence” : tuple of sequence id

      • “segment” : str, segment name

  • self.preloaded_labels_2d:
    • key: str, segment name

    • value: dict
      • key: sequence id

      • value: dict of
        • key: str, camera id

        • value: list of labels as dicts with keys "bbox", "track_id", "class", "camera_id"

  • self.preloaded_labels_3d:
    • key: str, segment name

    • value: dict
      • key: sequence id

      • value: dict of
        • key: str, camera id

        • value: list of labels as dicts with keys "bbox_proj", "bbox_3d", "track_id", "class", "camera_id", "speed", "accel"

  • self.preloaded_calib:
    • key: str, segment name

    • value: dict
      • key: sequence id

      • value: dict of
        • key: “cam_intrinsic”, “cam_extrinsic”

        • value: dict of:
          • key: int, camera id + 1

          • value: np.ndarray of shape (3, 4) for cam_intrinsic or shape (4, 4) for cam_extrinsic
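The self.preloaded_calib nesting above can be written out as a Python literal. All values here are dummies for illustration; only the nesting and array shapes follow the description:

```python
import numpy as np

# segment name -> sequence id -> {"cam_intrinsic"/"cam_extrinsic": {camera id + 1: matrix}}
preloaded_calib = {
    "segment-xxxx": {
        0: {
            "cam_intrinsic": {1: np.zeros((3, 4))},  # (3, 4) per camera
            "cam_extrinsic": {1: np.eye(4)},         # (4, 4) per camera
        }
    }
}

intrinsic = preloaded_calib["segment-xxxx"][0]["cam_intrinsic"][1]
extrinsic = preloaded_calib["segment-xxxx"][0]["cam_extrinsic"][1]
print(intrinsic.shape, extrinsic.shape)  # (3, 4) (4, 4)
```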

alodataset.waymo_dataset.main()

Main