Waymo Dataset

This module reads Waymo Open Dataset images, camera calibrations and 2D/3D labels.

Dataset preparation

The Waymo Open Dataset is originally distributed as compressed tar files containing TFRecord files. Once downloaded, the dataset must be decompressed and converted from TFRecord format to jpeg files and pickle files. This process is called dataset preparation and is handled by the method self.prepare().

This preparation requires the additional packages listed in alodataset/prepare/waymo-requirements.txt.

Steps to prepare Waymo Open Dataset:

  1. Download the Waymo Open Dataset and decompress it into a directory called waymo. The full path can be YOUR_PATH/waymo. This directory should have the following structure:

    YOUR_PATH/waymo/
    |__testing
    |  |__ *.tfrecord
    |__training
    |  |__ *.tfrecord
    |__validation
       |__ *.tfrecord
    
  2. Add a key-value pair in YOUR_HOME/.aloception/alodataset_config.json:

    "waymo" : "YOUR_PATH/waymo"
    
  3. From the aloception root, run:

    python alodataset/prepare/waymo_prepare.py
    

    This script converts the TFRecord files into a new directory YOUR_PATH/waymo_prepared and replaces the path in YOUR_HOME/.aloception/alodataset_config.json with this new one. The conversion can take hours to complete depending on the system hardware. If it is stopped/killed in the middle of preparation, it can always be resumed by running the script again.
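For reference, after a completed preparation the config entry from step 2 points at the prepared directory. This is an illustrative fragment (YOUR_PATH is a placeholder; the file may also contain entries for other datasets):

```json
{
  "waymo": "YOUR_PATH/waymo_prepared"
}
```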

The new prepared directory will look as follows:

YOUR_PATH/waymo_prepared/
|__testing
|__training
|__validation

In each subdirectory, we have:

|__calib.pkl
|__camera_label.pkl
|__lidar_label.pkl
|__pose.pkl
|__image0
|  |__*.jpeg
|__image1
|  |__*.jpeg
|__image2
|  |__*.jpeg
|__image3
|  |__*.jpeg
|__image4
|  |__*.jpeg
|__velodyne

  • calib.pkl is a pickle file containing camera calibrations.

  • camera_label.pkl is a pickle file containing 2D labels: boxes, classes, track ids, camera ids.

  • lidar_label.pkl is a pickle file containing 3D labels: 3D boxes, 3D boxes projected on the image, classes, track ids, camera ids, speed, acceleration.

  • pose.pkl is a pickle file containing vehicle pose.

  • imageX folders contain images from camera X in jpeg format.

  • velodyne directory should be empty.
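As a rough sketch of how such a pickle file can be inspected, the snippet below writes and reads back a dummy camera_label.pkl. The nesting used here (segment name → sequence id → camera id → list of label dicts with keys "bbox", "track_id", "class", "camera_id") is an assumption modeled on the structures documented for read_sequences() below, not a guaranteed layout:

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical layout: segment name -> sequence id -> camera id -> list of label dicts.
dummy_labels = {
    "segment-xxxx": {
        0: {
            "1": [
                {"bbox": [0.0, 0.0, 10.0, 10.0], "track_id": "abc", "class": 1, "camera_id": 1}
            ]
        }
    }
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "camera_label.pkl"
    # Write the dummy file, then read it back the way one would inspect the real one.
    with open(path, "wb") as f:
        pickle.dump(dummy_labels, f)
    with open(path, "rb") as f:
        labels = pickle.load(f)

first_segment = next(iter(labels))
print(first_segment, labels[first_segment][0]["1"][0]["class"])
```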

Basic usage

from alodataset import Split, WaymoDataset
from aloscene import Frame

waymo_dataset = WaymoDataset(
   # Split.VAL for data in `validation` directory,
   # use Split.TRAIN/Split.TEST for `training`/`testing` directory
   split=Split.VAL,
   labels=["gt_boxes_2d", "gt_boxes_3d"],
   sequence_size=3)

# prepare the waymo dataset from the tfrecord files
# if the dataset is already prepared, this simply checks the prepared dataset
# this line is optional if the dataset is fully prepared
waymo_dataset.prepare()

for frames in waymo_dataset.train_loader(batch_size=2):
   # frames is a list (of length batch_size) of dicts
   # each dict maps a camera name
   # to a Frame of shape (t, c, h, w), with t = sequence_size

   # convert a list of front camera's frames into a batch
   front_frames = Frame.batch_list([frame["front"] for frame in frames])
   print(front_frames.shape) # (b, t, c, h, w), in this case b=batch=2, t=sequence_size=3
   # access to labels
   print(front_frames.boxes2d)
   print(front_frames.boxes3d)
   print(front_frames.cam_intrinsic)
   print(front_frames.cam_extrinsic)

API

class alodataset.waymo_dataset.WaymoDataset(segments=None, cameras=None, random_step=None, labels=[], **kwargs)

Bases: Generic[torch.utils.data.dataset.T_co]

CAMERAS = ['front', 'front_left', 'front_right', 'side_left', 'side_right']
CLASSES = ['UNKNOWN', 'VEHICLE', 'PEDESTRIAN', 'SIGN', 'CYCLIST']
LABELS = ['gt_boxes_2d', 'gt_boxes_3d', 'camera_parameters']
SPLIT_FOLDERS = {Split.VAL: 'validation', Split.TRAIN: 'training', Split.TEST: 'testing'}
__init__(segments=None, cameras=None, random_step=None, labels=[], **kwargs)

WaymoDataset

Parameters
segments: list or None

List waymo segments to load. If None, all segments will be loaded.

cameras: list or None

List of cameras to use. If None, data from all cameras will be loaded. The list can contain any of ["front", "front_left", "front_right", "side_left", "side_right", "all"]. If "all" is selected, data from all cameras will be merged independently of the source.

random_step: int

None by default. Otherwise, frame t+1 is sampled with a random step in each sequence.

labels: list of strings

List can contain any of ["gt_boxes_2d", "gt_boxes_3d", "camera_parameters"]

get_frame_boxes2d(frame, camera, segment, sequence_id, idstring2int)

Parse preloaded dict and return BoundingBoxes2D

Parameters
frame : Frame
camera : str

Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]

segment : str

Segment name

sequence_id : int

Sequence id in segment

idstring2int : dict
  • key: str, track id string

  • value: int, unique value represents a track id string

idstring2int can be empty; in that case a mapping from id string to int will be created on the fly.

Returns
BoundingBoxes2D

Shape (n, 4). Each box is associated with a label and a track id

Return type
BoundingBoxes2D
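The on-the-fly idstring2int mapping described above can be sketched as a grow-on-demand dictionary: unseen track-id strings get the next free integer, known strings reuse theirs. track_id_to_int is a hypothetical helper for illustration, not part of the WaymoDataset API:

```python
def track_id_to_int(idstring2int: dict, track_id: str) -> int:
    """Map a track-id string to a stable small integer (illustrative sketch)."""
    if track_id not in idstring2int:
        # Assign the next integer to a track id seen for the first time.
        idstring2int[track_id] = len(idstring2int)
    return idstring2int[track_id]

mapping = {}  # starts empty, exactly as idstring2int may
ids = [track_id_to_int(mapping, t) for t in ["a", "b", "a", "c"]]
print(ids)  # [0, 1, 0, 2] -- the repeated "a" reuses its integer
```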

get_frame_boxes3d(frame, camera, segment, sequence_id)

Parse preloaded dict and return BoundingBoxes3D

Parameters
frame : Frame
camera : str

Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]

segment : str

Segment name

sequence_id : int

Sequence id in segment

Returns
BoundingBoxes3D

boxes 3d, shape (n, 7)

BoundingBoxes2D

boxes 3d projected on image, shape (n, 4)

Return type
BoundingBoxes3D

get_frame_camera_parameters(frame, camera, segment, sequence_id)

Parse preloaded dict and return a tuple of CameraIntrinsic and CameraExtrinsic

Parameters
frame : Frame
camera : str

Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]

segment : str

Segment name

sequence_id : int

Sequence id in segment

Returns
Tuple[CameraIntrinsic, CameraExtrinsic]
  • CameraIntrinsic: shape (3, 4)

  • CameraExtrinsic: shape (4, 4)

Return type
Tuple[CameraIntrinsic, CameraExtrinsic]

get_frames(camera, segment, sequence)

Get a tensor Frame given a camera, a segment name and the list of sequence ids

Parameters
camera : str

Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]

segment : str

Segment name

sequence : list of int

List of sequence id

Returns
aloscene.Frame

Frame containing ground truth 2D/3D boxes with labels and the camera intrinsic/extrinsic matrices, depending on self.labels set in self.__init__

Return type
Frame

get_segments()

Read dataset directory based on self.dataset_dir and self.split. Return a list of segment names found in that directory.

Returns
List[str]

List of segment names

Return type
List[str]

getitem(idx)

Given an item id, get a dict whose keys are camera names and whose values are Frames with ground truth

Parameters
idx : int

Item id, see self.read_sequences() for more details.

Returns
Dict[str, aloscene.Frame]
  • key: str, camera name

  • value: aloscene.Frame containing 2D/3D labels and camera intrinsic/extrinsic, based on self.labels set by self.__init__

Return type
Dict[str, Frame]

static np_convert_waymo_to_aloception_coordinate_system(np_boxes3d)

Transform boxes3d from waymo coordinates to aloception coordinates

Waymo coordinates:
  • X forward

  • Y left

  • Z upward

Aloception coordinates:
  • X right

  • Y downward

  • Z forward

Parameters
np_boxes3d : np.ndarray

3D boxes in Waymo coordinates, shape (n, 7)

Returns
np.ndarray

3D boxes in Aloception coordinates, shape (n, 7)

Return type
np.ndarray
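Given the two axis conventions above, the change of basis for a point is x_alo = -y_waymo, y_alo = -z_waymo, z_alo = x_waymo. The sketch below applies this to the box centers only, assuming the first three of the seven box values are the center coordinates; how the real method handles the size and heading columns is not shown here:

```python
import numpy as np

def waymo_center_to_aloception(np_boxes3d: np.ndarray) -> np.ndarray:
    """Axis change for box centers only (columns 0..2), illustrative sketch.

    Waymo: X forward, Y left, Z up.  Aloception: X right, Y down, Z forward.
    Size and heading columns (3..6) are passed through untouched, which is
    a simplification of the real conversion.
    """
    out = np_boxes3d.copy()
    out[:, 0] = -np_boxes3d[:, 1]  # right = -left
    out[:, 1] = -np_boxes3d[:, 2]  # down  = -up
    out[:, 2] = np_boxes3d[:, 0]   # forward stays forward
    return out

boxes = np.array([[1.0, 2.0, 3.0, 4.0, 1.5, 1.5, 0.0]])  # one box, 7 values
print(waymo_center_to_aloception(boxes)[0, :3])  # [-2. -3.  1.]
```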

prepare(num_processes=2)

Prepare the Waymo Open Dataset from tfrecord files. The preparation can be resumed if it was stopped abruptly.

The expected tree:

waymo_data_dir
|__validation
|__training
|__testing

Each subdirectory must contain the tfrecord files extracted from the Waymo Open Dataset tar files.

TFRecord files are read recursively in self.dataset_dir, and the pickle files and images are written to the self.dataset_dir + "_prepared" directory. Once the dataset is fully prepared, the path in YOUR_HOME/.aloception/alodataset_config.json is replaced by the new prepared one. If self.dataset_dir already ends with "_prepared", this function does nothing.

Please install the packages listed in alodataset/prepare/waymo-requirements.txt.

Notes

If the dataset is already prepared, this method simply checks that all files are prepared and stored in the prepared folder. If the original directory is no longer on disk, the method uses the prepared directory as is and the preparation step is skipped.

read_sequences()

Populate all dictionaries used internally by this class to manage sequences, segments, etc.

Dictionaries:
  • self.items:
    • key: int, item id

    • value: dict of
      • “sequence” : tuple of sequence id

      • “segment” : str, segment name

  • self.preloaded_labels_2d:
    • key: str, segment name

    • value: dict
      • key: sequence id

      • value: dict of
        • key: str, camera id

        • value: list of labels as dicts with keys "bbox", "track_id", "class", "camera_id"

  • self.preloaded_labels_3d:
    • key: str, segment name

    • value: dict
      • key: sequence id

      • value: dict of
        • key: str, camera id

        • value: list of labels as dicts with keys "bbox_proj", "bbox_3d", "track_id", "class", "camera_id", "speed", "accel"

  • self.preloaded_calib:
    • key: str, segment name

    • value: dict
      • key: sequence id

      • value: dict of
        • key: “cam_intrinsic”, “cam_extrinsic”

        • value: dict of:
          • key: int, camera id + 1

          • value: np.ndarray of shape (3, 4) for cam_intrinsic or shape (4, 4) for cam_extrinsic
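The self.preloaded_calib nesting above can be written out as a Python literal. All values here are dummies for illustration; only the nesting and array shapes follow the description:

```python
import numpy as np

# segment name -> sequence id -> {"cam_intrinsic"/"cam_extrinsic": {camera id + 1: matrix}}
preloaded_calib = {
    "segment-xxxx": {
        0: {
            "cam_intrinsic": {1: np.zeros((3, 4))},  # (3, 4) per camera
            "cam_extrinsic": {1: np.eye(4)},         # (4, 4) per camera
        }
    }
}

intrinsic = preloaded_calib["segment-xxxx"][0]["cam_intrinsic"][1]
extrinsic = preloaded_calib["segment-xxxx"][0]["cam_extrinsic"][1]
print(intrinsic.shape, extrinsic.shape)  # (3, 4) (4, 4)
```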

alodataset.waymo_dataset.main()

Main