Waymo Dataset¶
This module reads Waymo Open Dataset images, camera calibrations, and 2D/3D labels.
Dataset preparation¶
The Waymo Open Dataset is originally distributed as compressed tar files containing TFRecord files. Once downloaded, the dataset must be decompressed and converted from the TFRecord format to JPEG files and pickle files. This process is called dataset preparation and is handled by the method self.prepare().
This preparation requires the additional packages listed in alodataset/prepare/waymo-requirements.txt.
Steps to prepare Waymo Open Dataset:
Download the Waymo Open Dataset and decompress it into a directory called waymo (the full path can be YOUR_PATH/waymo). This directory should have the following structure:
YOUR_PATH/waymo/
|__testing
|  |__*.tfrecord
|__training
|  |__*.tfrecord
|__validation
   |__*.tfrecord
Add the following key-value pair to YOUR_HOME/.aloception/alodataset_config.json:
"waymo" : "YOUR_PATH/waymo"
From the aloception root, run:
python alodataset/prepare/waymo_prepare.py

This script converts the TFRecord files into a new directory YOUR_PATH/waymo_prepared and replaces the path in YOUR_HOME/.aloception/alodataset_config.json with this new one. The conversion can take hours to complete depending on the system hardware. If it is stopped or killed in the middle of the preparation, it can always be resumed by executing the script again.
The new prepared directory will be as follows:

YOUR_PATH/waymo_prepared/
|__testing
|__training
|__validation

In each subdirectory, we have:

|__calib.pkl
|__camera_label.pkl
|__lidar_label.pkl
|__pose.pkl
|__image0
|  |__*.jpeg
|__image1
|  |__*.jpeg
|__image2
|  |__*.jpeg
|__image3
|  |__*.jpeg
|__image4
|  |__*.jpeg
|__velodyne
- calib.pkl is a pickle file containing the camera calibrations.
- camera_label.pkl is a pickle file containing the 2D labels: boxes, classes, track id, camera id.
- lidar_label.pkl is a pickle file containing the 3D labels: 3D boxes, 3D boxes projected on the image, class, track id, camera id, speed, acceleration.
- pose.pkl is a pickle file containing the vehicle pose.
- imageX folders contain the images from camera X in JPEG format.
- velodyne directory should be empty.
Basic usage¶
from alodataset import Split, WaymoDataset
from aloscene import Frame
waymo_dataset = WaymoDataset(
   # Split.VAL for data in `validation` directory,
   # use Split.TRAIN/Split.TEST for `training`/`testing` directory
   split=Split.VAL,
   labels=["gt_boxes_2d", "gt_boxes_3d"],
   sequence_size=3)
# prepare waymo dataset from tfrecord file
# if the dataset is already prepared, it simply checks the prepared dataset
# this line is optional if the dataset is fully prepared
waymo_dataset.prepare()
for frames in waymo_dataset.train_loader(batch_size=2):
   # frames is a list (with len = batch size) of dicts
   # dict key is the camera name
   # dict value is a Frame of shape (t, c, h, w) with t = sequence_size
   # convert a list of front camera's frames into a batch
   front_frames = Frame.batch_list([frame["front"] for frame in frames])
   print(front_frames.shape) # (b, t, c, h, w), in this case b=batch=2, t=sequence_size=3
   # access to labels
   print(front_frames.boxes2d)
   print(front_frames.boxes3d)
   print(front_frames.cam_intrinsic)
   print(front_frames.cam_extrinsic)
API¶
- class alodataset.waymo_dataset.WaymoDataset(segments=None, cameras=None, random_step=None, labels=[], **kwargs)¶
- Bases: Generic[torch.utils.data.dataset.T_co]
- CAMERAS = ['front', 'front_left', 'front_right', 'side_left', 'side_right']¶
 - CLASSES = ['UNKNOWN', 'VEHICLE', 'PEDESTRIAN', 'SIGN', 'CYCLIST']¶
 - LABELS = ['gt_boxes_2d', 'gt_boxes_3d', 'camera_parameters']¶
 - SPLIT_FOLDERS = {Split.VAL: 'validation', Split.TRAIN: 'training', Split.TEST: 'testing'}¶
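The SPLIT_FOLDERS mapping above resolves a Split value to its directory name on disk. A minimal sketch, using a stand-in Enum in place of the real alodataset.Split:

```python
from enum import Enum

class Split(Enum):
    """Stand-in for alodataset.Split, for illustration only."""
    VAL = "val"
    TRAIN = "train"
    TEST = "test"

# Same folder mapping as WaymoDataset.SPLIT_FOLDERS
SPLIT_FOLDERS = {Split.VAL: "validation", Split.TRAIN: "training", Split.TEST: "testing"}

print(SPLIT_FOLDERS[Split.TRAIN])  # training
```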
 - __init__(segments=None, cameras=None, random_step=None, labels=[], **kwargs)¶
- WaymoDataset
- Parameters
- segments: list or None
- List of Waymo segments to load. If None, all segments are loaded.
- cameras: list or None
- List of cameras to use. If None, data from all cameras is loaded. The list can contain any of ["front", "front_left", "front_right", "side_left", "side_right", "all"]. If "all" is selected, data from all cameras is merged regardless of the source camera.
- random_step: int
- None by default. Otherwise, frame t+1 is sampled randomly in each sequence.
- labels: list of strings
- The list can contain any of ["gt_boxes_2d", "gt_boxes_3d", "camera_parameters"]
 
 
 - get_frame_boxes2d(frame, camera, segment, sequence_id, idstring2int)¶
- Parse the preloaded dict and return BoundingBoxes2D
- Parameters
- frame: Frame
- camera: str
- Camera name. One of ["front", "front_left", "front_right", "side_left", "side_right"]
- segment: str
- Segment name
- sequence_id: int
- Sequence id in the segment
- idstring2int: dict
- key: str, track id string
- value: int, unique value representing a track id string
- idstring2int can be empty; a mapping from id string to int is then created on the fly.
- Returns
- BoundingBoxes2D
- Shape (n, 4). Each box is associated with a label and a track id
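The on-the-fly idstring2int mapping described above can be sketched as follows (the helper below is illustrative, not the library's code):

```python
def track_id_to_int(track_id, idstring2int):
    """Assign each new track-id string the next unused integer."""
    if track_id not in idstring2int:
        idstring2int[track_id] = len(idstring2int)
    return idstring2int[track_id]

mapping = {}  # can start empty; filled on the fly
print(track_id_to_int("a1b2", mapping))  # 0
print(track_id_to_int("c3d4", mapping))  # 1
print(track_id_to_int("a1b2", mapping))  # 0 (same string, same int id)
```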
 
 - get_frame_boxes3d(frame, camera, segment, sequence_id)¶
- Parse the preloaded dict and return BoundingBoxes3D
- Parameters
- frame: Frame
- camera: str
- Camera name. One of ["front", "front_left", "front_right", "side_left", "side_right"]
- segment: str
- Segment name
- sequence_id: int
- Sequence id in the segment
- Returns
- BoundingBoxes3D
- 3D boxes, shape (n, 7)
- BoundingBoxes2D
- 3D boxes projected on the image, shape (n, 4)
 
 - get_frame_camera_parameters(frame, camera, segment, sequence_id)¶
- Parse the preloaded dict and return a tuple of CameraIntrinsic and CameraExtrinsic
- Parameters
- frame: Frame
- camera: str
- Camera name. One of ["front", "front_left", "front_right", "side_left", "side_right"]
- segment: str
- Segment name
- sequence_id: int
- Sequence id in the segment
- Returns
- Tuple[CameraIntrinsic, CameraExtrinsic]
- CameraIntrinsic: shape (3, 4)
- CameraExtrinsic: shape (4, 4)
 
 - get_frames(camera, segment, sequence)¶
- Get a tensor Frame given a camera, a segment name, and a list of sequence ids
- Parameters
- camera: str
- Camera name. One of ["front", "front_left", "front_right", "side_left", "side_right"]
- segment: str
- Segment name
- sequence: list of int
- List of sequence ids
- Returns
- aloscene.Frame
- Frame containing ground truth 2D/3D boxes with labels and the camera intrinsic/extrinsic matrices, depending on self.labels set in self.__init__
 
 - get_segments()¶
- Read the dataset directory based on self.dataset_dir and self.split. Return a list of segment names found in that directory.
- Returns
- List[str]
- List of segment names
 
 - getitem(idx)¶
- Given an item id, return a dict whose keys are camera names and whose values are Frames with ground truth
- Parameters
- idx: int
- Item id, see self.read_sequences() for more details.
- Returns
- Dict[str, aloscene.Frame]
- key: str, camera name
- value: aloscene.Frame containing 2D/3D labels and the camera intrinsic/extrinsic matrices, depending on self.labels set in self.__init__
 
 - static np_convert_waymo_to_aloception_coordinate_system(np_boxes3d)¶
- Transform boxes3d from Waymo coordinates to aloception coordinates
- Waymo coordinates:
- X forward 
- Y left 
- Z upward 
 
- Aloception coordinates:
- X right 
- Y downward 
- Z forward 
 
- Parameters
- np_boxes3d: np.ndarray
- 3D boxes in Waymo coordinates, shape (n, 7)
- Returns
- np.ndarray
- 3D boxes in aloception coordinates, shape (n, 7)
 
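The axis change above can be sketched for the box centers. This is a hedged example: only the (x, y, z) center columns are remapped, the helper name is ours, and how the real method handles the remaining dimension and heading columns of the (n, 7) array is not specified here.

```python
import numpy as np

def waymo_to_aloception_centers(np_boxes3d):
    """Remap box centers from Waymo axes (X forward, Y left, Z up)
    to aloception axes (X right, Y down, Z forward).

    Sketch only: columns 3..6 (dimensions, heading) are left untouched.
    """
    out = np_boxes3d.copy()
    x, y, z = np_boxes3d[:, 0], np_boxes3d[:, 1], np_boxes3d[:, 2]
    out[:, 0] = -y  # Waymo Y (left)    -> aloception X (right)
    out[:, 1] = -z  # Waymo Z (up)      -> aloception Y (down)
    out[:, 2] = x   # Waymo X (forward) -> aloception Z (forward)
    return out

boxes = np.array([[10.0, 2.0, 1.0, 4.5, 1.8, 1.6, 0.3]])
print(waymo_to_aloception_centers(boxes)[0, :3])  # [-2. -1. 10.]
```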
 
 - prepare(num_processes=2)¶
- Prepare the Waymo Open Dataset from TFRecord files. The preparation can be resumed if it was stopped suddenly.
- The expected tree:

  waymo_data_dir
  |__validation
  |__training
  |__test

- Each subdirectory must contain the TFRecord files extracted from the Waymo Open Dataset tar files.
- This method reads the TFRecord files recursively in self.dataset_dir and writes pickle files and images into the self.dataset_dir + "_prepared" directory. Once the dataset is fully prepared, the path in YOUR_HOME/.aloception/alodataset_config.json is replaced by the new prepared one. If self.dataset_dir already ends with "_prepared", this function does nothing.
- Please install alodataset/prepare/waymo-requirements.txt
- Notes
- If the dataset is already prepared, this method simply checks that all files are prepared and stored in the prepared folder. If the original directory is no longer on disk, the method uses the prepared directory as-is and the preparation step is skipped.
 - read_sequences()¶
- Populate all dictionaries used internally by this class to manage sequences, segments, etc.
- Dictionaries:
- self.items:
  - key: int, item id
  - value: dict of
    - "sequence": tuple of sequence ids
    - "segment": str, segment name
- self.preloaded_labels_2d:
  - key: str, segment name
  - value: dict
    - key: sequence id
    - value: dict of
      - key: str, camera id
      - value: list of labels as dicts with keys "bbox", "track_id", "class", "camera_id"
- self.preloaded_labels_3d:
  - key: str, segment name
  - value: dict
    - key: sequence id
    - value: dict of
      - key: str, camera id
      - value: list of labels as dicts with keys "bbox_proj", "bbox_3d", "track_id", "class", "camera_id", "speed", "accel"
- self.preloaded_calib:
  - key: str, segment name
  - value: dict
    - key: sequence id
    - value: dict of
      - key: "cam_intrinsic", "cam_extrinsic"
      - value: dict of
        - key: int, camera id + 1
        - value: np.ndarray of shape (3, 4) for cam_intrinsic or shape (4, 4) for cam_extrinsic
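To illustrate the nesting, here is a hypothetical snapshot of these dictionaries. The values (segment name, ids, box coordinates) are made up; only the key structure follows the description above.

```python
import numpy as np

# self.items: item id -> sequence/segment lookup
items = {
    0: {"sequence": (0, 1, 2), "segment": "segment-0001"},  # hypothetical names
}

# self.preloaded_labels_2d: segment -> sequence id -> camera id -> label dicts
preloaded_labels_2d = {
    "segment-0001": {
        0: {
            "1": [{"bbox": [0.0, 0.0, 10.0, 10.0], "track_id": "a1b2",
                   "class": 1, "camera_id": 1}],
        },
    },
}

# self.preloaded_calib: segment -> sequence id -> matrices keyed by camera id + 1
preloaded_calib = {
    "segment-0001": {
        0: {
            "cam_intrinsic": {1: np.zeros((3, 4))},
            "cam_extrinsic": {1: np.zeros((4, 4))},
        },
    },
}
```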
- alodataset.waymo_dataset.main()¶
- Main