Waymo Dataset¶
This module reads Waymo Open Dataset images, camera calibrations and 2D/3D labels.
Dataset preparation¶
The Waymo Open Dataset is originally distributed as compressed tar files containing TFRecord files. Once the dataset is downloaded, it must be decompressed and converted from TFRecord format to JPEG and pickle files. This process is called dataset preparation and is handled by the method self.prepare().
This preparation requires the additional packages listed in alodataset/prepare/waymo-requirements.txt.
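They can be installed with pip:

pip install -r alodataset/prepare/waymo-requirements.txt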
Steps to prepare the Waymo Open Dataset:
Download the Waymo Open Dataset and decompress it into a directory called waymo, e.g. YOUR_PATH/waymo. This directory should have the following structure:
YOUR_PATH/waymo/
|__testing
|  |__*.tfrecord
|__training
|  |__*.tfrecord
|__validation
   |__*.tfrecord

Add to YOUR_HOME/.aloception/alodataset_config.json the key-value pair:
"waymo" : "YOUR_PATH/waymo"From the aloception root, run:
python alodataset/prepare/waymo_prepare.py

This script converts the TFRecord files into a new directory, YOUR_PATH/waymo_prepared, and updates the path in YOUR_HOME/.aloception/alodataset_config.json accordingly. The conversion can take hours to complete depending on the system hardware. If it is stopped or killed in the middle of the preparation, it can be resumed by running the script again.
The prepared directory will be structured as follows:
YOUR_PATH/waymo_prepared/
|__testing
|__training
|__validation

In each subdirectory, we have:
|__calib.pkl
|__camera_label.pkl
|__lidar_label.pkl
|__pose.pkl
|__image0
|  |__*.jpeg
|__image1
|  |__*.jpeg
|__image2
|  |__*.jpeg
|__image3
|  |__*.jpeg
|__image4
|  |__*.jpeg
|__velodyne
calib.pkl is a pickle file containing camera calibrations.
camera_label.pkl is a pickle file containing 2D labels: boxes, classes, track ids, camera ids.
lidar_label.pkl is a pickle file containing 3D labels: 3D boxes, 3D boxes projected onto the image, classes, track ids, camera ids, speed, acceleration.
pose.pkl is a pickle file containing vehicle pose.
imageX folders contain images from camera X in JPEG format.
The velodyne directory should be empty.
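Once prepared, the pickle files can be inspected directly with the standard pickle module; a minimal sketch (the nested layout of each dict is detailed under read_sequences() below):

import pickle

# Load the camera calibrations of the validation split
with open("YOUR_PATH/waymo_prepared/validation/calib.pkl", "rb") as f:
    calib = pickle.load(f)
print(type(calib))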
Basic usage¶
from alodataset import Split, WaymoDataset
from aloscene import Frame

waymo_dataset = WaymoDataset(
    # Split.VAL for data in `validation` directory,
    # use Split.TRAIN/Split.TEST for `training`/`testing` directory
    split=Split.VAL,
    labels=["gt_boxes_2d", "gt_boxes_3d"],
    sequence_size=3,
)

# Prepare the Waymo dataset from the TFRecord files.
# If the dataset is already prepared, this simply checks the prepared dataset.
# This line is optional if the dataset is fully prepared.
waymo_dataset.prepare()

for frames in waymo_dataset.train_loader(batch_size=2):
    # `frames` is a list (of len = batch size) of dicts:
    # each key is a camera name, each value a Frame of shape (t, c, h, w)
    # with t = sequence_size.
    # Convert the list of front camera frames into a batch.
    front_frames = Frame.batch_list([frame["front"] for frame in frames])
    print(front_frames.shape)  # (b, t, c, h, w), here b = batch_size = 2, t = sequence_size = 3
    # Access the labels
    print(front_frames.boxes2d)
    print(front_frames.boxes3d)
    print(front_frames.cam_intrinsic)
    print(front_frames.cam_extrinsic)
API¶
- class alodataset.waymo_dataset.WaymoDataset(segments=None, cameras=None, random_step=None, labels=[], **kwargs)¶
Bases: Generic[torch.utils.data.dataset.T_co]
- CAMERAS = ['front', 'front_left', 'front_right', 'side_left', 'side_right']¶
- CLASSES = ['UNKNOWN', 'VEHICLE', 'PEDESTRIAN', 'SIGN', 'CYCLIST']¶
- LABELS = ['gt_boxes_2d', 'gt_boxes_3d', 'camera_parameters']¶
- SPLIT_FOLDERS = {Split.VAL: 'validation', Split.TRAIN: 'training', Split.TEST: 'testing'}¶
- __init__(segments=None, cameras=None, random_step=None, labels=[], **kwargs)¶
WaymoDataset
- Parameters
- segments: list or None
List of Waymo segments to load. If None, all segments are loaded.
- cameras: list or None
List of cameras to use. If None, data from all cameras is loaded. The list can contain any of [“front”, “front_left”, “front_right”, “side_left”, “side_right”, “all”]. If “all” is selected, data from all cameras is merged regardless of the source camera.
- random_step: int
None by default. Otherwise, frame t+1 of each sequence is sampled with a random step.
- labels: list of strings
The list can contain any of [“gt_boxes_2d”, “gt_boxes_3d”, “camera_parameters”]. See the example below.
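For instance, a possible instantiation restricting the dataset to the front-facing cameras (a sketch using only the parameters documented above):

from alodataset import Split, WaymoDataset

# Load only the three front-facing cameras of the training split,
# with 2D boxes and camera parameters as labels.
dataset = WaymoDataset(
    split=Split.TRAIN,
    cameras=["front", "front_left", "front_right"],
    labels=["gt_boxes_2d", "camera_parameters"],
)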
- get_frame_boxes2d(frame, camera, segment, sequence_id, idstring2int)¶
Parse preloaded dict and return BoundingBoxes2D
- Parameters
- frame: Frame
- camera: str
Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]
- segment: str
Segment name
- sequence_id: int
Sequence id in the segment
- idstring2int: dict
key: str, track id string
value: int, unique int representing a track id string
idstring2int can be empty; the mapping from id strings to ints is then created on the fly (see the sketch below).
- Returns
- BoundingBoxes2D
Shape (n, 4). Each box is associated with a label and a track id.
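A sketch of how such a mapping can be built on the fly (track-id strings are hypothetical):

idstring2int = {}
for track_id in ["track_a", "track_a", "track_b"]:  # hypothetical Waymo track ids
    if track_id not in idstring2int:
        idstring2int[track_id] = len(idstring2int)
print(idstring2int)  # {'track_a': 0, 'track_b': 1}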
- get_frame_boxes3d(frame, camera, segment, sequence_id)¶
Parse the preloaded dict and return BoundingBoxes3D together with their image projections as BoundingBoxes2D
- Parameters
- frame: Frame
- camera: str
Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]
- segment: str
Segment name
- sequence_id: int
Sequence id in the segment
- Returns
- BoundingBoxes3D
3D boxes, shape (n, 7)
- BoundingBoxes2D
3D boxes projected onto the image, shape (n, 4)
- get_frame_camera_parameters(frame, camera, segment, sequence_id)¶
Parse the preloaded dict and return a tuple of CameraIntrinsic and CameraExtrinsic
- Parameters
- frame: Frame
- camera: str
Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]
- segment: str
Segment name
- sequence_id: int
Sequence id in the segment
- Returns
- Tuple[CameraIntrinsic, CameraExtrinsic]
CameraIntrinsic: shape (3, 4)
CameraExtrinsic: shape (4, 4)
- Return type: Tuple[CameraIntrinsic, CameraExtrinsic]
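These shapes compose directly for projecting homogeneous 3D points to pixels; a minimal numpy sketch with made-up matrix values:

import numpy as np

K = np.array([[2000.0, 0.0, 960.0, 0.0],   # hypothetical intrinsic, shape (3, 4)
              [0.0, 2000.0, 640.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
T = np.eye(4)                               # identity extrinsic, for illustration

point = np.array([2.0, 0.5, 10.0, 1.0])    # homogeneous 3D point
u, v, w = K @ (T @ point)
print(u / w, v / w)                         # pixel coordinates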
- get_frames(camera, segment, sequence)¶
Get a tensor Frame given a camera, a segment name and a list of sequence ids
- Parameters
- camera: str
Camera name. One of [“front”, “front_left”, “front_right”, “side_left”, “side_right”]
- segment: str
Segment name
- sequence: list of int
List of sequence ids
- Returns
- aloscene.Frame
Frame containing ground-truth 2D/3D boxes with labels and camera intrinsic/extrinsic matrices, depending on self.labels set in self.__init__
- Return type: Frame
- get_segments()¶
Read dataset directory based on self.dataset_dir and self.split. Return a list of segment names found in that directory.
- Returns
- List[str]
List of segment names
- Return type: List[str]
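For example:

segments = waymo_dataset.get_segments()
print(len(segments), segments[:2])  # segment count and the first two names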
- getitem(idx)¶
Given an item id, get a dict whose keys are camera names and whose values are Frames with ground truth
- Parameters
- idx: int
Item id, see self.read_sequences() for more details.
- Returns
- Dict[str, aloscene.Frame]
key: str, camera name
value: aloscene.Frame containing 2D/3D labels and camera intrinsics/extrinsics, based on self.labels set in self.__init__
- Return type: Dict[str, Frame]
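A possible access pattern (assuming the usual dataset indexing protocol):

sample = waymo_dataset[0]   # dict: camera name -> Frame
front = sample["front"]     # Frame of shape (t, c, h, w)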
- static np_convert_waymo_to_aloception_coordinate_system(np_boxes3d)¶
Transform boxes3d from waymo coordinates to aloception coordinates
- Waymo coordinates:
X forward
Y left
Z upward
- Aloception coordinates:
X right
Y downward
Z forward
- Parameters
- np_boxes3d: np.ndarray
3D boxes in Waymo coordinates, shape (n, 7)
- Returns
- np.ndarray
3D boxes in aloception coordinates, shape (n, 7)
- Return type: np.ndarray
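The axis change amounts to x_alo = -y_waymo, y_alo = -z_waymo, z_alo = x_waymo. A minimal sketch of the center-point part of the transform (the (n, 7) column layout and the heading handling are assumptions, not the library's exact code):

import numpy as np

def waymo_centers_to_aloception(np_boxes3d):
    # Assumes columns 0..2 hold the box center (x, y, z); the remaining
    # columns (dimensions, heading) would also need to be remapped.
    out = np_boxes3d.copy()
    out[:, 0] = -np_boxes3d[:, 1]  # Waymo left (Y) -> aloception X (right)
    out[:, 1] = -np_boxes3d[:, 2]  # Waymo up (Z) -> aloception Y (down)
    out[:, 2] = np_boxes3d[:, 0]   # Waymo forward (X) -> aloception Z (forward)
    return out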
- prepare(num_processes=2)¶
Prepare the Waymo Open Dataset from TFRecord files. The preparation can be resumed if it was previously interrupted.
The expected tree:
waymo_data_dir
|__validation
|__training
|__testing
Each subdirectory must contain the TFRecord files extracted from the Waymo Open Dataset tar files.
Reads TFRecord files recursively in self.dataset_dir and prepares pickle files and images in the self.dataset_dir + “_prepared” directory. Once the dataset is fully prepared, the path in YOUR_HOME/.aloception/alodataset_config.json is replaced by the new prepared one. If self.dataset_dir already ends with “_prepared”, this function does nothing.
Please install the packages listed in alodataset/prepare/waymo-requirements.txt.
Notes
If the dataset is already prepared, this method simply checks that all files are prepared and stored in the prepared folder. If the original directory is no longer on disk, the method uses the prepared directory as is and the preparation step is skipped.
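For example, to run the preparation with more worker processes:

from alodataset import Split, WaymoDataset

dataset = WaymoDataset(split=Split.TRAIN)
# Safe to re-run: it resumes an interrupted conversion or simply
# verifies an already prepared dataset.
dataset.prepare(num_processes=4)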
- read_sequences()¶
Populate all dictionaries used internally by this class to manage sequences, segments, etc.
- Dictionaries:
  - self.items:
    - key: int, item id
    - value: dict of
      - “sequence”: tuple of sequence ids
      - “segment”: str, segment name
  - self.preloaded_labels_2d:
    - key: str, segment name
    - value: dict
      - key: sequence id
      - value: dict of
        - key: str, camera id
        - value: list of labels as dicts with keys “bbox”, “track_id”, “class”, “camera_id”
  - self.preloaded_labels_3d:
    - key: str, segment name
    - value: dict
      - key: sequence id
      - value: dict of
        - key: str, camera id
        - value: list of labels as dicts with keys “bbox_proj”, “bbox_3d”, “track_id”, “class”, “camera_id”, “speed”, “accel”
  - self.preloaded_calib:
    - key: str, segment name
    - value: dict
      - key: sequence id
      - value: dict of
        - key: “cam_intrinsic” or “cam_extrinsic”
        - value: dict of
          - key: int, camera id + 1
          - value: np.ndarray of shape (3, 4) for cam_intrinsic or (4, 4) for cam_extrinsic
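For orientation, a hypothetical slice of self.preloaded_labels_2d (all keys and values are illustrative):

preloaded_labels_2d = {
    "segment-XXXX": {                  # segment name
        0: {                           # sequence id
            "1": [                     # camera id
                {"bbox": [0.0, 0.0, 100.0, 50.0],
                 "track_id": "track_a",
                 "class": 1,
                 "camera_id": 1},
            ],
        },
    },
}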
- alodataset.waymo_dataset.main()¶
Main