ProcSy: Procedural Synthetic Dataset Generation Towards Influence Factor Studies Of Semantic Segmentation Networks

Real-world, large-scale semantic segmentation datasets are expensive and time-consuming to create. Thus, the research community has explored the use of video game worlds and simulator environments to produce large-scale synthetic datasets, mainly to supplement the real-world ones for training deep neural networks. Another use of synthetic datasets is to enable highly controlled and repeatable experiments, thanks to the ability to manipulate the content and rendering of the synthesized imagery. To this end, we outline a method to generate an arbitrarily large semantic segmentation dataset reflecting real-world features, while minimizing the required cost and man-hours. We demonstrate its use by generating ProcSy (pronounced "proxy"), a synthetic dataset for semantic segmentation, which is modeled on a real-world urban environment and features a range of variable influence factors, such as weather and lighting. Our experiments investigate the impact of these factors on the performance of a state-of-the-art deep network. Among other findings, we show that including as little as 3% rainy images in the training set improved the mIoU of the network on rainy images by about 10%, while training with more than 15% rainy images has diminishing returns. We provide the ProcSy dataset, along with generated 3D assets and code, as supplementary material.
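For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how mean intersection-over-union is commonly computed from flattened label arrays. It is not the evaluation code used in the paper; the function names, the `ignore_id` parameter, and the choice to skip classes that appear in neither ground truth nor prediction are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def confusion_matrix(gt: Sequence[int], pred: Sequence[int],
                     num_classes: int,
                     ignore_id: Optional[int] = None) -> List[List[int]]:
    """Count pixels per (ground-truth class, predicted class) pair.

    Rows index the ground-truth class, columns the predicted class.
    Pixels whose ground-truth label equals ignore_id are skipped
    (hypothetical convention, not taken from the ProcSy paper).
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if ignore_id is not None and g == ignore_id:
            continue
        cm[g][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # class c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both ground truth and prediction
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(f"{mean_iou(confusion_matrix(gt, pred, 3)):.3f}")  # 0.500
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, which average to 0.5.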

ProcSy dataset sample frame
Example effects of weather/lighting variation on semantic segmenter performance

Paper

The full paper is available on CVF Open Access. Please follow this link.

Dataset

Type                                                   Link(s)
Base RGB Images (26.5 GB)                              Part 1   Part 2   Part 3   Part 4   Part 5
                                                       Part 6   Part 7   Part 8   Part 9   Part 10
                                                       Part 11  Part 12  Part 13  Part 14  Part 15
                                                       Part 16  Part 17  Part 18  Part 19  Part 20
                                                       Part 21  Part 22  Part 23  Part 24  Part 25
                                                       Part 26  Part 27
GT_ID Images (499 MB)                                  Part 1
Depth Images (8.5 GB)                                  Part 1   Part 2   Part 3   Part 4   Part 5
                                                       Part 6   Part 7   Part 8   Part 9
Vehicle Occlusion Maps (82 MB)                         Part 1
Weather and Lighting Variational RGB Images (27.5 GB)  Part 1   Part 2   Part 3   Part 4   Part 5
                                                       Part 6   Part 7   Part 8   Part 9   Part 10
                                                       Part 11  Part 12  Part 13  Part 14  Part 15
                                                       Part 16  Part 17  Part 18  Part 19  Part 20
                                                       Part 21  Part 22  Part 23  Part 24  Part 25
                                                       Part 26  Part 27  Part 28
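The abstract reports results for training sets containing a small share of rainy images. One plausible way to assemble such a set from the base and weather-variational images is sketched here. This is an illustration only: the file names are hypothetical, and the assumption that a rainy frame replaces its base counterpart (rather than being added alongside it) is ours, not a description of the paper's protocol.

```python
import random
from typing import List, Sequence


def mix_training_set(base_frames: Sequence[str],
                     rainy_frames: Sequence[str],
                     rainy_fraction: float,
                     seed: int = 0) -> List[str]:
    """Swap a random share of base frames for their rainy variants.

    base_frames[i] and rainy_frames[i] are assumed to show the same scene.
    The result has the same length as base_frames, with
    round(len(base_frames) * rainy_fraction) entries taken from rainy_frames.
    """
    if not 0.0 <= rainy_fraction <= 1.0:
        raise ValueError("rainy_fraction must lie in [0, 1]")
    if len(base_frames) != len(rainy_frames):
        raise ValueError("base and rainy lists must be aligned frame by frame")

    n = len(base_frames)
    k = round(n * rainy_fraction)
    rng = random.Random(seed)  # fixed seed keeps the split repeatable
    rainy_idx = set(rng.sample(range(n), k))
    return [rainy_frames[i] if i in rainy_idx else base_frames[i]
            for i in range(n)]


# Hypothetical file names; the real archive layout may differ.
base = [f"base/frame_{i:04d}.png" for i in range(100)]
rainy = [f"rain/frame_{i:04d}.png" for i in range(100)]
train = mix_training_set(base, rainy, rainy_fraction=0.03)
print(sum(path.startswith("rain/") for path in train))  # 3
```

Fixing the seed makes the selection repeatable, in keeping with the controlled-experiment motivation stated in the abstract.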

License agreement

This dataset is made freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use the data given that you agree:

  1. That the dataset comes “AS IS”, without express or implied warranty. Although every effort has been made to ensure accuracy, we (Waterloo Intelligent Systems Engineering Lab, University of Waterloo, Canada) do not accept any responsibility for errors or omissions.
  2. That you include a reference to the ProcSy Dataset in any work that makes use of the dataset. For research papers, cite our preferred publication; for other media, cite the ProcSy website.
  3. That you do not distribute this dataset or modified versions. It is permissible to distribute derivative works insofar as they are abstract representations of this dataset (such as models trained on it or additional annotations that do not directly include any of our data) and do not allow the dataset, or something similar in character, to be recovered.
  4. That you may not use the dataset or any derivative work for commercial purposes as, for example, licensing or selling the data, or using the data with a purpose to procure a commercial gain.
  5. That all rights not expressly granted to you are reserved by us (Waterloo Intelligent Systems Engineering Lab, University of Waterloo, Canada).

UE4 Project (modified CARLA 0.9.2; procedural assets)

(24.5 GB)
Part 1      Part 2      Part 3      Part 4      Part 5      Part 6      Part 7      Part 8      Part 9      Part 10
Part 11    Part 12    Part 13    Part 14    Part 15    Part 16    Part 17    Part 18    Part 19    Part 20
Part 21    Part 22    Part 23    Part 24    Part 25
List of non-Content modifications to CARLA 0.9.2

TODO

  • create git repository for project scripts
  • document project scripts and modifications to CARLA files
  • study effects of combination of influence factors
  • understand correlation of weather/lighting variations with real-world data