City digital twins help train deep learning models to separate building facades

Researchers from Osaka University find that images of city digital twins, created using 3D models and game engines, can be combined with images of the real city to easily create deep learning training data for segmenting most modern architecture

Sep 6, 2022 | Engineering

Game engines were originally developed to build imaginary worlds for entertainment. However, these same engines can also be used to build copies of real environments, that is, digital twins. Researchers from Osaka University have found a way to use images automatically generated from a digital city twin to train deep learning models that can efficiently analyze images of real cities and accurately separate the buildings that appear in them.

A convolutional neural network is a deep learning architecture designed to process structured arrays of data such as images. Advances of this kind have fundamentally changed how tasks such as architectural segmentation are performed. However, an accurate deep convolutional neural network (DCNN) model needs a large volume of labeled training data, and labeling these data by hand can be a slow and extremely expensive undertaking.
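To make that labeling effort concrete: for instance segmentation, an annotator must trace a polygon mask around every building in every training image. The entry below is a hypothetical COCO-style annotation for a single facade, shown only to illustrate the per-instance work involved; the field values are invented and are not the dataset format reported in the paper.

    # Hypothetical COCO-style annotation for ONE building facade instance.
    # A human annotator must trace a polygon like this for every facade in
    # every training image, which is what makes manual labeling so costly.
    facade_annotation = {
        "image_id": 1024,                  # which street-view photo this belongs to
        "category_id": 1,                  # 1 = "building facade" (assumed label map)
        "segmentation": [[312.0, 140.5,    # polygon vertices (x1, y1, x2, y2, ...)
                          605.5, 138.0,
                          607.0, 455.0,
                          310.0, 452.5]],
        "bbox": [310.0, 138.0, 297.0, 317.0],  # [x, y, width, height]
        "iscrowd": 0,
    }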

To create the synthetic digital city twin data, the investigators used a 3D city model from the PLATEAU platform, which contains 3D models of most Japanese cities at an extremely high level of detail. They loaded this model into the Unity game engine, mounted a camera rig on a virtual car, and drove it around the city to acquire virtual images under various lighting and weather conditions. The Google Maps API was then used to obtain real street-level images of the same study area for the experiments.
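The release names the Google Maps API but not the specific endpoint or collection scripts; as a rough illustration, a real street-level image for a given viewpoint could be fetched with the Street View Static API along the following lines (the API key, coordinates, and filename below are placeholders, not the authors' setup).

    import requests

    # Minimal sketch of fetching one real street-level image via the
    # Google Street View Static API (key, location, and heading are placeholders).
    API_KEY = "YOUR_API_KEY"
    params = {
        "size": "640x640",                 # image resolution in pixels
        "location": "35.6283,139.7782",    # latitude,longitude of the viewpoint
        "heading": 90,                     # camera direction in degrees
        "fov": 90,                         # horizontal field of view
        "key": API_KEY,
    }
    resp = requests.get("https://maps.googleapis.com/maps/api/streetview", params=params)
    with open("streetview_35.6283_139.7782.jpg", "wb") as f:
        f.write(resp.content)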

The researchers found that the digital city twin data leads to better results than purely virtual data with no real-world counterpart. Furthermore, adding synthetic data to a real dataset improves segmentation accuracy. However, most importantly, the investigators found that when a certain fraction of real data is included in the digital city twin synthetic dataset, the segmentation accuracy of the DCNN is boosted significantly. In fact, its performance becomes competitive with that of a DCNN trained on 100% real data. “These results reveal that our proposed synthetic dataset could potentially replace all the real images in the training set,” says Tomohiro Fukuda, the corresponding author of the paper.
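The release does not specify how the real and synthetic images were interleaved during training; the hypothetical helper below simply illustrates the idea of assembling a training set in which a chosen fraction of the images is real and the remainder comes from the digital twin.

    import random

    def mix_training_set(real_images, synthetic_images, real_fraction, total_size, seed=0):
        """Hypothetical helper: build a training set with a given fraction of real images."""
        rng = random.Random(seed)
        n_real = int(round(total_size * real_fraction))
        n_synth = total_size - n_real
        mixed = rng.sample(real_images, n_real) + rng.sample(synthetic_images, n_synth)
        rng.shuffle(mixed)
        return mixed

    # e.g. a 2,000-image training set in which 25% of the images are real photographs:
    # train_set = mix_training_set(real_paths, synthetic_paths, real_fraction=0.25, total_size=2000)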

Automatically separating out the individual building facades that appear in an image is useful for construction management and architectural design, large-scale measurements for retrofits and energy analysis, and even visualizing building facades that have since been demolished. The system was tested on multiple cities, demonstrating the proposed framework’s transferability. The hybrid dataset of real and synthetic data yields promising prediction results for most modern architectural styles, making this an attractive approach for training DCNNs for architectural segmentation tasks in the future, without the need for costly manual data annotation.
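The segmentation model referenced in the figures below is Mask R-CNN. The exact training configuration is not described in this release, so the sketch below only shows how such a model can be loaded and run on a street-view image using torchvision's off-the-shelf implementation; the COCO-pretrained weights and the image path are placeholders, not the authors' trained facade model.

    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor
    from PIL import Image

    # Sketch only: torchvision's COCO-pretrained Mask R-CNN, not the authors' facade model.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()

    image = to_tensor(Image.open("street_view_example.jpg").convert("RGB"))
    with torch.no_grad():
        prediction = model([image])[0]     # dict with "boxes", "labels", "scores", "masks"

    # Keep confident detections; masks are per-instance probability maps (1 x H x W).
    keep = prediction["scores"] > 0.7
    masks = prediction["masks"][keep] > 0.5
    print(f"Detected {int(keep.sum())} instances")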

Fig. 1

Comparison of manually annotated datasets and automatically generated synthetic datasets. The conventional method requires images to be labeled by hand when the training set is produced, whereas our proposed system can automatically create synthetic data with instance annotations using digital assets from a city digital twin.

Credit: 2022 Jiaxin Zhang et al., Automatic generation of synthetic datasets from a city digital twin for use in the instance segmentation of building facades, Journal of Computational Design and Engineering

Fig. 2

Three-dimensional city model of our study area. (a) Example of a city digital twin with its real-world street-view counterpart (Wangan-doro Avenue, Tokyo; March 2021; latitude: 35.6283, longitude: 139.7782). (b) Aerial view of the city digital twin.

Credit: 2022 Jiaxin Zhang et al., Automatic generation of synthetic datasets from a city digital twin for use in the instance segmentation of building facades, Journal of Computational Design and Engineering

Fig. 3

Qualitative results for different types and sizes of buildings when Mask R-CNN is trained using HSRBFIA (Hybrid Collection of Synthetic and Real-world Building Facade Images and Annotations) datasets with different ratios of synthetic to real data: (a) low-rise houses in Osaka; (b) low-rise houses in Los Angeles; (c) high-rise houses in New York City; (d) complex facades in Shanghai. (The red dashed rectangles highlight parts of the street-view images that were prone to failure during facade instance segmentation.)

Credit: 2022 Jiaxin Zhang et al., Automatic generation of synthetic datasets from a city digital twin for use in the instance segmentation of building facades, Journal of Computational Design and Engineering

The article, “Automatic generation of synthetic datasets from a city digital twin for use in the instance segmentation of building facades,” was published in the Journal of Computational Design and Engineering at DOI: https://doi.org/10.1093/jcde/qwac086.


SDGs

  • 04 Quality Education
  • 08 Decent Work and Economic Growth
  • 09 Industry, Innovation and Infrastructure
  • 11 Sustainable Cities and Communities
  • 15 Life on Land