The Influence of Model Architecture on Multi-GPU Training of Adversarial Networks

Garimella, Sarvesh

Graphics processing units (GPUs) provide significant computational power for artificial intelligence (AI) applications that use convolutional neural networks (CNNs). The TensorFlow Python API provides flexibility for testing various architectures across a variety of platforms, including GPUs. Despite the advantages of training CNNs on GPUs, training models on large multidimensional environmental datasets is often still prohibitively costly. This study explores how model architecture influences the training cost of generative adversarial networks (GANs). Specifically, it examines different choices for distributing instructions across multiple GPUs, generating multiple examples from a single training example, and applying synchronous versus asynchronous updates to model loss terms.
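The difference between synchronous and asynchronous updates can be illustrated with a toy sketch. This is not the study's implementation: it uses plain Python, not TensorFlow, each simulated "replica" stands in for one GPU, and the one-dimensional quadratic loss, the function names, and the learning rate are all assumptions chosen for illustration.

```python
# Toy comparison of synchronous vs. asynchronous gradient updates.
# Hypothetical setup: each replica (a stand-in for one GPU) holds one target
# value and minimizes 0.5 * (w - target)**2. This is not a GAN loss.

def grad(w, target):
    """Gradient of 0.5 * (w - target)**2 with respect to w."""
    return w - target

def synchronous_step(w, targets, lr):
    """All replicas read the same weight; their gradients are averaged
    and applied as a single update."""
    grads = [grad(w, t) for t in targets]
    return w - lr * sum(grads) / len(grads)

def asynchronous_step(w, targets, lr):
    """All replicas read the same starting weight, but each applies its own
    gradient when it finishes, so later updates use a stale weight."""
    stale_w = w
    for t in targets:
        w = w - lr * grad(stale_w, t)
    return w

targets = [1.0, 2.0, 3.0, 4.0]  # one target per simulated replica
w_sync = w_async = 0.0
for _ in range(50):
    w_sync = synchronous_step(w_sync, targets, lr=0.1)
    w_async = asynchronous_step(w_async, targets, lr=0.1)

print(w_sync, w_async)
```

In this toy problem both schemes approach the mean of the targets (2.5). The asynchronous variant applies one update per replica, so its effective step size is larger and it relies on stale gradients; with a larger learning rate that can destabilize training, which is one reason the choice affects training cost.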

J4.1 The Influence of Model Architecture on Multi-GPU Training of Adversarial Networks