Setting up and Running StyleGAN2
A short tutorial on setting up StyleGAN2 including troubleshooting.
29 July 2020
Introduction
At Celantur, we use deep learning to anonymise objects in images and videos for data protection. We often share insights from our work in this blog, like how to Dockerise CUDA or how to do Panoptic Segmentation in Detectron2.
In this blog post, we want to guide you through setting up StyleGAN2[1] from NVIDIA Research, a synthetic image generator.
[1] Karras, T. et al. (2020). Analyzing and Improving the Image Quality of StyleGAN. arXiv:1912.04958
Prerequisites
- We tested this tutorial on Ubuntu 18.04, but it should also work on other systems.
- You need a CUDA-enabled graphics card with at least 16 GB of GPU memory, e.g. NVIDIA Tesla V100.
- StyleGAN2 requires older versions of CUDA (v10.0) and TensorFlow (v1.14 to v1.15) to run.
Setting up CUDA Toolkit 10.0
On Ubuntu 18.04, install CUDA 10.0 with the following script (from NVIDIA Developer):
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
sudo apt-get install cuda-10-0
The latest NVIDIA driver nvidia-driver-450 is a transitive dependency of the package cuda-10-0 and will be installed automatically.
You can set up CUDA 10.0 in parallel with newer CUDA versions, which are installed in /usr/local/cuda-xx.x/.
⚠️ NOTE: /usr/local/cuda links to the latest installed version.
So if you want to use StyleGAN2 in parallel with a different framework that requires CUDA 10.2+, e.g. Detectron2, be careful to set the environment variable CUDA_HOME correctly.
You can find setup instructions for other systems on the NVIDIA Developer website.
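The note about CUDA_HOME can be made concrete with a small Python sketch. It builds an environment that points at one specific toolkit instead of whatever /usr/local/cuda currently links to. The helper name cuda_env and the directory pattern /usr/local/cuda-<version> are assumptions for illustration; check the actual directory names on your machine. Some tools read CUDA_HOME while others just look up nvcc on PATH, so the sketch sets both.

```python
import os


def cuda_env(version, base_env=None, prefix="/usr/local"):
    """Return a copy of the environment that points at one specific CUDA toolkit.

    Assumes toolkits live in <prefix>/cuda-<version>, e.g. /usr/local/cuda-10.0.
    """
    env = dict(os.environ if base_env is None else base_env)
    cuda_home = os.path.join(prefix, "cuda-" + version)
    env["CUDA_HOME"] = cuda_home
    # Put this toolkit's nvcc in front of any other CUDA version on PATH.
    env["PATH"] = os.path.join(cuda_home, "bin") + os.pathsep + env.get("PATH", "")
    return env


if __name__ == "__main__":
    env = cuda_env("10.0")
    print("CUDA_HOME =", env["CUDA_HOME"])
    # Pass env=env to subprocess.run(...) to launch a tool against CUDA 10.0.
```

Passing the returned dictionary as env= to subprocess.run keeps the change local to one process, so other frameworks on the same machine keep using their own CUDA version.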
Install TensorFlow 1.15
You need an older version of TensorFlow (v1.15) and Python (v3.6) to run StyleGAN2. We highly recommend using a package management system like conda, so that you can run different Python and TensorFlow versions on the same OS.
Once conda is installed, you can set up a new Python 3.6 environment named "stylegan2" with
conda create -n stylegan2 python==3.6.9
# and activate it
conda activate stylegan2
Install GPU-capable TensorFlow and StyleGAN's dependencies:
pip install scipy==1.3.3 requests==2.22.0 Pillow==6.2.1
pip install tensorflow-gpu==1.15.3
⚠️ IMPORTANT: If you install the CPU-only TensorFlow (without -gpu), StyleGAN2 will not find your GPU, even if the CUDA toolkit and GPU driver are properly installed.
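To confirm that the installed TensorFlow is a supported version and can see the GPU, a quick check like the following may help. This is a minimal sketch: the helper names tf_version_ok and check_tensorflow are ours, and the accepted version range is taken from the prerequisites above. It relies on tf.test.is_built_with_cuda() and tf.test.is_gpu_available(), which are available in TensorFlow 1.x.

```python
def tf_version_ok(version):
    """True for the TensorFlow versions named in this tutorial (1.14 and 1.15)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) in {(1, 14), (1, 15)}


def check_tensorflow():
    """Collect a few facts about the TensorFlow installation in the active environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": tf.__version__,
        "version_ok": tf_version_ok(tf.__version__),
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpu_available": tf.test.is_gpu_available(),
    }


if __name__ == "__main__":
    for key, value in check_tensorflow().items():
        print(key, "=", value)
```

If built_with_cuda is False, the CPU-only package is probably installed. If it is True but gpu_available is False, the driver or CUDA setup is the more likely culprit.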
Set up StyleGAN2
Download StyleGAN2 from GitHub:
git clone https://github.com/NVlabs/stylegan2.git
NVCC
Test that NVCC, which is required for compiling TensorFlow ops, runs properly. NVCC comes with your CUDA installation, so don't install any extra packages!
It resides in /usr/local/cuda/bin and it is best to add this directory to your PATH in .bashrc:
echo 'export PATH=/usr/local/cuda/bin:$PATH' >>~/.bashrc
Restart the Bash session and run the following in the folder stylegan2:
nvcc test_nvcc.cu -o test_nvcc -run
# CORRECT OUTPUT:
# CPU says hello.
# GPU says hello.
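If the test does not print both lines, it can help to see which nvcc binary the shell actually picks up. The sketch below is only an illustration, and the helper name find_nvcc is our own. It looks up nvcc on PATH, follows symbolic links to the real toolkit directory and prints the compiler version.

```python
import os
import shutil
import subprocess


def find_nvcc(search_path=None):
    """Return the resolved location of nvcc, or None if it is not on the search path."""
    nvcc = shutil.which("nvcc", path=search_path)
    if nvcc is None:
        return None
    # Follow symlinks such as /usr/local/cuda -> the versioned toolkit directory.
    return os.path.realpath(nvcc)


if __name__ == "__main__":
    nvcc = find_nvcc()
    if nvcc is None:
        print("nvcc not found; is /usr/local/cuda/bin on your PATH?")
    else:
        print("nvcc resolves to", nvcc)
        subprocess.run([nvcc, "--version"], check=False)
```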
Image Synthesis
Use pre-trained networks to generate some synthetic faces:
python run_generator.py generate-images --network=gdrive:networks/stylegan2-ffhq-config-f.pkl \
--seeds=6600-6625 --truncation-psi=0.5
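The two options in this command can be illustrated with a few lines of Python. This is our reading of how run_generator.py interprets them, not a copy of its code, and the function names are hypothetical. A seed range such as 6600-6625 is inclusive and yields one image per seed. The truncation value psi pulls each latent vector toward the average latent, trading variety for more typical-looking faces.

```python
import re


def parse_seeds(spec):
    """Expand 'a-b' (inclusive) or 'a,b,c' into a list of integer seeds."""
    match = re.fullmatch(r"(\d+)-(\d+)", spec)
    if match:
        first, last = int(match.group(1)), int(match.group(2))
        return list(range(first, last + 1))
    return [int(part) for part in spec.split(",")]


def truncate(w, w_avg, psi):
    """Truncation trick: move a latent w toward the average latent w_avg.

    psi=1 leaves w unchanged, psi=0 collapses everything onto w_avg.
    Works for plain floats and for NumPy arrays alike.
    """
    return w_avg + psi * (w - w_avg)


if __name__ == "__main__":
    seeds = parse_seeds("6600-6625")
    print(len(seeds), "images will be generated")
```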
Transform existing images
If you want to transform existing images, you need to prepare them beforehand.
In this tutorial, we use a pre-trained network for portrait photos that expects 1024x1024 images (in general, the resolution must be a power of 2), so first convert the portraits you want to manipulate into that format.
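One way to do this conversion in Python is sketched below, assuming Pillow (installed earlier as a StyleGAN2 dependency). The function names and the center-crop strategy are our own choices for illustration. The sketch crops the largest centered square, resizes it to the target resolution and converts the result to RGB, which also discards any alpha channel.

```python
def center_square_box(width, height):
    """Return the (left, top, right, bottom) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def prepare_portrait(src_path, dst_path, size=1024):
    """Write a size x size RGB copy of src_path to dst_path."""
    from PIL import Image  # imported here so the helper above has no dependencies

    if size <= 0 or (size & (size - 1)) != 0:
        raise ValueError("size must be a power of 2")
    with Image.open(src_path) as img:
        img = img.convert("RGB")  # drops an alpha channel if present
        img = img.crop(center_square_box(*img.size))
        img = img.resize((size, size), Image.LANCZOS)
        img.save(dst_path)


if __name__ == "__main__":
    # Hypothetical file names; adjust them to your own data.
    prepare_portrait("input.jpeg", "datasets/images/input-resized.png")
```

A center crop works well when the face sits in the middle of the photo. For off-center portraits you would need to choose the crop box yourself.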
- You can use the ImageMagick tool convert (note that -resize keeps the aspect ratio, so crop non-square photos to a square first):
convert input.jpeg -resize 1024x1024 input-resized.jpeg
- Then generate the TFRecords:
# datasets/images: Source directory of the images in JPEG or PNG.
# datasets/tfrecords: Output directory for the TFRecords.
python dataset_tool.py create_from_images datasets/tfrecords/ datasets/images/
⚠️ IMPORTANT: Images must not contain an alpha channel!
When using the PNG format, make sure that the images do not include transparency, which requires an additional alpha channel. StyleGAN2 accepts only images with one color channel (grayscale) or three channels (RGB).
- Projection to latent space.
Use the TFRecords for the projection to latent space.
# --data-dir: root dir of datasets.
# --dataset: subdirectory where the TFRecords are stored.
python run_projector.py project-real-images --network=gdrive:networks/stylegan2-ffhq-config-f.pkl --dataset=tfrecords --data-dir=datasets --num-images 3
The parameter --num-images is 3 by default. If your dataset contains fewer images, you'll get an OutOfRangeError.
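To avoid a mismatch, the value for --num-images can be derived from the source directory instead of typed by hand. The following is a rough sketch with hypothetical helper names. It assumes the TFRecords were created from exactly the JPEG and PNG files in that directory, and it only builds the command line shown above without running it.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(image_dir):
    """Count the JPEG and PNG files directly inside image_dir."""
    return sum(
        1
        for path in Path(image_dir).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def projector_command(image_dir, network, dataset="tfrecords", data_dir="datasets"):
    """Build the run_projector.py call with --num-images matching the image count."""
    return [
        "python", "run_projector.py", "project-real-images",
        "--network=" + network,
        "--dataset=" + dataset,
        "--data-dir=" + data_dir,
        "--num-images=" + str(count_images(image_dir)),
    ]


if __name__ == "__main__":
    cmd = projector_command("datasets/images",
                            "gdrive:networks/stylegan2-ffhq-config-f.pkl")
    print(" ".join(cmd))
```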
The script saves snapshot images to results during the projection process, so you can see how the projection converges to the original.
You can adapt the hyperparameters, e.g. the number of training steps and the learning rate, in the constructor of the Projector class in projector.py.
If you want to save the representation in the latent space as well, add the following line to run_projector.py:
def project_image(proj, targets, png_prefix, num_snapshots):
    # ... #
    while proj.get_cur_step() < proj.num_steps:
        # ... #
        if proj.get_cur_step() in snapshot_steps:
            # ADD THE LINE BELOW TO PICKLE THE LATENT REPRESENTATION
            misc.save_pkl(proj.get_dlatents(), png_prefix + 'step%04d.pkl' % proj.get_cur_step())
            # ADD THE LINE ABOVE TO PICKLE THE LATENT REPRESENTATION
            misc.save_image_grid(proj.get_images(), png_prefix + 'step%04d.png' % proj.get_cur_step(), drange=[-1,1])
You'll get the representation as a pickled NumPy array, which you can use to modify your original picture.
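As a starting point for working with the saved latents, the sketch below loads such a pickle and blends two latent representations by linear interpolation. The helper names and the file names in the usage part are hypothetical. For the 1024x1024 face network we would expect an array of shape (1, 18, 512), but treat that as an assumption and check the shape of your own files.

```python
import pickle


def load_latent(path):
    """Load a latent representation pickled by the modified run_projector.py.

    Only unpickle files you created yourself; pickle can execute arbitrary code.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def lerp(a, b, t):
    """Linear interpolation: t=0 returns a, t=1 returns b (floats or NumPy arrays)."""
    return a + (b - a) * t


if __name__ == "__main__":
    # Hypothetical file names; use the .pkl files from your own results folder.
    latent_a = load_latent("results/portrait-a-step1000.pkl")
    latent_b = load_latent("results/portrait-b-step1000.pkl")
    print("latent shape:", latent_a.shape)
    halfway = lerp(latent_a, latent_b, 0.5)
```

To turn a modified latent back into an image, it has to be fed to the synthesis part of the generator. The style-mixing example in run_generator.py shows one way to do that.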
Machine Learning at Celantur
If you find a bug in this tutorial or are interested in creating state-of-the-art ML models and deploying them in a high-availability and high-scalability cloud environment, drop us a short message and have a chat with us!
Troubleshooting
- Problem: nvcc does not work properly.
Solution: It depends on the config file nvcc.profile and other executables in /usr/local/cuda-xx.x/bin. Thus a symbolic link to nvcc in ~/.local/bin won't work. Add /usr/local/cuda-xx.x/bin to your PATH.
- Problem: I have installed CUDA Toolkit and the NVIDIA driver. nvcc works, nvidia-smi shows the correct GPU. Why does TensorFlow complain that it cannot find the GPU?
Solution: Did you install the package tensorflow-gpu? tensorflow is the CPU-only version.
- Problem: When I try to generate the TFRecords, it complains, "Input images must be stored as RGB or grayscale".
Solution: Remove the alpha channel and transparency, e.g. convert a PNG to JPEG.
- Problem: OutOfRangeError during projection into latent space.
Solution: Explicitly set the parameter --num-images to the number of images in the dataset.