Training models with Panoptic Segmentation in Detectron2
Tutorial on how to train your own models with panoptic segmentation in Detectron2.
14 May 2020
Introduction
A paper [1] came out in April last year describing a method that combines semantic segmentation (assigning each pixel a class label) and instance segmentation (finding individual objects with their shape and label). Detectron2 has supported panoptic segmentation since last October, and in this tutorial we'll show how easy it is to train your own model with it.
[1] Kirillov, Alexander et al. (2019). Panoptic Segmentation. arXiv:1801.00868v3
Prerequisites
We tested this tutorial on Ubuntu 18.04, but it should also work on other systems. The installations of the NVIDIA driver and required dependencies may deviate from the instructions below.
NVIDIA GPU
You need a CUDA-enabled graphics card with at least 11GB GPU memory, e.g. an NVIDIA GeForce RTX 2080 Ti, because panoptic segmentation is extremely memory-hungry.
NVIDIA Driver
If the NVIDIA driver is not pre-installed, you can install it on Ubuntu with sudo apt install nvidia-driver-XXX (XXX is the version; at the time of writing, the newest one is 440), or download the appropriate NVIDIA driver (for Linux) and execute the binary as sudo.
CUDA
On Ubuntu 18.04, install CUDA 10.2 with the following script (from NVIDIA Developer):
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-2-local-10.2.89-440.33.01/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
You can find setup instructions for other systems on the NVIDIA Developer website.
Install Detectron2
Dependencies
The current version of Detectron2 requires
- Python ≥ 3.6
- PyTorch ≥ 1.4
On Ubuntu, run the following lines in Bash (get pip with sudo apt install python3-pip):
# Install PyTorch and other dependencies
pip install --user torch torchvision tensorboard cython
# Install OpenCV (optional)
sudo apt install python3-opencv
pip install --user opencv-python
# Install fvcore
pip install --user 'git+https://github.com/facebookresearch/fvcore'
# Install pycocotools
pip install --user 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
Download and install Detectron2
In the newest version (0.1.2) of Detectron2, you need to set the environment variable CUDA_HOME to the location of the CUDA library. On Ubuntu, it is under /usr/local/cuda-XX.X/.
export FORCE_CUDA="1"
export CUDA_HOME="/usr/local/cuda-10.2/"
git clone https://github.com/facebookresearch/detectron2
cd detectron2
pip install .
If you still encounter problems, check out the official installation guide.
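To verify the installation, you can run a short sanity check like the one below. This is a minimal sketch: it only confirms that the packages import and that PyTorch sees your GPU.
# Sanity check: confirm that PyTorch and Detectron2 are importable and CUDA is visible
import torch
import detectron2
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Detectron2:", detectron2.__version__)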
Training the model
We base this tutorial on the Detectron2 Beginner's Tutorial and train a balloon detector.
The setup for panoptic segmentation is very similar to instance segmentation. However, as in semantic segmentation, you have to provide Detectron2 with a pixel-wise labelling of the whole image, e.g. an image whose pixel values encode the labels.
# ...
record["height"] = height
record["width"] = width
# Pixel-wise segmentation
record["sem_seg_file_name"] = os.path.join(img_dir, "segmentation", v["filename"])
# ...
You can generate the mask images with the script provided for this demo.
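If you prefer to build such masks yourself, the following sketch shows one way to do it: it rasterises the polygons from the balloon dataset's VIA annotations into label images (0 = background, 1 = balloon) and stores them in the segmentation folder referenced above. The paths and the label encoding are assumptions and may differ from the script in our repo.
# Sketch: generate pixel-wise label images from the balloon VIA annotations.
# Paths and label values (0 = background, 1 = balloon) are assumptions.
import json
import os
from PIL import Image, ImageDraw

img_dir = "balloon/train"
out_dir = os.path.join(img_dir, "segmentation")
os.makedirs(out_dir, exist_ok=True)

with open(os.path.join(img_dir, "via_region_data.json")) as f:
    imgs_anns = json.load(f)

for v in imgs_anns.values():
    width, height = Image.open(os.path.join(img_dir, v["filename"])).size
    mask = Image.new("L", (width, height), 0)  # everything starts as background (0)
    draw = ImageDraw.Draw(mask)
    for region in v["regions"].values():
        shape = region["shape_attributes"]
        polygon = list(zip(shape["all_points_x"], shape["all_points_y"]))
        draw.polygon(polygon, fill=1)  # balloon pixels get label 1
    # The file name matches record["sem_seg_file_name"] above; if your images are JPEGs,
    # consider switching both to PNG so the labels stay lossless.
    mask.save(os.path.join(out_dir, v["filename"]))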
If you want to visualise the dataset with Detectron2's Visualizer, add an empty list of stuff classes. "Things" are well-defined, countable objects, while "stuff" denotes amorphous regions that carry a label different from the background.
# ...
MetadataCatalog.get("balloon_" + d).set(thing_classes=["balloon"], stuff_classes=[])
# ...
Otherwise, Visualizer complains:
AttributeError: Attribute 'stuff_classes' does not exist in the metadata of 'balloon_train'. Available keys are dict_keys(['name', 'thing_classes']).
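With the dataset and metadata registered, the training itself can be set up just like in the Beginner's Tutorial, only with a panoptic model from the model zoo. The sketch below is one possible configuration; the config file name comes from the Detectron2 model zoo, while the hyperparameters and class counts are assumptions you should adapt to your data.
# Sketch: train a Panoptic FPN model on the registered balloon dataset.
# Hyperparameters and class counts are assumptions, not tuned values.
import os
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml"))
cfg.DATASETS.TRAIN = ("balloon_train",)
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 300
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1      # one "thing" class: balloon
cfg.MODEL.SEM_SEG_HEAD.NUM_CLASSES = 2   # labels in the mask images: background (0) and balloon (1)

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()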
Results
Training with the default settings takes a bit more than a minute on an NVIDIA Tesla V100 and requires about 9 GiB of GPU memory (instance segmentation training takes about 6 GiB). The resulting model does not necessarily perform better than plain instance segmentation, which, given the dataset and task (balloon detection), is not surprising.
However, if you want to train a model that can both detect instances and distinguish between different backgrounds, e.g. sky, ocean and sand on a beach, or street, houses and vegetation in a cityscape, then panoptic segmentation may be the right choice for you.
Parallelisation
Panoptic segmentation, like semantic segmentation, is very memory-hungry, and you'll soon hit the limits, e.g. if you increase the batch size (SOLVER.IMS_PER_BATCH) from 2 to 8:
RuntimeError: CUDA out of memory. Tried to allocate x.xx GiB (GPU 0; xx.xx GiB total capacity; xx.xx GiB already allocated; x.xx GiB free; xx.xx GiB reserved in total by PyTorch)
If you have multiple GPUs, you can use the handy launch function provided by Detectron2 (in the module detectron2.engine.launch) to split the training across several GPUs:
launch(
    train,                        # function to be parallelised across multiple GPUs
    4,                            # number of GPUs per machine
    num_machines=1,
    machine_rank=0,
    dist_url="tcp://127.0.0.1:1234",
    args=(cfg,),                  # arguments passed to the function `train`
)
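launch itself does not define train; it simply calls whatever function you pass it on every GPU. A minimal sketch of such a function, assuming the cfg from above and DefaultTrainer (which already handles the distributed setup created by launch), could look like this:
# Sketch of a `train` function to pass to launch; the name and signature are our
# choice, Detectron2 only requires a callable that accepts the given args.
from detectron2.engine import DefaultTrainer

def train(cfg):
    trainer = DefaultTrainer(cfg)  # wraps the model in DistributedDataParallel when launched on multiple GPUs
    trainer.resume_or_load(resume=False)
    return trainer.train()
Keep in mind that SOLVER.IMS_PER_BATCH in Detectron2 is the total batch size across all GPUs, so splitting the training over four GPUs also divides the per-GPU memory load.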
📌 You can also find the scripts from this tutorial in our GitHub repo.