"Speed/accuracy trade-offs for modern convolutional object detectors" by Huang et al (2017). The aim of this paper is to evaluate modern convolutional object detection systems where the metrics is speed and accuracy. Separation between "meta-architectures" (Faster R-CNN, R-FCN and SSD) and base feature extractors (VGG, Residual Networks, Inception) are made to evaluate their different combinations. The paper uses three meta-architectures namely Single Shot Detector (SSD), Faster R-CNN and R-FCN. See the following posts to get familiar with these object detection algorithms. For these meta-architectures six feature extractors are considered and they are VGG-16. Resnet-101, Inception v2, Inception v3, Inception Resnet and MobileNet. Other architecture configurations such as the number of proposals and output stride settings for Resnet and Inception Resnet are also set up in a reasonable way (see the detail in the paper :P). Furthermore the loss function configura