
Dynamically throttleable neural networks

  • Original Paper
  • Machine Vision and Applications

Abstract

Conditional computation for deep neural networks reduces overall computational load and improves model accuracy by running only a subset of the network. In this work, we present a runtime dynamically throttleable neural network (DTNN) that can self-regulate its own performance target and computing resources by dynamically activating neurons in response to a single control signal, called utilization. We describe a generic formulation of throttleable neural networks (TNNs) that groups and gates partial neural modules under various gating strategies. To directly optimize arbitrary application-level performance metrics and model complexity, a controller network is trained separately to predict a context-aware utilization via deep contextual bandits. Extensive experiments and comparisons on image classification and object detection tasks show that TNNs can be effectively throttled across a wide range of utilization settings while achieving peak accuracy comparable to, and lower cost than, corresponding vanilla architectures such as VGG, ResNet, ResNeXt, and DenseNet. We further demonstrate the effectiveness of the controller network on throttleable 3D convolutional networks (C3D) for video-based hand gesture recognition, where it outperforms the vanilla C3D and all fixed utilization settings.
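To make the gating mechanism concrete, below is a minimal PyTorch sketch of a widthwise-throttleable block under a contiguous (nested) gating order. This is our illustration rather than the authors' implementation; the class name, group count, and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class ThrottleableConv(nn.Module):
    """Widthwise gating sketch: output channels are split into gated groups,
    and a utilization u in [0, 1] decides how many groups actually run."""

    def __init__(self, in_ch, out_ch, n_groups=4):
        super().__init__()
        assert out_ch % n_groups == 0
        self.groups = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch // n_groups, kernel_size=3, padding=1)
            for _ in range(n_groups)
        )

    def forward(self, x, u):
        k = max(1, round(u * len(self.groups)))  # number of active groups
        active = [g(x) for g in self.groups[:k]]
        # Gated-off groups contribute zeros so downstream shapes stay fixed.
        off = [torch.zeros_like(active[0]) for _ in range(len(self.groups) - k)]
        return torch.cat(active + off, dim=1)

block = ThrottleableConv(3, 16)
y = block(torch.randn(1, 3, 32, 32), u=0.5)  # runs 2 of 4 groups
```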


References

  1. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1251–1258 (2017)

  2. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: Mobilenetv2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4510–4520 (2018)

  3. Zhang, X., Zhou, X., Lin, M., Sun, J.: Shufflenet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6848–6856 (2018)

  4. Ma, N., Zhang, X., Zheng, H.T., Sun, J.: Shufflenet v2: practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)

  5. Yang, L., Qi, Z., Liu, Z., Liu, H., Ling, M., Shi, L., et al.: An embedded implementation of CNN-based hand detection and orientation estimation algorithm. Mach. Vis. Appl. 30(6), 1071–1082 (2019)


  6. Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., Xu, C.: GhostNet: more features from cheap operations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1580–1589 (2020)

  7. Liu, C., Zoph, B., Neumann, M., Shlens, J., Hua, W., Li, L.J., et al.: Progressive neural architecture search. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 19–34 (2018)

  8. Liu, H., Simonyan, K., Yang, Y.: DARTS: differentiable architecture search. In: International Conference on Learning Representations (ICLR) (2019). https://openreview.net/forum?id=S1eYHoC5FX

  9. Tan, M., Chen, B., Pang, R., Vasudevan, V., Sandler, M., Howard, A., et al.: Mnasnet: platform-aware neural architecture search for mobile. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2820–2828 (2019)

  10. Cai, H., Zhu, L., Han, S.: ProxylessNAS: direct neural architecture search on target task and hardware. In: International Conference on Learning Representations (ICLR) (2019). https://openreview.net/forum?id=HylVB3AqYm

  11. Cai, H., Gan, C., Wang, T., Zhang, Z., Han, S.: Once-for-all: train one network and specialize it for efficient deployment. In: International Conference on Learning Representations (ICLR) (2020). https://openreview.net/forum?id=HylxE1HKwS

  12. Yu, R., Li, A., Chen, C.F., Lai, J.H., Morariu, V.I., Han, X., et al.: Nisp: pruning networks using neuron importance score propagation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9194–9203 (2018)

  13. Molchanov, P., Mallya, A., Tyree, S., Frosio, I., Kautz, J.: Importance estimation for neural network pruning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11264–11272 (2019)

  14. He, Y., Liu, P., Wang, Z., Hu, Z., Yang, Y.: Filter pruning via geometric median for deep convolutional neural networks acceleration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4340–4349 (2019)

  15. Wu, H., Tang, Y., Zhang, X.: A pruning method based on the measurement of feature extraction ability. Mach. Vis. Appl. 32(1), 1–11 (2021)


  16. He, Y., Lin, J., Liu, Z., Wang, H., Li, L.J., Han, S.: Amc: automl for model compression and acceleration on mobile devices. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 784–800 (2018)

  17. Tung, F., Mori, G.: Similarity-preserving knowledge distillation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 1365–1374 (2019)

  18. Li, T., Wu, B., Yang, Y., Fan, Y., Zhang, Y., Liu, W.: Compressing convolutional neural networks via factorized convolutional filters. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3977–3986 (2019)

  19. Li, Y., Gu, S., Mayer, C., Gool, L.V., Timofte, R.: Group sparsity: The hinge between filter pruning and decomposition for network compression. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8018–8027 (2020)

  20. Hashemi, S., Anthony, N., Tann, H., Bahar, R.I., Reda, S.: Understanding the impact of precision quantization on the accuracy and energy of neural networks. In: Design, Automation and Test in Europe Conference and Exhibition (DATE), IEEE, pp. 1474–1479 (2017)

  21. Zhu, C., Han, S., Mao, H., Dally, W.J.: Trained ternary quantization. In: International Conference on Learning Representations (ICLR). OpenReview.net (2017). https://openreview.net/forum?id=S1_pAu9xl

  22. Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., et al.: Quantization and training of neural networks for efficient integer-arithmetic-only inference. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2704–2713 (2018)

  23. Gong, R., Liu, X., Jiang, S., Li, T., Hu, P., Lin, J., et al.: Differentiable soft quantization: bridging full-precision and low-bit neural networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 4852–4861 (2019)

  24. Dong, Z., Yao, Z., Gholami, A., Mahoney, M.W., Keutzer, K.: Hawq: Hessian aware quantization of neural networks with mixed-precision. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 293–302 (2019)

  25. Ba, J., Frey, B.: Adaptive dropout for training deep neural networks. In: Advances in Neural Information Processing Systems (NeurIPS), pp. 3084–3092 (2013)

  26. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. (JMLR) 15(1), 1929–1958 (2014)


  27. Riquelme, C., Tucker, G., Snoek, J.: Deep Bayesian bandits showdown: an empirical comparison of bayesian deep networks for Thompson sampling. In: International Conference on Learning Representations (ICLR) (2018). https://openreview.net/forum?id=SyYe6k-CW

  28. Liu, L., Deng, J.: Dynamic deep neural networks: optimizing accuracy-efficiency trade-offs by selective execution. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)

  29. Spasov, S., Liò, P.: Dynamic neural network channel execution for efficient training. In: British Machine Vision Conference (BMVC) (2019)

  30. Wu, Z., Nagarajan, T., Kumar, A., Rennie, S., Davis, L.S., Grauman, K., et al.: Blockdrop: dynamic inference paths in residual networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8817–8826 (2018)

  31. Chen, Z., Li, Y., Bengio, S., Si, S.: You look twice: GaterNet for dynamic filter selection in CNNs. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 9172–9180 (2019)

  32. Rao, Y., Lu, J., Lin, J., Zhou, J.: Runtime network routing for efficient image classification. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 41(10), 2291–2304 (2018)


  33. Veit, A., Belongie, S.: Convolutional networks with adaptive inference graphs. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–18 (2018)

  34. Han, Y., Huang, G., Song, S., Yang, L., Wang, H., Wang, Y.: Dynamic neural networks: a survey. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) (2021)


  35. Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: Xnor-net: imagenet classification using binary convolutional neural networks. In: European Conference on Computer Vision (ECCV), Springer, pp. 525–542 (2016)

  36. Tan, M., Le, Q.: EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning (ICML), pp. 6105–6114 (2019)

  37. Chen, H., Wang, Y., Xu, C., Shi, B., Xu, C., Tian, Q., et al.: AdderNet: Do we really need multiplications in deep learning? In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1468–1477 (2020)

  38. Zoph, B., Le, Q.V.: Neural architecture search with reinforcement learning. In: International Conference on Learning Representations (ICLR) (2017). https://openreview.net/forum?id=r1Ue8Hcxg

  39. Wu, B., Dai, X., Zhang, P., Wang, Y., Sun, F., Wu, Y., et al.: Fbnet: hardware-aware efficient convnet design via differentiable neural architecture search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10734–10742 (2019)

  40. Nayak, P., Zhang, D., Chai, S.: Bit efficient quantization for deep neural networks (2019). arXiv preprint arXiv:1910.04877

  41. Dinh, T., Melnikov, A., Daskalopoulos, V., Chai, S.: Subtensor quantization for mobilenets. In: Bartoli, A., Fusiello, A. (eds.) European Conference on Computer Vision Workshops (ECCVW), Lecture Notes in Computer Science, vol. 12539, Springer, pp. 126–130 (2020). https://doi.org/10.1007/978-3-030-68238-5_10

  42. Wiedemann, S., Müller, K.R., Samek, W.: Compact and computationally efficient representation of deep neural networks. IEEE Trans. Neural Netw. Learn. Syst. 31(3), 772–785 (2020)


  43. Gysel, P., Pimentel, J., Motamedi, M., Ghiasi, S.: Ristretto: a framework for empirical study of resource-efficient inference in convolutional neural networks. IEEE Trans. Neural Netw. Learn. Syst. (TNNLS) 29(11), 5784–5789 (2018). https://doi.org/10.1109/TNNLS.2018.2808319


  44. Ghamari, S., Ozcan, K., Dinh, T., Melnikov, A., Carvajal, J., Ernst, J., et al.: Quantization-guided training for compact TinyML models. CoRR (2021). arXiv:2103.06231

  45. Ahn, C., Kim, E., Oh, S.: Deep elastic networks with model selection for multi-task learning. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 6529–6538 (2019)

  46. Yu, J., Yang, L., Xu, N., Yang, J., Huang, T.: Slimmable neural networks. In: International Conference on Learning Representations (ICLR) (2019). https://openreview.net/forum?id=H1gMCsAqY7

  47. Kim, E., Ahn, C., Oh, S.: NestedNet: Learning nested sparse structures in deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8669–8678 (2018)

  48. Wang, X., Yu, F., Dou, Z.Y., Darrell, T., Gonzalez, J.E.: Skipnet: learning dynamic routing in convolutional networks. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 409–424 (2018)

  49. Bengio, Y.: Deep learning of representations: looking forward. In: International Conference on Statistical Language and Speech Processing. Springer, pp. 1–37 (2013)

  50. Figurnov, M., Collins, M.D., Zhu, Y., Zhang, L., Huang, J., Vetrov, D., et al.: Spatially adaptive computation time for residual networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, pp. 1790–1799 (2017)

  51. Teerapittayanon, S., McDanel, B., Kung, H.: Branchynet: Fast inference via early exiting from deep neural networks. In: International Conference on Pattern Recognition (ICPR), IEEE, pp. 2464–2469 (2016)

  52. Li, Z., Yang, Y., Liu, X., Zhou, F., Wen, S., Xu, W.: Dynamic computational time for visual attention. In: International Conference on Computer Vision Workshops (ICCVW), IEEE, pp. 1199–1209 (2017)

  53. Ruiz, A., Verbeek, J.: Adaptative inference cost with convolutional neural mixture models. In: The IEEE International Conference on Computer Vision (ICCV) (2019)

  54. Shazeer, N., Mirhoseini, A., Maziarz, K., Davis, A., Le, Q., Hinton, G., et al.: Outrageously large neural networks: the sparsely-gated mixture-of-experts layer. In: International Conference on Learning Representations (ICLR) (2017). https://openreview.net/pdf?id=B1ckMDqlg

  55. Teja Mullapudi, R., Mark, W.R., Shazeer, N., Fatahalian, K.: Hydranets: specialized dynamic architectures for efficient inference. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8080–8089 (2018)

  56. Huang, G., Sun, Y., Liu, Z., Sedra, D., Weinberger, K.Q.: Deep networks with stochastic depth. In: European Conference on Computer Vision (ECCV) (2016)

  57. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)

  58. Pham, H., Guan, M., Zoph, B., Le, Q., Dean, J.: Efficient neural architecture search via parameter sharing. In: International Conference on Machine Learning (ICML), pp. 4092–4101 (2018)

  59. Gao, X., Zhao, Y., Dudziak, Ł., Mullins, R., Xu, C.Z.: Dynamic channel pruning: feature boosting and suppression. In: International Conference on Learning Representations (ICLR) (2019). https://openreview.net/forum?id=BJxh2j0qYm

  60. Chen, Z., Xu, T.B., Du, C., Liu, C.L., He, H.: Dynamical channel pruning by conditional accuracy change for deep neural networks. IEEE Trans. Neural Netw. Learn. Syst. (TNNLS) 32(2), 799–813 (2021). https://doi.org/10.1109/TNNLS.2020.2979517


  61. Odena, A., Lawson, D., Olah, C.: Changing model behavior at test-time using reinforcement learning. In: International Conference on Learning Representations Workshops (ICLRW) (2017)

  62. Girshick, R.: Fast r-cnn. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)

  63. Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2261–2269 (2017)

  64. Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation (2013). arXiv preprint arXiv:1308.3432

  65. Peng, J., Bhanu, B.: Closed-loop object recognition using reinforcement learning. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 20(2), 139–154 (1998)


  66. Maddison, C.J., Mnih, A., Teh, Y.W.: The concrete distribution: a continuous relaxation of discrete random variables. In: International Conference on Learning Representations (ICLR) (2017)

  67. Sutton, R.S., McAllester, D.A., Singh, S.P., Mansour, Y.: Policy gradient methods for reinforcement learning with function approximation. In: Advances in Neural Information Processing Systems (NeurIPS), pp. 1057–1063 (2000)

  68. Bengio, Y., Louradour, J., Collobert, R., Weston, J.: Curriculum learning. In: International Conference on Machine Learning (ICML), pp. 41–48 (2009)

  69. Tann, H., Hashemi, S., Bahar, R., Reda, S.: Runtime configurable deep neural networks for energy-accuracy trade-off. In: Proceedings of the Eleventh IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, ACM, p. 34 (2016)

  70. Ganapathy, S., Venkataramani, S., Sriraman, G., Ravindran, B., Raghunathan, A.: DyVEDeep: dynamic variable effort deep neural networks. ACM Trans. Embed. Comput. Syst. (TECS) 19(3), 1–24 (2020)


  71. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., et al.: PyTorch: an imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. (NeurIPS) 32, 8026–8037 (2019)


  72. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (ICLR) (2015)

  73. Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5987–5995 (2017). https://github.com/facebookresearch/ResNeXt

  74. Krizhevsky, A., Hinton, G.: Learning multiple Layers of Features from Tiny Images. University of Toronto, Department of Computer Science (2009). https://www.cs.toronto.edu/~kriz/cifar.html

  75. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., et al.: Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. (IJCV) 115(3), 211–252 (2015)


  76. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge (VOC2007) Results (2007). http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html

  77. Ren, S., He, K., Girshick, R., Sun, J.: Faster r-cnn: Towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems (NeurIPS) (2015)

  78. Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Learning spatiotemporal features with 3d convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 4489–4497 (2015)

  79. TwentyBN: The 20BN-jester Dataset V1. Version 1.0. Accessed 8.1.2019. https://20bn.com/datasets/jester

  80. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations (ICLR) (2015)

  81. Kopuklu, O., Kose, N., Gunduz, A., Rigoll, G.: Resource efficient 3D convolutional neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) (2019)

  82. Hinton, G., Srivastava, N., Swersky, K.: Neural Networks for Machine Learning, Lecture 6a: Overview of mini-batch gradient descent. Coursera lecture slides (2012)

  83. NVIDIA: Jetson AGX Xavier Developer Kit. https://developer.nvidia.com/embedded/jetson-agx-xavier-developer-kit

  84. Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8(3–4), 229–256 (1992)


  85. Facebook Research: fvcore. GitHub. https://github.com/facebookresearch/fvcore/blob/main/fvcore/nn/flop_count.py


Author information

Corresponding author

Correspondence to Hengyue Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

A.1 FLOPs calculation

We use the code from [85] to compute FLOPs. Table 3 summarizes the FLOPs calculation for some common layers.

To compute the FLOPs of a convolutional layer in the TNN, we count the number of activated kernels in each layer; the FLOPs are then 2 \(\times \) (no. of activated kernels) \(\times \) (kernel shape) \(\times \) (output shape). For fully connected layers, gating changes the effective input size, and the FLOPs are 2 \(\times \) (activated input size) \(\times \) (output size).
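As a minimal sketch of this counting rule (the helper names are ours, not from the fvcore code in [85]), assuming the kernel shape includes the input channels and the output shape is the spatial output map:

```python
import math

def conv_flops(n_active_kernels, kernel_shape, output_shape):
    """Gated conv layer: 2 x active kernels x prod(kernel shape) x prod(output shape)."""
    return 2 * n_active_kernels * math.prod(kernel_shape) * math.prod(output_shape)

def fc_flops(active_input_size, output_size):
    """Gated fully connected layer: 2 x activated input size x output size."""
    return 2 * active_input_size * output_size

# Example: 32 of 64 kernels active, 3x3 kernels over 64 input channels,
# applied on a 56x56 output map (per-image estimate).
print(conv_flops(32, (3, 3, 64), (56, 56)))   # 115605504
print(fc_flops(2048, 1000))                   # 4096000
```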

Table 3 FLOPs calculation for common layers
Table 4 Detailed architectures of the DTNN for video-based hand gesture recognition on the Jester dataset
Table 5 Contextual controller architecture based on 3D-ShuffleNet

A.2 Detailed architectures

Table 4 presents the exact specification of the C3D-WN used for hand gesture recognition. Each conv-block consists of a 3D convolutional layer followed by ReLU activation. The stride of all convolutional layers is \(1 \times 1 \times 1\); the stride of max-pool-1 is \(1 \times 2 \times 2\), and that of the other max-pool layers is \(2 \times 2 \times 2\). The padding of all conv-blocks is one, and the padding of all max-pool layers is zero. In total, C3D-WN has 2.654G FLOPs of non-gated operations and 94.752G FLOPs of throttleable operations. The detailed architecture of the controller is illustrated in Table 5. The network begins with a convolutional layer followed by 16 ShuffleNet units grouped into three stages (conv2_x to conv4_x). Each ShuffleNet unit is a residual block whose residual branch consists of a \(1 \times 1 \times 1\) group convolution, a channel shuffle operation, a \(3 \times 3 \times 3\) depthwise convolution [1], and a \(1 \times 1 \times 1\) group convolution. A cost comparison of the TNN and the controller is shown in Table 6, indicating that the controller is far more computationally efficient.
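For illustration, here is a rough PyTorch reconstruction of one conv-block and the pooling configuration described above. The channel width of 64 and the pool kernel sizes (taken equal to the strides, as in the original C3D) are our assumptions, since Table 4 is not reproduced here.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """One conv-block: 3D convolution (stride 1x1x1, padding 1) + ReLU."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
        nn.ReLU(inplace=True),
    )

# max-pool-1 keeps the temporal dimension; the remaining pools halve it.
max_pool_1 = nn.MaxPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2), padding=0)
max_pool = nn.MaxPool3d(kernel_size=2, stride=2, padding=0)

clip = torch.randn(1, 3, 16, 112, 112)        # (N, C, T, H, W) video clip
out = max_pool_1(conv_block(3, 64)(clip))
print(out.shape)                              # torch.Size([1, 64, 16, 56, 56])
```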

A.3 Class distribution of 20BN-JESTER

The class distribution of the 20BN-JESTER training set is shown in Fig. 13.

Table 6 Cost comparison between the data path network (C3D-WN) and controller (3D-ShuffleNet)
Fig. 13 Class distribution of the 20BN-JESTER training set


About this article


Cite this article

Liu, H., Parajuli, S., Hostetler, J. et al. Dynamically throttleable neural networks. Machine Vision and Applications 33, 59 (2022). https://doi.org/10.1007/s00138-022-01311-z

