Evaluation of Model Compression Techniques
Keywords:
Deep learning, model compression, knowledge distillation, quantization, network pruning, Neural Architecture Search (NAS)

Abstract
The use of AI, including machine learning, deep learning, computer vision, and natural language processing, has been growing rapidly, and industries depend on data more than ever for their profits. However, storage and processing have long been bottlenecks and remain active research topics in data science. Deep learning models rely on large amounts of data to succeed in diverse applications, and architectures such as RNNs, CNNs, and MLPs (multilayer perceptrons) have complex structures that demand substantial storage. Much research has focused on boosting accuracy and limiting time complexity while preserving the models' strong performance. The goal of model compression is to convert a model into a simpler form relative to the original. Model compression becomes essential in critical settings that depend on ML models but where memory and processing power are limited.
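As a concrete illustration of one compression technique named in the keywords, the following is a minimal sketch of symmetric post-training int8 quantization of a weight tensor. The function names (`quantize_int8`, `dequantize`) and the toy weight matrix are illustrative assumptions, not taken from the paper; the point is only that each 32-bit float weight is mapped to an 8-bit integer plus a single shared scale, cutting storage roughly fourfold.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to int8.

    Maps the largest-magnitude weight to +/-127 and rounds the rest,
    returning the int8 tensor and the scale needed to dequantize.
    (Illustrative sketch, not the paper's specific scheme.)
    """
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Approximate reconstruction of the original float weights."""
    return q.astype(np.float32) * scale

# Toy example: storage drops from 4 bytes to 1 byte per weight,
# at the cost of a small rounding error bounded by scale / 2.
w = np.array([[0.5, -1.2], [0.03, 0.9]], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
```

The same accuracy-versus-size trade-off underlies the other techniques the paper surveys (pruning removes weights entirely; distillation trains a smaller student model to mimic a larger teacher).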