Evaluation of Model Compression Techniques

Joel Randive; Shivam Satav; Vipul Doiphode; Omkar Sawant; Dr. Sachin Bhoite

Keywords:

Deep learning, model compression, knowledge distillation, quantization, network pruning, Neural Architecture Search (NAS)

Abstract

The use of AI, including machine learning, deep learning, computer vision, and natural language processing, has grown rapidly. Industries depend on data more than ever for their profits. However, storage and processing have always been a bottleneck and remain an active research topic in data science. Deep learning models rely on large amounts of data to succeed in diverse applications. Architectures such as RNNs, CNNs, and MLPs (multilayer perceptrons) have complex structures that require substantial storage. Much research has gone into boosting accuracy and limiting the time complexity of these models while maintaining their strong performance. The goal of model compression is to convert a model into a simpler form of the original. Model compression becomes relevant in critical situations where memory and processing are constrained and the application still depends on ML models.
