
References

https://www.zhihu.com/question/40503595/answer/161934005 Deep learning models keep getting larger, yet one finding is that brain activation is very sparse. Are there good ways to compress model parameters?
http://blog.csdn.net/zyazky/article/details/52932797 Network compression: a comparison of quantization methods
https://yq.aliyun.com/articles/230662?spm=5176.100239.bloglist.12.c02jCq BNN: a cross-platform deep learning framework based on low-bit quantization and compression
Deep Learning with Limited Numerical Precision
https://github.com/google/gemmlowp
http://fjdu.github.io/machine/learning/2016/07/07/quantize-neural-networks-with-tensorflow.html Compressing neural networks with TensorFlow

Weights, activations, and gradients

KL divergence (relative entropy)
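
For reference, the KL divergence of a discrete distribution P from Q is D_KL(P || Q) = sum_i p_i * log(p_i / q_i). It is zero when the two distributions match and is not symmetric in P and Q. The following is a minimal sketch using only the Python standard library; the function name `kl_divergence` and the example distributions are illustrative and not taken from any of the referenced sources.

```python
import math


def kl_divergence(p, q):
    """Return D_KL(P || Q) in nats for two discrete distributions.

    p and q are sequences of probabilities over the same bins.
    Bins with p[i] == 0 contribute nothing; a bin with q[i] == 0 and
    p[i] > 0 makes the divergence infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


# Illustrative distributions (assumed values, e.g. normalized histograms).
p = [0.5, 0.5]
q = [0.9, 0.1]
forward = kl_divergence(p, q)   # D_KL(P || Q)
reverse = kl_divergence(q, p)   # D_KL(Q || P), generally a different value
```

In a quantization setting, P would typically be the histogram of the original floating-point values and Q the histogram after quantization, so a smaller divergence means less information lost.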

Entropy method

Norms

Linear quantization
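
Linear quantization maps a real value x to an integer with a scale and a zero point, roughly q = clamp(round(x / scale) + zero_point), and recovers an approximation as x' = scale * (q - zero_point). The sketch below assumes unsigned 8-bit integers and a known value range; the helper names `choose_params`, `quantize`, and `dequantize` are hypothetical and are not the API of gemmlowp or TensorFlow.

```python
def choose_params(x_min, x_max, num_bits=8):
    """Pick a scale and zero point mapping [x_min, x_max] onto unsigned integers."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    if x_max == x_min:
        return 1.0, 0
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    return scale, zero_point


def quantize(x, scale, zero_point, num_bits=8):
    """Map a real value to an integer, saturating at the ends of the range."""
    qmin, qmax = 0, 2 ** num_bits - 1
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to an approximate real value."""
    return scale * (q - zero_point)


# Assumed value range for illustration.
scale, zero_point = choose_params(-1.0, 3.0)
samples = [-1.0, 0.0, 0.5, 3.0]
recovered = [dequantize(quantize(x, scale, zero_point), scale, zero_point)
             for x in samples]
```

Values outside the chosen range saturate to the nearest representable integer, which is why the choice of range matters as much as the bit width.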

The calibration step used by TensorRT is intended to make the fullest possible use of the limited number of quantization bits.
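
The idea behind calibration can be illustrated with a small sketch: instead of scaling by the largest absolute value, search for a smaller saturation threshold so that the available integer levels are spent on the bulk of the values. Note the assumptions: this sketch picks the threshold by minimizing mean squared error for brevity, whereas TensorRT's entropy calibration is commonly described as minimizing KL divergence between histograms. The function names and the 4-bit setting are illustrative only.

```python
def quantize_symmetric(values, threshold, num_bits=8):
    """Quantize to signed integers, saturating at +/- threshold, then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = threshold / qmax
    out = []
    for v in values:
        q = int(round(v / scale))
        q = max(-qmax, min(qmax, q))  # saturate values beyond the threshold
        out.append(q * scale)
    return out


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def calibrate_threshold(values, num_candidates=100, num_bits=8):
    """Return the candidate threshold with the lowest reconstruction error."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 1.0  # all zeros: any positive threshold reconstructs exactly
    best_t = max_abs
    best_err = mse(values, quantize_symmetric(values, max_abs, num_bits))
    for i in range(1, num_candidates):
        t = max_abs * i / num_candidates
        err = mse(values, quantize_symmetric(values, t, num_bits))
        if err < best_err:
            best_t, best_err = t, err
    return best_t


# Assumed data: values spread over [-1, 1] plus a single outlier at 4.0.
activations = [i / 1000.0 - 1.0 for i in range(2001)] + [4.0]
threshold = calibrate_threshold(activations, num_bits=4)
```

With this data the selected threshold lies well below the outlier: clipping one extreme value costs less than spreading the few available levels over a range that is mostly empty.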

Rounding modes
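
The rounding mode decides which integer a value lands on when it falls between two quantization levels. The sketch below compares a few common modes using only the Python standard library; the function names are illustrative. Stochastic rounding, the last one, rounds up with probability equal to the fractional part, so it is unbiased on average.

```python
import math
import random


def round_half_even(x):
    """Round to nearest, ties to even (the behavior of Python's built-in round)."""
    return round(x)


def round_half_away(x):
    """Round to nearest, ties away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def round_toward_zero(x):
    """Drop the fractional part (truncation)."""
    return math.trunc(x)


def round_stochastic(x, rng):
    """Round up with probability equal to the fractional part of x."""
    lower = math.floor(x)
    return lower + (1 if rng.random() < (x - lower) else 0)


rng = random.Random(0)  # fixed seed so the example is repeatable
draws = [round_stochastic(0.3, rng) for _ in range(10000)]
average = sum(draws) / len(draws)  # close to 0.3, although each draw is 0 or 1
```

Round-to-nearest would map 0.3 to 0 every time, so small updates such as gradients can vanish entirely at low precision, while stochastic rounding preserves them in expectation.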

For the quantization accuracy of several networks, see Table 7 on page 9 of "Deep Learning with Low Precision by Half-wave Gaussian Quantization".

In general, the first and last layers of a CNN are left unquantized, because quantizing them significantly degrades recognition accuracy.
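
As a minimal sketch of that rule, assuming the network is described by an ordered list of layer names (the names below are hypothetical), the layers selected for quantization are simply everything except the first and last:

```python
def layers_to_quantize(layer_names):
    """Return the layers to quantize, keeping the first and last in full precision."""
    if len(layer_names) <= 2:
        return []  # nothing lies strictly between the first and last layer
    return list(layer_names[1:-1])


# Hypothetical layer names for illustration.
network = ["conv1", "conv2", "conv3", "fc"]
selected = layers_to_quantize(network)
```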
