PTQ
class paddle.quantization.PTQ(config: paddle.quantization.config.QuantConfig)
Applies post-training quantization to a model.
quantize(model: paddle.nn.layer.layers.Layer, inplace=False)
Create a model prepared for post-training quantization. The quantization configuration is propagated through the model, and observers are inserted into the model to collect and compute quantization parameters.
Parameters:
    model (Layer) – the model to be quantized.
    inplace (bool) – whether to modify the model in place. Default: False.

Returns: The prepared model for post-training quantization.
Examples:
    .. code-block:: python

        from paddle.quantization import PTQ, QuantConfig
        from paddle.quantization.observers import AbsmaxObserver
        from paddle.vision.models import LeNet

        observer = AbsmaxObserver()
        q_config = QuantConfig(activation=observer, weight=observer)
        ptq = PTQ(q_config)
        model = LeNet()
        model.eval()
        quant_model = ptq.quantize(model)
        print(quant_model)
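The prepared model still needs calibration before its observers hold useful quantization parameters. Below is a minimal, hedged sketch of that step: it assumes the inserted observers record value ranges during ordinary forward passes, and the random ``calibration_batch`` tensors are a hypothetical stand-in for a real representative dataset (the input shape follows the dummy data in the convert example below).

    .. code-block:: python

        import paddle
        from paddle.quantization import PTQ, QuantConfig
        from paddle.quantization.observers import AbsmaxObserver
        from paddle.vision.models import LeNet

        observer = AbsmaxObserver()
        q_config = QuantConfig(activation=observer, weight=observer)
        ptq = PTQ(q_config)
        model = LeNet()
        model.eval()
        quant_model = ptq.quantize(model)

        # Calibration sketch: run representative inputs through the prepared
        # model so the observers can collect activation statistics.
        # `calibration_batch` is a hypothetical stand-in for real data.
        with paddle.no_grad():
            for _ in range(10):
                calibration_batch = paddle.rand([8, 1, 32, 32], dtype="float32")
                quant_model(calibration_batch)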
convert(model: paddle.nn.layer.layers.Layer, inplace=False)
Convert the quantized model to ONNX style, after which it can be saved as an inference model by calling paddle.jit.save.

Parameters:
    model (Layer) – the quantized model to be converted.
    inplace (bool) – whether to modify the model in place. Default: False.
Returns: The converted model.
Examples:
    .. code-block:: python

        import paddle
        from paddle.quantization import QAT, QuantConfig
        from paddle.quantization.quanters import FakeQuanterWithAbsMaxObserver
        from paddle.vision.models import LeNet

        quanter = FakeQuanterWithAbsMaxObserver(moving_rate=0.9)
        q_config = QuantConfig(activation=quanter, weight=quanter)
        qat = QAT(q_config)
        model = LeNet()
        quantized_model = qat.quantize(model)
        converted_model = qat.convert(quantized_model)
        dummy_data = paddle.rand([1, 1, 32, 32], dtype="float32")
        paddle.jit.save(converted_model, "./quant_deploy", [dummy_data])
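The example above exercises convert through the QAT workflow. Since this page documents PTQ, here is a parallel, hedged sketch of the same convert-and-save flow using PTQ; it assumes PTQ.convert behaves as documented above, reuses the calibration idea from the earlier sketch, and the ``./ptq_deploy`` path is hypothetical.

    .. code-block:: python

        import paddle
        from paddle.quantization import PTQ, QuantConfig
        from paddle.quantization.observers import AbsmaxObserver
        from paddle.vision.models import LeNet

        observer = AbsmaxObserver()
        q_config = QuantConfig(activation=observer, weight=observer)
        ptq = PTQ(q_config)
        model = LeNet()
        model.eval()
        quant_model = ptq.quantize(model)

        # Hedged assumption: a single random batch stands in for real
        # calibration data so the observers see some activations.
        with paddle.no_grad():
            quant_model(paddle.rand([1, 1, 32, 32], dtype="float32"))

        # Convert to ONNX-style quantization ops and save for inference.
        converted_model = ptq.convert(quant_model)
        dummy_data = paddle.rand([1, 1, 32, 32], dtype="float32")
        paddle.jit.save(converted_model, "./ptq_deploy", [dummy_data])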