ImperativePTQ

class paddle.fluid.contrib.slim.quantization.imperative.ptq.ImperativePTQ(quant_config=<paddle.fluid.contrib.slim.quantization.imperative.ptq_config.PTQConfig object>) [source]

Static post-training quantization.
quantize(model, inplace=False, fuse=False, fuse_list=None)
Add quant config and hook to the target layer.

Parameters

- model (paddle.nn.Layer) – The model to be quantized.
- inplace (bool) – Whether to apply quantization to the input model in place. Default: False.
- fuse (bool) – Whether to fuse layers. Default: False.
- fuse_list (list) – The names of the layers to be fused, for example fuse_list = [["conv1", "bn1"], ["conv2", "bn2"]]. A TypeError is raised if fuse is True but fuse_list is None. Default: None.

Returns

quantized_model (paddle.nn.Layer): The quantized model.
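Example (a minimal sketch, not taken from the official docs: the LeNet model, the random calibration tensors, and the number of calibration batches are illustrative assumptions; the default PTQConfig is used):

    import paddle
    from paddle.fluid.contrib.slim.quantization.imperative.ptq import ImperativePTQ

    model = paddle.vision.models.LeNet()    # any paddle.nn.Layer can be quantized
    ptq = ImperativePTQ()                   # default quant_config (PTQConfig object)

    # Insert the quant config and hooks into a copy of the model (inplace=False).
    quant_model = ptq.quantize(model, inplace=False, fuse=False)

    # Run a few forward passes so the hooks can collect activation statistics;
    # real calibration data should be used instead of random tensors.
    quant_model.eval()
    for _ in range(8):
        x = paddle.rand([1, 1, 28, 28])
        quant_model(x)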
save_quantized_model(model, path, input_spec=None, **config)
1. Convert the quantized model.
2. Call jit.save to save the inference model.
3. Post-process the inference model.
Parameters

- model (Layer) – The model to be saved.
- path (str) – The path prefix to save the model. The format is dirname/file_prefix or file_prefix.
- input_spec (list[InputSpec|Tensor], optional) – Describes the input of the saved model's forward method, which can be described by InputSpec or example Tensor. If None, all input variables of the original Layer's forward method are used as the inputs of the saved model. Default: None.
- **configs (dict, optional) – Other save configuration options for compatibility. We do not recommend using these configurations; they may be removed in the future. If not necessary, DO NOT use them. Default: None. The following option is currently supported: (1) output_spec (list[Tensor]): Selects the output targets of the saved model. By default, all return variables of the original Layer's forward method are kept as the outputs of the saved model. If the provided output_spec list does not cover all output variables, the saved model will be pruned according to the given output_spec list.

Returns

None
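Example (continuing the sketch above; the path prefix and InputSpec shape are illustrative assumptions):

    import paddle
    from paddle.static import InputSpec

    # Export the calibrated model for inference; jit.save is called internally.
    ptq.save_quantized_model(
        model=quant_model,
        path="./quant_lenet/inference",    # saved as dirname/file_prefix
        input_spec=[InputSpec(shape=[None, 1, 28, 28], dtype="float32")])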