weight_dequantize
- paddle.nn.quant.weight_dequantize(x, scale, algo='weight_only_int8', out_dtype='float16', group_size=-1)
-
Dequantizes a weight Tensor that was quantized with a weight_only or llm.int8 algorithm.
- Parameters
-
x (Tensor) – The input Tensor to be dequantized. Its data type is int8.
scale (Tensor) – The scale Tensor which is the output of weight_quantize, the data type is float32.
algo (str) – The quantization algorithm that was applied to x, must be one of ‘weight_only_int8’, ‘weight_only_int4’ and ‘llm.int8’, default: ‘weight_only_int8’.
out_dtype (str|np.dtype) – The output Tensor’s data type, must be one of ‘float16’ and ‘bfloat16’, default: ‘float16’.
group_size (int) – The group size used for weight quantization, default: -1.
- Returns
-
The dequantized Tensor. Its data type is float16 or bfloat16, and its shape is the transpose of the shape of x.
- Return type
-
out (Tensor)
Examples
>>> import paddle
>>> from paddle.nn.quant import weight_quantize, weight_dequantize
>>> paddle.seed(2023)
>>> x = paddle.rand(shape=[64, 32], dtype=paddle.float16)
>>> out, scale = weight_quantize(x, algo='weight_only_int8')
>>> x_dequant = weight_dequantize(out, scale)
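The relationship between the quantized Tensor, the scale and the transposed output shape can be illustrated with a plain-Python sketch. This is not Paddle's kernel: the helper names `quantize_per_channel` and `dequantize_per_channel` are hypothetical, and the scheme shown (symmetric int8, one scale per output channel computed as absmax / 127, values stored transposed) is an assumption chosen for illustration. The actual memory layout used by `weight_quantize` may differ.

```python
def quantize_per_channel(w):
    """Symmetric int8 quantization of a [k, n] matrix, one scale per column.

    Returns (q, scale): q is stored transposed with shape [n, k],
    and scale holds one float per column of w.
    """
    k, n = len(w), len(w[0])
    q, scale = [], []
    for j in range(n):
        col = [w[i][j] for i in range(k)]
        # Assumed convention: scale = absmax / 127; fall back to 1.0 for an all-zero column.
        s = max(abs(v) for v in col) / 127.0 or 1.0
        scale.append(s)
        q.append([max(-127, min(127, round(v / s))) for v in col])
    return q, scale


def dequantize_per_channel(q, scale):
    """Inverse of the sketch above: [n, k] int8 values -> [k, n] floats."""
    n, k = len(q), len(q[0])
    return [[q[j][i] * scale[j] for j in range(n)] for i in range(k)]


w = [[0.5, -1.0], [0.25, 2.0], [-0.125, 0.5]]  # shape [3, 2]
q, scale = quantize_per_channel(w)             # q has shape [2, 3]
w_hat = dequantize_per_channel(q, scale)       # back to shape [3, 2]

# Largest absolute reconstruction error over all entries.
max_err = max(abs(a - b) for ra, rb in zip(w, w_hat) for a, b in zip(ra, rb))
```

Under these assumptions the round trip is lossy: each entry is recovered to within half of its channel's scale, which is why the dequantized result approximates the original weight rather than reproducing it exactly.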