QuantizedMatmul
- class paddle.nn.quant.quant_layers.QuantizedMatmul(layer: Layer | None = None, weight_bits: int = 8, activation_bits: int = 8, moving_rate: float = 0.9, weight_quantize_type: _QuantType = 'abs_max', activation_quantize_type: _QuantType = 'abs_max', weight_pre_layer: Layer | None = None, act_pre_layer: Layer | None = None, weight_quant_layer: Layer | None = None, act_quant_layer: Layer | None = None)
-
The computational logic of QuantizedMatmul is the same as that of Matmul. The only difference is that both of its inputs are fake quantized before the multiplication.
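The idea can be illustrated with a minimal plain-Python sketch. It is not Paddle's implementation: the helper names (fake_quant_abs_max, transpose, matmul, quantized_matmul) are hypothetical, and it assumes a single per-tensor 'abs_max' scale and 8-bit symmetric quantization, matching the default arguments in the signature above. Each input is quantized and immediately dequantized ("fake" quantization), then multiplied as usual.

```python
def fake_quant_abs_max(matrix, bits=8):
    """Quantize then dequantize a 2-D list using one abs-max scale.

    Assumption: symmetric quantization with a per-tensor scale.
    """
    bound = 2 ** (bits - 1) - 1  # 127 for 8 bits
    scale = max(abs(v) for row in matrix for v in row)
    if scale == 0:
        # All-zero input: nothing to quantize.
        return [list(row) for row in matrix]
    # Python's round() rounds ties to even; a real kernel may round differently.
    return [[round(v / scale * bound) * scale / bound for v in row]
            for row in matrix]


def transpose(matrix):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*matrix)]


def matmul(x, y):
    """Ordinary 2-D matrix product on lists of lists."""
    y_cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols]
            for row in x]


def quantized_matmul(x, y, transpose_x=False, transpose_y=False, bits=8):
    """Fake-quantize both inputs, then multiply them like a normal matmul."""
    qx = fake_quant_abs_max(x, bits)
    qy = fake_quant_abs_max(y, bits)
    if transpose_x:
        qx = transpose(qx)
    if transpose_y:
        qy = transpose(qy)
    return matmul(qx, qy)


x = [[1.0, -2.0, 0.5], [0.25, 4.0, -1.0]]
y = [[1.0, 0.0], [0.0, 1.0], [2.0, -3.0]]
print(quantized_matmul(x, y))  # close to matmul(x, y), up to rounding error
```

Because the quantized values are converted back to floats before the product, the result has the same shape and dtype as an ordinary matmul and differs from it only by the rounding error introduced on each input.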
-
forward(x: Tensor, y: Tensor, transpose_x: bool = False, transpose_y: bool = False, name: str | None = None) -> Tensor
-
Defines the computation performed at every call. Should be overridden by all subclasses.
- Parameters
-
*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments