beam_search
- paddle.fluid.layers.beam_search(pre_ids, pre_scores, ids, scores, beam_size, end_id, level=0, is_accumulated=True, name=None, return_parent_idx=False)
-
Beam search is a classical algorithm for selecting candidate words in a machine translation task.
Refer to Beam search for more details.
This operator only supports LoDTensor. It is used after the scores have been calculated to perform beam search for one time step. Specifically, after ids and scores have been produced, it selects the top-K (K is beam_size) candidate word ids of the current step from ids according to the corresponding scores. Additionally, pre_ids and pre_scores are the outputs of beam_search at the previous step; they are needed to handle candidate translations that have already ended.
Note that if is_accumulated is True, the scores passed in should be accumulated scores. Otherwise, the scores are treated as single-step probabilities; this operator transforms them to log space and adds them to pre_scores to obtain the final scores. Length penalty, if needed, should be applied with extra operators before calculating the accumulated scores.
Please see the following demo for a full beam search usage example:
fluid/tests/book/test_machine_translation.py
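The selection described above can be sketched in plain Python for a single source sample. This is a conceptual illustration only, not the operator's implementation: the function name beam_search_step and the list-based inputs are hypothetical, it assumes an ended beam is carried forward unchanged with its previous score, and it ignores the LoD bookkeeping and the removal of samples whose beams have all ended.

```python
import math

def beam_search_step(pre_ids, pre_scores, ids, scores, beam_size, end_id,
                     is_accumulated=True):
    """Select the top `beam_size` candidates for ONE source sample.

    pre_ids / pre_scores: one entry per beam from the previous step.
    ids / scores: for each beam, a list of candidate ids and their scores.
    Returns (selected_ids, selected_scores, parent_idx).
    """
    candidates = []  # tuples of (score, word_id, parent beam index)
    for beam, (pre_id, pre_score) in enumerate(zip(pre_ids, pre_scores)):
        if pre_id == end_id:
            # Assumption: an ended beam competes with its old score only.
            candidates.append((pre_score, end_id, beam))
            continue
        for word_id, score in zip(ids[beam], scores[beam]):
            if not is_accumulated:
                # Single-step probability: move to log space and accumulate.
                score = math.log(score) + pre_score
            candidates.append((score, word_id, beam))
    # Keep the best `beam_size` candidates across all beams of this sample.
    candidates.sort(key=lambda c: c[0], reverse=True)
    top = candidates[:beam_size]
    return [c[1] for c in top], [c[0] for c in top], [c[2] for c in top]

# Beam 1 already ended (its previous id equals end_id), so it is kept as is.
sel_ids, sel_scores, parents = beam_search_step(
    pre_ids=[5, 1], pre_scores=[-0.5, -0.2],
    ids=[[7, 8], [9, 10]], scores=[[-0.9, -1.4], [-3.0, -3.5]],
    beam_size=2, end_id=1)
```

In this toy call the ended beam keeps its score of -0.2 and outranks the live candidates, which is why the previous-step outputs have to be passed in alongside the new candidates.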
- Parameters
-
pre_ids (Variable) – A LoDTensor variable (lod level is 2), representing the selected ids of the previous step. It is the output of beam_search at the previous step. Its shape is [batch_size, 1] and its lod is [[0, 1, … , batch_size], [0, 1, …, batch_size]] at the first step. The data type should be int64.
pre_scores (Variable) – A LoDTensor variable with the same shape and lod as pre_ids, representing the accumulated scores corresponding to the selected ids of the previous step. It is the output of beam_search at the previous step. The data type should be float32 or float64.
ids (Variable|None) – A LoDTensor variable containing the candidate ids. It has the same lod as pre_ids and its shape should be [batch_size * beam_size, K], where K is supposed to be greater than beam_size, and the first dimension size (which decreases as samples reach the end) should be the same as that of pre_ids. The data type should be int64. It can be None, in which case the indices in scores are used as ids.
scores (Variable) – A LoDTensor variable containing the accumulated scores corresponding to ids. Both its shape and lod are the same as those of ids. The data type should be float32 or float64.
beam_size (int) – The beam width used in beam search.
end_id (int) – The id of the end token.
level (int) – It can be ignored and must not be changed currently. The 2-level lod used in this operator has the following meaning: the first level describes how many beams each sample has, which changes to 0 when all beams of the sample have ended (batch reduce); the second level describes how many times each beam is selected. Default 0, which should not be changed currently.
is_accumulated (bool) – Whether the input scores are accumulated scores. Default True.
name (str, optional) – For detailed information, please refer to Name. Usually there is no need to set name; it is None by default.
return_parent_idx (bool, optional) – Whether to return an extra Tensor variable in the output, which stores the selected ids’ parent index in pre_ids and can be used to update the RNN’s states with the gather operator. Default False.
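To make the 2-level lod described for level more concrete, here is a small plain-Python sketch. It assumes the lod is stored as cumulative offsets, as the first-step value [[0, 1, …, batch_size], [0, 1, …, batch_size]] suggests; the helper names and the later-step lod values are hypothetical and chosen only for illustration.

```python
def offsets_to_lengths(offsets):
    """Convert one offset-based lod level into per-sequence lengths."""
    return [end - start for start, end in zip(offsets, offsets[1:])]

def first_step_lod(batch_size):
    """Build the 2-level lod documented for pre_ids at the first step."""
    level = list(range(batch_size + 1))
    return [level, list(level)]

# First step with 3 samples: every sample has one beam, selected once.
first = first_step_lod(3)
first_lengths = [offsets_to_lengths(level) for level in first]

# Hypothetical later step: sample 1 has no beams left (all of them ended),
# and the second level counts how often each remaining beam was selected.
later = [[0, 2, 2, 4], [0, 2, 2, 3, 4]]
later_lengths = [offsets_to_lengths(level) for level in later]
```

Read this way, a length of 0 at the first level corresponds to the "batch reduce" case in which a sample drops out because all of its beams have ended.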
- Returns
-
A tuple containing two or three LoDTensor variables. The first two LoDTensors, representing the selected ids and the corresponding accumulated scores of the current step, have the same shape [batch_size, beam_size] and a 2-level lod, with data types int64 and float32 respectively. If return_parent_idx is True, an extra Tensor variable holding the selected ids’ parent index is included; its shape is [batch_size * beam_size] and its data type is int64.
- Return type
-
tuple
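The parent index returned when return_parent_idx is True is meant for reordering per-beam RNN states. The following plain-Python sketch shows the idea of that gather step on nested lists; gather_rows is a hypothetical helper standing in for a gather operator, and the state values are made up.

```python
def gather_rows(states, parent_idx):
    """Return a new state list whose row i is a copy of states[parent_idx[i]]."""
    return [list(states[p]) for p in parent_idx]

# One state row per beam from the previous step (hypothetical values).
prev_states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# Suppose the selected ids came from beams 1, 1 and 0 respectively.
parent_idx = [1, 1, 0]
new_states = gather_rows(prev_states, parent_idx)
```

Each selected candidate then continues from the hidden state of the beam it extends, even when one beam is selected several times and another is dropped.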
Examples
import paddle.fluid as fluid
import paddle
paddle.enable_static()

# Suppose `probs` contains predicted results from the computation
# cell and `pre_ids` and `pre_scores` is the output of beam_search
# at previous step.
beam_size = 4
end_id = 1
pre_ids = fluid.data(
    name='pre_id', shape=[None, 1], lod_level=2, dtype='int64')
pre_scores = fluid.data(
    name='pre_scores', shape=[None, 1], lod_level=2, dtype='float32')
probs = fluid.data(
    name='probs', shape=[None, 10000], dtype='float32')
topk_scores, topk_indices = fluid.layers.topk(probs, k=beam_size)
accu_scores = fluid.layers.elementwise_add(
    x=fluid.layers.log(x=topk_scores),
    y=fluid.layers.reshape(pre_scores, shape=[-1]),
    axis=0)
selected_ids, selected_scores = fluid.layers.beam_search(
    pre_ids=pre_ids,
    pre_scores=pre_scores,
    ids=topk_indices,
    scores=accu_scores,
    beam_size=beam_size,
    end_id=end_id)
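The preprocessing in the example (top-k over probs, then adding the log of the top-k probabilities to pre_scores along axis 0) may be easier to follow on small concrete numbers. The sketch below mirrors that arithmetic with plain Python lists; top_k and accumulate are hypothetical helpers, not Paddle functions, and the probabilities are made up.

```python
import math

def top_k(row, k):
    """Return (values, indices) of the k largest entries of one row."""
    order = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
    return [row[i] for i in order], order

def accumulate(topk_probs, pre_scores):
    """Compute log(topk_probs[i][j]) + pre_scores[i] for every row i."""
    return [[math.log(p) + pre for p in row]
            for row, pre in zip(topk_probs, pre_scores)]

# Two beams, a vocabulary of three words, and a beam size of 2.
probs = [[0.1, 0.6, 0.3], [0.5, 0.2, 0.3]]
pre_scores = [-1.0, -2.0]

picked = [top_k(row, k=2) for row in probs]
topk_scores = [values for values, _ in picked]
topk_indices = [indices for _, indices in picked]
accu_scores = accumulate(topk_scores, pre_scores)
```

Each row of accu_scores holds accumulated log-probabilities for one beam, which matches what beam_search expects for scores when is_accumulated keeps its default value of True.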