beam_search

paddle.fluid.layers.beam_search(pre_ids, pre_scores, ids, scores, beam_size, end_id, level=0, is_accumulated=True, name=None, return_parent_idx=False)

Beam search is a classical algorithm for selecting candidate words in a machine translation task.

Refer to Beam search for more details.

This operator only supports LoDTensor. It runs after the scores have been computed and performs beam search for one time step. Specifically, once ids and scores have been produced, it selects the top-K (K is beam_size) candidate word ids of the current step from ids according to the corresponding scores. In addition, pre_ids and pre_scores are the outputs of beam_search at the previous step; they are needed to handle candidate translations that have already ended.
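
The selection described above can be illustrated in plain Python for a single sample. This is only a conceptual sketch, not the operator's implementation: the function name `beam_step` and the list-based inputs are hypothetical, LoD bookkeeping is omitted, and it assumes that a beam whose previous id is end_id is carried forward unchanged with its previous score.

```python
import heapq

def beam_step(pre_ids, pre_scores, ids, scores, beam_size, end_id):
    """Pick the top `beam_size` candidates of one sample for one time step.

    pre_ids / pre_scores: one entry per beam (output of the previous step).
    ids / scores: for each beam, its candidate ids and accumulated scores.
    Returns a list of (parent_idx, word_id, score), best score first.
    """
    candidates = []
    for parent, (pre_id, pre_score) in enumerate(zip(pre_ids, pre_scores)):
        if pre_id == end_id:
            # Assumption: an ended beam only proposes end_id again,
            # keeping the score it had at the previous step.
            candidates.append((parent, end_id, pre_score))
            continue
        for word_id, score in zip(ids[parent], scores[parent]):
            candidates.append((parent, word_id, score))
    # Keep the beam_size candidates with the highest accumulated score.
    return heapq.nlargest(beam_size, candidates, key=lambda c: c[2])

# Beam 0 is still open, beam 1 has already ended (its previous id is end_id).
selected = beam_step(
    pre_ids=[5, 1],
    pre_scores=[-0.5, -0.7],
    ids=[[3, 4], [8, 9]],
    scores=[[-0.9, -1.6], [-0.8, -2.0]],
    beam_size=2,
    end_id=1,
)
print(selected)  # [(1, 1, -0.7), (0, 3, -0.9)]
```

The first element of each tuple plays the role of the parent index that the operator can return when return_parent_idx is True.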

Note that if is_accumulated is True, the scores passed in should be accumulated scores. Otherwise, the scores are treated as single-step probabilities: this operator takes their logarithm and adds the result to pre_scores to obtain the final scores. If a length penalty is needed, apply it with extra operators before calculating the accumulated scores.
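
As a rough illustration of the is_accumulated=False case, the accumulation for one beam amounts to adding the log of each single-step probability to the beam's previous score. The helper name `accumulate_scores` is hypothetical and the sketch ignores tensors and LoD entirely.

```python
import math

def accumulate_scores(pre_score, step_probs):
    """Turn single-step probabilities into accumulated log scores.

    pre_score: accumulated log score of the beam at the previous step.
    step_probs: probabilities of the candidate words at the current step.
    """
    return [pre_score + math.log(p) for p in step_probs]

# A beam with previous score -1.0 and two candidate words.
accumulated = accumulate_scores(-1.0, [0.5, 0.25])
print(accumulated)
```

With is_accumulated=True this step is expected to have been done by the caller already, as in the example at the end of this page.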

Please see the following demo for a full beam search usage example:

fluid/tests/book/test_machine_translation.py

Parameters

pre_ids (Variable) – A LoDTensor variable (LoD level is 2) representing the selected ids of the previous step. It is the output of beam_search at the previous step. At the first step, its shape is [batch_size, 1] and its LoD is [[0, 1, … , batch_size], [0, 1, …, batch_size]]. The data type should be int64.
pre_scores (Variable) – A LoDTensor variable with the same shape and LoD as pre_ids, representing the accumulated scores corresponding to the selected ids of the previous step. It is the output of beam_search at the previous step. The data type should be float32 or float64.
ids (Variable|None) – A LoDTensor variable containing the candidate ids. It has the same LoD as pre_ids and its shape should be [batch_size * beam_size, K], where K is supposed to be greater than beam_size and the first dimension size (which decreases as samples reach the end) should be the same as that of pre_ids. The data type should be int64. It can be None, in which case the indices in scores are used as ids.
scores (Variable) – A LoDTensor variable containing the accumulated scores corresponding to ids. Both its shape and LoD are the same as those of ids. The data type should be float32 or float64.
beam_size (int) – The beam width used in beam search.
end_id (int) – The id of the end token.
level (int) – Can be ignored and must not be changed currently. The 2-level LoD used in this operator has the following meaning: the first level describes how many beams each sample has, which changes to 0 when all beams of the sample have ended (batch reduce); the second level describes how many times each beam is selected. Default 0, which should not be changed currently.
is_accumulated (bool) – Whether the input scores are accumulated scores. Default True.
name (str, optional) – For detailed information, please refer to Name. Usually there is no need to set name; it is None by default.
return_parent_idx (bool, optional) – Whether to return an extra Tensor variable in the output, which stores the parent index in pre_ids of each selected id and can be used to update the RNN’s states with a gather operator. Default False.
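
To make the role of the parent index more concrete, the following plain-Python sketch reorders per-beam decoder states so that each selected candidate inherits the state of the beam it was expanded from. The helper `gather_states` and the toy state values are hypothetical; in a real program this would presumably be done with a gather operator on tensors.

```python
def gather_states(states, parent_idx):
    """Reorder per-beam states to follow the newly selected beams.

    states: one state (here a list of floats) per beam of the previous step.
    parent_idx: for each selected candidate, the index of its parent beam.
    """
    return [list(states[i]) for i in parent_idx]

# Three beams at the previous step; beam 1 was expanded twice, beam 2 dropped.
prev_states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
new_states = gather_states(prev_states, parent_idx=[1, 1, 0])
print(new_states)  # [[0.3, 0.4], [0.3, 0.4], [0.1, 0.2]]
```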

Returns

A tuple containing two or three LoDTensor variables. The first two LoDTensors, representing the selected ids and the corresponding accumulated scores of the current step, have the same shape [batch_size, beam_size] and a 2-level LoD, with data types int64 and float32 respectively. If return_parent_idx is True, an extra Tensor variable holding the parent index of each selected id is included; its shape is [batch_size * beam_size] and its data type is int64.

Return type

tuple

Examples

import paddle.fluid as fluid
import paddle
paddle.enable_static()

# Suppose `probs` contains the predicted results from the computation
# cell, and `pre_ids` and `pre_scores` are the outputs of beam_search
# at the previous step.
beam_size = 4
end_id = 1
pre_ids = fluid.data(
    name='pre_id', shape=[None, 1], lod_level=2, dtype='int64')
pre_scores = fluid.data(
    name='pre_scores', shape=[None, 1], lod_level=2, dtype='float32')
probs = fluid.data(
    name='probs', shape=[None, 10000], dtype='float32')
topk_scores, topk_indices = fluid.layers.topk(probs, k=beam_size)
accu_scores = fluid.layers.elementwise_add(
    x=fluid.layers.log(x=topk_scores),
    y=fluid.layers.reshape(pre_scores, shape=[-1]),
    axis=0)
selected_ids, selected_scores = fluid.layers.beam_search(
    pre_ids=pre_ids,
    pre_scores=pre_scores,
    ids=topk_indices,
    scores=accu_scores,
    beam_size=beam_size,
    end_id=end_id)