EditDistance
class paddle.fluid.metrics.EditDistance(name)
This class accumulates edit distances across mini-batches and reports aggregate statistics. Edit distance quantifies how dissimilar two strings (such as words) are by counting the minimum number of edit operations (insertion, deletion, or substitution) required to convert one string into the other. Refer to https://en.wikipedia.org/wiki/Edit_distance.
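To make the definition concrete, here is a minimal sketch of the classic dynamic-programming (Levenshtein) computation. The function name `edit_distance` is hypothetical and is not part of Paddle. The metric class appears only to aggregate distances that have already been computed elsewhere, so this is an illustration of the quantity being aggregated, not of the class itself.

```python
def edit_distance(source: str, target: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to turn `source` into `target` (hypothetical helper)."""
    # previous[j] is the distance between the already-processed prefix
    # of `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning the first i characters of `source` into "" takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory use grows with the length of `target` rather than with the product of both lengths.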
- Parameters
name (str, optional) – Metric name. For details, please refer to Name. Default is None.
Examples

    import paddle.fluid as fluid
    import numpy as np

    # suppose that batch_size is 128
    batch_size = 128

    # init the edit distance manager
    distance_evaluator = fluid.metrics.EditDistance("EditDistance")

    # generate the edit distances for 128 sequence pairs; randint's upper
    # bound is exclusive, so the distances lie in [0, 9]
    edit_distances_batch0 = np.random.randint(low=0, high=10, size=(batch_size, 1))
    seq_num_batch0 = batch_size

    distance_evaluator.update(edit_distances_batch0, seq_num_batch0)
    avg_distance, wrong_instance_ratio = distance_evaluator.eval()
    print("the average edit distance for batch0 is %.2f and the wrong instance ratio is %.2f " % (avg_distance, wrong_instance_ratio))

    # a second update accumulates on top of batch0
    edit_distances_batch1 = np.random.randint(low=0, high=10, size=(batch_size, 1))
    seq_num_batch1 = batch_size

    distance_evaluator.update(edit_distances_batch1, seq_num_batch1)
    avg_distance, wrong_instance_ratio = distance_evaluator.eval()
    print("the average edit distance for batch0 and batch1 is %.2f and the wrong instance ratio is %.2f " % (avg_distance, wrong_instance_ratio))

    # reset discards the statistics of batch0 and batch1
    distance_evaluator.reset()

    edit_distances_batch2 = np.random.randint(low=0, high=10, size=(batch_size, 1))
    seq_num_batch2 = batch_size

    distance_evaluator.update(edit_distances_batch2, seq_num_batch2)
    avg_distance, wrong_instance_ratio = distance_evaluator.eval()
    print("the average edit distance for batch2 is %.2f and the wrong instance ratio is %.2f " % (avg_distance, wrong_instance_ratio))
update(distances, seq_num)
Update the accumulated edit distance statistics with the results of one mini-batch.
- Parameters
distances (numpy.array) – A numpy.array of shape (batch_size, 1). Each element is the edit distance between two sequences.
seq_num (int|float) – The number of sequence pairs.
eval()
Return two floats:
avg_distance – the average edit distance over all sequence pairs passed to the update function.
avg_instance_error – the ratio of sequence pairs whose edit distance is not zero.
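The relationship between update and eval can be sketched with a small stand-in class. `RunningEditDistance` and its attribute names are hypothetical, it takes nested lists rather than a numpy.array to stay dependency-free, and it is an assumption about the bookkeeping implied by the descriptions above, not Paddle's implementation.

```python
class RunningEditDistance:
    """Hypothetical stand-in that mirrors the documented behavior."""

    def __init__(self):
        self.total_distance = 0.0  # sum of all distances seen so far
        self.seq_num = 0           # number of sequence pairs seen so far
        self.instance_error = 0    # pairs whose distance is not zero

    def update(self, distances, seq_num):
        # `distances` is a nested list shaped like (batch_size, 1).
        flat = [row[0] for row in distances]
        self.total_distance += sum(flat)
        self.seq_num += seq_num
        self.instance_error += sum(1 for d in flat if d != 0)

    def eval(self):
        if self.seq_num == 0:
            raise ValueError("update must be called at least once before eval")
        avg_distance = self.total_distance / self.seq_num
        avg_instance_error = self.instance_error / self.seq_num
        return avg_distance, avg_instance_error


metric = RunningEditDistance()
metric.update([[0], [2], [1], [0]], 4)
print(metric.eval())  # (0.75, 0.5)
metric.update([[3], [0]], 2)
print(metric.eval())  # (1.0, 0.5)
```

Under this assumption, each call to eval reports statistics over every batch passed in so far, which matches the "batch0 and batch1" message in the example above.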
get_config()
Get the metric and its current states. The states are the members whose names do not begin with “_”.
- Parameters
None
- Returns
A Python dict that contains the inner states of the metric instance.
- Return type
dict
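One way to picture the "members without a “_” prefix" rule is the sketch below. `TinyMetric` is hypothetical, and the layout of the returned dict (the "name" and "states" keys) is an assumption made for illustration; the text above only says that a dict of inner states is returned.

```python
import copy


class TinyMetric:
    """Hypothetical metric used only to illustrate state collection."""

    def __init__(self, name):
        self._name = name           # leading underscore: not reported as a state
        self.total_distance = 7.0   # public member: reported as a state
        self.seq_num = 4            # public member: reported as a state

    def get_config(self):
        # Collect every instance attribute whose name does not start with "_".
        states = {
            key: value
            for key, value in vars(self).items()
            if not key.startswith("_")
        }
        # Deep-copy so callers cannot mutate the metric through the result.
        return {"name": self._name, "states": copy.deepcopy(states)}


config = TinyMetric("demo").get_config()
print(config)
```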
reset()
Clear the evaluation state accumulated from previous mini-batches.
- Parameters
None
- Returns
None
- Return type
None