tdm_sampler
- paddle.fluid.contrib.layers.nn.tdm_sampler(x, neg_samples_num_list, layer_node_num_list, leaf_node_num, tree_travel_attr=None, tree_layer_attr=None, output_positive=True, output_list=True, seed=0, tree_dtype='int32', dtype='int32')
TDM Sampler

According to the input positive samples at the leaf nodes (x), perform negative sampling layer by layer on the given tree.

Given:

    tree[[0], [1, 2], [3, 4], [5, 6]]  # A binary tree with seven nodes
    travel_list = [[1, 3], [1, 4], [2, 5], [2, 6]]  # each leaf node's travel path (excluding the root node)
    layer_list = [[1, 2], [3, 4, 5, 6]]  # two layers (excluding the root node)

    x = [[0], [1], [2], [3]]  # corresponding to leaf nodes [[3], [4], [5], [6]]
    neg_samples_num_list = [0, 0]  # number of negative samples per layer = 0
    layer_node_num_list = [2, 4]
    leaf_node_num = 4
    output_list = False

we get:

    out = [[1, 3], [1, 4], [2, 5], [2, 6]]
    labels = [[1, 1], [1, 1], [1, 1], [1, 1]]
    mask = [[1, 1], [1, 1], [1, 1], [1, 1]]
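The worked example above can be reproduced with a small pure-Python sketch. This is only an illustration of layer-wise sampling, not Paddle's kernel: the function name `tdm_sample_reference` is hypothetical, and two behaviours are assumptions rather than documented facts, namely that node id 0 on a travel path marks padding and that negatives are drawn without replacement from the same layer, excluding the positive node.

```python
import random


def tdm_sample_reference(x, travel_list, layer_list, neg_samples_num_list,
                         output_positive=True, seed=0):
    """Illustrative layer-wise sampler; not Paddle's actual kernel."""
    rng = random.Random(seed)
    out, labels, mask = [], [], []
    for (item_id,) in x:
        row_out, row_labels, row_mask = [], [], []
        # Walk the leaf's travel path, one positive node per layer.
        for layer_idx, positive in enumerate(travel_list[item_id]):
            neg_num = neg_samples_num_list[layer_idx]
            if positive == 0:
                # Assumption: node id 0 marks padding for a shorter path
                # in an unbalanced tree, so every slot gets mask 0.
                width = neg_num + int(output_positive)
                row_out += [0] * width
                row_labels += [0] * width
                row_mask += [0] * width
                continue
            if output_positive:
                row_out.append(positive)
                row_labels.append(1)
                row_mask.append(1)
            # Assumption: negatives are distinct nodes of the same layer
            # and never equal to the positive node.
            candidates = [n for n in layer_list[layer_idx] if n != positive]
            negatives = rng.sample(candidates, neg_num)
            row_out += negatives
            row_labels += [0] * neg_num
            row_mask += [1] * neg_num
        out.append(row_out)
        labels.append(row_labels)
        mask.append(row_mask)
    return out, labels, mask


travel_list = [[1, 3], [1, 4], [2, 5], [2, 6]]
layer_list = [[1, 2], [3, 4, 5, 6]]
x = [[0], [1], [2], [3]]

out, labels, mask = tdm_sample_reference(x, travel_list, layer_list, [0, 0])
print(out)     # [[1, 3], [1, 4], [2, 5], [2, 6]]
print(labels)  # [[1, 1], [1, 1], [1, 1], [1, 1]]
print(mask)    # [[1, 1], [1, 1], [1, 1], [1, 1]]
```

With non-zero entries in `neg_samples_num_list`, each layer in this sketch contributes one positive followed by that layer's negatives, so the row width becomes the sum of `neg + 1` over all layers.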
- Parameters
x (Variable) – Variable containing the item_id information (each item_id corresponds to a leaf node). Supported dtypes: int32, int64.
neg_samples_num_list (list(int)) – Number of negative samples per layer.
layer_node_num_list (list(int)) – Number of nodes per layer. Must have the same length as neg_samples_num_list.
leaf_node_num (int) – Number of leaf nodes.
tree_travel_attr (ParamAttr) – Specifies the properties of the tdm-travel parameter. Default: None, which means the default weight parameter property is used. See api_fluid_ParamAttr for usage details. The parameter should have shape (leaf_node_num, len(layer_node_num_list)). Supported dtypes: int32, int64.
tree_layer_attr (ParamAttr) – Specifies the properties of the tdm-layer parameter. Default: None, which means the default weight parameter property is used. See api_fluid_ParamAttr for usage details. The parameter should have shape (node_num, 1). Supported dtypes: int32, int64.
output_positive (bool) – Whether to output the positive samples (including their labels and masks) together with the negative samples. Default: True.
output_list (bool) – Whether to split the output by layer and organize it into a list. Default: True.
seed (int) – The random seed. Default: 0.
tree_dtype (np.dtype|core.VarDesc.VarType|str) – The dtype of tdm-travel and tdm-layer. Supported dtypes: int32, int64. Default: 'int32'.
dtype (np.dtype|core.VarDesc.VarType|str) – The dtype of the output (sampling results, labels and masks). Default: 'int32'.
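The shape constraints in the parameter list can be checked before any parameters are built. The helper below is a hypothetical sketch (`check_tdm_inputs` is not part of Paddle), and it assumes that node_num equals the sum of layer_node_num_list, which matches the six-node layer list in the example on this page but is not stated explicitly.

```python
def check_tdm_inputs(travel_list, layer_list_flat, neg_samples_num_list,
                     layer_node_num_list, leaf_node_num):
    """Return the expected (travel_shape, layer_shape) or raise ValueError."""
    layer_num = len(layer_node_num_list)
    if len(neg_samples_num_list) != layer_num:
        raise ValueError("neg_samples_num_list and layer_node_num_list "
                         "must have the same length")
    if len(travel_list) != leaf_node_num:
        raise ValueError("travel_list needs one row per leaf node")
    if any(len(path) != layer_num for path in travel_list):
        raise ValueError("every travel path needs one entry per layer")
    # Assumption: node_num is the total node count over all layers.
    node_num = sum(layer_node_num_list)
    if len(layer_list_flat) != node_num:
        raise ValueError("layer_list_flat needs one row per tree node")
    if any(len(row) != 1 for row in layer_list_flat):
        raise ValueError("layer_list_flat must have shape (node_num, 1)")
    return (leaf_node_num, layer_num), (node_num, 1)


shapes = check_tdm_inputs(
    travel_list=[[1, 3], [1, 4], [2, 5], [2, 6]],
    layer_list_flat=[[1], [2], [3], [4], [5], [6]],
    neg_samples_num_list=[0, 0],
    layer_node_num_list=[2, 4],
    leaf_node_num=4,
)
print(shapes)  # ((4, 2), (6, 1))
```

Running a check like this on the plain Python lists catches mismatches before they are wrapped in initializers.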
- Returns
A tuple of sampling results, corresponding labels, and masks. If output_positive = True, the sampling results include both positive and negative samples. A positive sample has label 1 and a negative sample has label 0. If the tree is unbalanced, the sampling results are padded to keep their shape consistent: a padding sample has mask = 0 and a real sample has mask = 1. If output_list = True, the results are organized into a list, one entry per layer. The output variables have the same type as the tdm-travel and tdm-layer parameters (tree_dtype).
- Return type
tuple
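To make the per-layer grouping concrete, the sketch below splits flat result rows into one block per layer. It is an illustration only: `split_by_layer` is a hypothetical helper, and it assumes each layer occupies `neg_samples_num + 1` consecutive columns when output_positive is True (or `neg_samples_num` columns otherwise). The exact tensor shapes Paddle returns for output_list = True may differ from these nested lists.

```python
def split_by_layer(rows, neg_samples_num_list, output_positive=True):
    """Group the columns of each flat row into one block per tree layer."""
    # Assumption: one positive slot per layer plus that layer's negatives.
    widths = [n + int(output_positive) for n in neg_samples_num_list]
    layers = []
    start = 0
    for width in widths:
        layers.append([row[start:start + width] for row in rows])
        start += width
    return layers


flat_out = [[1, 3], [1, 4], [2, 5], [2, 6]]
per_layer = split_by_layer(flat_out, neg_samples_num_list=[0, 0])
print(per_layer[0])  # [[1], [1], [2], [2]]
print(per_layer[1])  # [[3], [4], [5], [6]]
```

The same split applies to the labels and masks, since they share the layout of the sampling results.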
Examples
import numpy as np
import paddle.fluid as fluid

x = fluid.data(name="x", shape=[None, 1], dtype="int32", lod_level=1)
travel_list = [[1, 3], [1, 4], [2, 5], [2, 6]]  # leaf node's travel path, shape (leaf_node_num, layer_num)
layer_list_flat = [[1], [2], [3], [4], [5], [6]]  # shape (node_nums, 1)

neg_samples_num_list = [0, 0]  # number of negative samples per layer = 0
layer_node_num_list = [2, 4]  # two layers (excluding the root node)
leaf_node_num = 4

travel_array = np.array(travel_list)
layer_array = np.array(layer_list_flat)

sample, label, mask = fluid.contrib.layers.tdm_sampler(
    x,
    neg_samples_num_list,
    layer_node_num_list,
    leaf_node_num,
    tree_travel_attr=fluid.ParamAttr(
        initializer=fluid.initializer.NumpyArrayInitializer(
            travel_array)),
    tree_layer_attr=fluid.ParamAttr(
        initializer=fluid.initializer.NumpyArrayInitializer(
            layer_array)),
    output_positive=True,
    output_list=True,
    seed=0,
    tree_dtype='int32')

place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
xx = np.array([[0], [1]]).reshape((2, 1)).astype("int32")

exe.run(feed={"x": xx})