tdm_sampler

paddle.fluid.contrib.layers.nn.tdm_sampler(x, neg_samples_num_list, layer_node_num_list, leaf_node_num, tree_travel_attr=None, tree_layer_attr=None, output_positive=True, output_list=True, seed=0, tree_dtype='int32', dtype='int32')

TDM Sampler. Given the input positive samples at the leaf nodes (x), performs negative sampling layer by layer on the given tree.

Given:

    tree = [[0], [1, 2], [3, 4], [5, 6]]  # a binary tree with seven nodes
    travel_list = [[1, 3], [1, 4], [2, 5], [2, 6]]  # each leaf node's travel path (excluding the root node)
    layer_list = [[1, 2], [3, 4, 5, 6]]  # two layers (excluding the root node)

    x = [[0], [1], [2], [3]]  # corresponds to leaf nodes [[3], [4], [5], [6]]
    neg_samples_num_list = [0, 0]  # no negative samples in either layer
    layer_node_num_list = [2, 4]
    leaf_node_num = 4
    output_list = False

we get:

    out = [[1, 3], [1, 4], [2, 5], [2, 6]]
    labels = [[1, 1], [1, 1], [1, 1], [1, 1]]
    mask = [[1, 1], [1, 1], [1, 1], [1, 1]]
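The worked example above can be reproduced with a plain-Python lookup. This is only a sketch of the zero-negative-sample case on a balanced tree, not Paddle's kernel; the helper name `positive_paths` is hypothetical and exists only for illustration.

```python
# Illustrative only: mirrors the worked example, not Paddle's implementation.
# Assumes a balanced tree (no padding) and zero negative samples per layer.

def positive_paths(x, travel_list):
    """Return (out, labels, mask) when only positive samples are emitted."""
    out, labels, mask = [], [], []
    for (item_id,) in x:
        path = travel_list[item_id]      # node ids from the first layer down to the leaf
        out.append(list(path))
        labels.append([1] * len(path))   # every entry is a positive sample
        mask.append([1] * len(path))     # balanced tree: every entry is a real sample
    return out, labels, mask


travel_list = [[1, 3], [1, 4], [2, 5], [2, 6]]
x = [[0], [1], [2], [3]]

out, labels, mask = positive_paths(x, travel_list)
print(out)
print(labels)
print(mask)
```

Each item id in x simply indexes one row of travel_list, which is why out equals travel_list when x enumerates all four leaves in order.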

Parameters
  • x (Variable) – Variable containing the item_id information (each item_id corresponds to a leaf node). Supported dtypes: int32, int64.

  • neg_samples_num_list (list(int)) – Number of negative samples per layer.

  • layer_node_num_list (list(int)) – Number of nodes per layer. Must have the same length as neg_samples_num_list.

  • leaf_node_num (int) – Number of leaf nodes.

  • tree_travel_attr (ParamAttr) – Specifies the properties of the tdm-travel parameter. Default: None, which means the default weight parameter property is used. See fluid.ParamAttr for usage details. The parameter should have shape (leaf_node_num, len(layer_node_num_list)). Supported dtypes: int32, int64.

  • tree_layer_attr (ParamAttr) – Specifies the properties of the tdm-layer parameter. Default: None, which means the default weight parameter property is used. See fluid.ParamAttr for usage details. The parameter should have shape (node_num, 1). Supported dtypes: int32, int64.

  • output_positive (bool) – Whether to output positive samples (including their labels and masks) together with the negative samples.

  • output_list (bool) – Whether to divide the output into layers and organize it into list format.

  • seed (int) – The random seed.

  • tree_dtype (np.dtype|core.VarDesc.VarType|str) – The dtype of the tdm-travel and tdm-layer parameters. Supported dtypes: int32, int64.

  • dtype (np.dtype|core.VarDesc.VarType|str) – The dtype of the outputs (sampling results, labels and masks).
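The shape constraints in the parameter list can be checked before the arrays are handed to the initializers. The sketch below is a hypothetical pre-flight helper (`check_tree_tables` and `flatten_layers` are not part of Paddle). It assumes, as in the example on this page, that node_num counts every non-root node, so it equals the sum of layer_node_num_list.

```python
# Hypothetical validation helpers; names and the node_num assumption are illustrative.

def flatten_layers(layer_list):
    """Flatten per-layer node ids into rows of shape (node_num, 1)."""
    return [[node] for layer in layer_list for node in layer]


def check_tree_tables(travel_list, layer_list, neg_samples_num_list,
                      layer_node_num_list, leaf_node_num):
    """Raise ValueError if the tables disagree with the documented shapes."""
    if len(neg_samples_num_list) != len(layer_node_num_list):
        raise ValueError("neg_samples_num_list and layer_node_num_list differ in length")
    if len(travel_list) != leaf_node_num:
        raise ValueError("travel table needs one row per leaf node")
    if any(len(path) != len(layer_node_num_list) for path in travel_list):
        raise ValueError("each travel path needs one entry per layer")
    if [len(layer) for layer in layer_list] != list(layer_node_num_list):
        raise ValueError("layer sizes do not match layer_node_num_list")
    # Assumption: node_num is the number of non-root nodes.
    return flatten_layers(layer_list)


layer_list_flat = check_tree_tables(
    travel_list=[[1, 3], [1, 4], [2, 5], [2, 6]],
    layer_list=[[1, 2], [3, 4, 5, 6]],
    neg_samples_num_list=[0, 0],
    layer_node_num_list=[2, 4],
    leaf_node_num=4,
)
print(layer_list_flat)
```

Running the check on plain lists keeps shape mistakes out of the graph, where they tend to surface as less readable errors.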

Returns

A tuple containing the sampling results and the corresponding labels and masks. If output_positive = True, the sampling results include both positive and negative samples. A positive sample has label 1 and a negative sample has label 0. If the tree is unbalanced, padding samples are added so that the shape of the sampling results stays consistent; a padding sample has mask 0 and a real sample has mask 1. If output_list = True, the results are organized into a list with one entry per layer. The output variables have the same dtype as the tdm-travel and tdm-layer parameters (tree_dtype).
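To make the label and mask conventions concrete, here is a small plain-Python sketch of sampling within a single layer. It is not Paddle's sampler: it assumes uniform sampling without replacement, a balanced tree (so every mask value is 1), and a negative count no larger than the layer size minus one. The helper name `sample_layer` is hypothetical.

```python
import random

# Illustrative only: one-layer negative sampling under the assumptions stated above.

def sample_layer(positive, layer_nodes, neg_num, rng, output_positive=True):
    """Return (samples, labels, mask) for one layer of the tree."""
    candidates = [node for node in layer_nodes if node != positive]
    negatives = rng.sample(candidates, neg_num)      # distinct nodes, never the positive
    samples = ([positive] if output_positive else []) + negatives
    labels = ([1] if output_positive else []) + [0] * neg_num
    mask = [1] * len(samples)                        # no padding on a balanced tree
    return samples, labels, mask


rng = random.Random(0)
layer_list = [[1, 2], [3, 4, 5, 6]]
path = [1, 4]                 # travel path of the leaf with item_id 1
neg_samples_num_list = [1, 2]

for positive, nodes, neg_num in zip(path, layer_list, neg_samples_num_list):
    samples, labels, mask = sample_layer(positive, nodes, neg_num, rng)
    print(samples, labels, mask)
```

With output_positive enabled, each layer contributes neg_num + 1 entries in this sketch: the positive node first, followed by the sampled negatives.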

Return type

tuple

Examples


    import paddle.fluid as fluid
    import numpy as np

    x = fluid.data(name="x", shape=[None, 1], dtype="int32", lod_level=1)
    travel_list = [[1, 3], [1, 4], [2, 5], [2, 6]]  # leaf node's travel path, shape (leaf_node_num, layer_num)
    layer_list_flat = [[1], [2], [3], [4], [5], [6]]  # shape (node_num, 1)

    neg_samples_num_list = [0, 0]  # no negative samples in either layer
    layer_node_num_list = [2, 4]  # two layers (excluding the root node)
    leaf_node_num = 4

    travel_array = np.array(travel_list)
    layer_array = np.array(layer_list_flat)

    sample, label, mask = fluid.contrib.layers.tdm_sampler(
        x,
        neg_samples_num_list,
        layer_node_num_list,
        leaf_node_num,
        tree_travel_attr=fluid.ParamAttr(
            initializer=fluid.initializer.NumpyArrayInitializer(
                travel_array)),
        tree_layer_attr=fluid.ParamAttr(
            initializer=fluid.initializer.NumpyArrayInitializer(
                layer_array)),
        output_positive=True,
        output_list=True,
        seed=0,
        tree_dtype='int32')

    place = fluid.CPUPlace()
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())
    xx = np.array([[0], [1]]).reshape((2, 1)).astype("int32")

    exe.run(feed={"x": xx})