Negative Sampling

Uniform negative sampler

class torchkge.sampling.UniformNegativeSampler(kg, kg_val=None, kg_test=None, n_neg=1)

Uniform negative sampler as presented in the 2013 paper by Bordes et al. Either the head or the tail of a triplet is replaced by another entity chosen at random. The choice between head and tail is uniform. This class inherits from the torchkge.sampling.NegativeSampler interface and therefore has its attributes as well.

References

Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. Translating Embeddings for Modeling Multi-relational Data. In Advances in Neural Information Processing Systems 26, pages 2787–2795, 2013. https://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data

Parameters

kg (torchkge.data_structures.KnowledgeGraph) – Main knowledge graph (usually training one).
kg_val (torchkge.data_structures.KnowledgeGraph (optional)) – Validation knowledge graph.
kg_test (torchkge.data_structures.KnowledgeGraph (optional)) – Test knowledge graph.
n_neg (int) – Number of negative samples to create from each fact.

corrupt_batch(heads, tails, relations=None, n_neg=None)

For each true triplet, produce a corrupted one, assumed to be different from any other true triplet. If heads and tails are cuda objects, then the returned tensors are on the GPU.

Parameters

heads (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of heads of the relations in the current batch.
tails (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of tails of the relations in the current batch.
relations (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of relations in the current batch. This is optional here and mainly present because of the interface with other NegativeSampler objects.
n_neg (int (opt)) – Number of negative samples to create from each fact. It overrides the value set at the construction of the sampler.

Returns

neg_heads (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of negatively sampled heads of the relations in the current batch.
neg_tails (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of negatively sampled tails of the relations in the current batch.
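The corruption scheme described above can be illustrated with a minimal plain-Python sketch. This is not torchkge's implementation: the function name uniform_corrupt, the n_ent and seed parameters, and the choice to tile the batch n_neg times (so the output has batch_size * n_neg entries) are all assumptions made for illustration. As in the description, no check is made that a corrupted triplet is actually absent from the graph.

```python
import random


def uniform_corrupt(heads, tails, n_ent, n_neg=1, seed=None):
    """Corrupt each (head, tail) pair n_neg times.

    Hypothetical helper: head or tail is chosen with probability 0.5 and
    replaced by an entity drawn uniformly from range(n_ent).
    """
    rng = random.Random(seed)
    # Tile the batch so that every fact receives n_neg corruptions (assumed layout).
    neg_heads = list(heads) * n_neg
    neg_tails = list(tails) * n_neg
    for i in range(len(neg_heads)):
        new_entity = rng.randrange(n_ent)  # drawn from all entities, no filtering
        if rng.random() < 0.5:
            neg_heads[i] = new_entity  # corrupt the head, keep the tail
        else:
            neg_tails[i] = new_entity  # corrupt the tail, keep the head
    return neg_heads, neg_tails


heads, tails = [0, 1, 2], [3, 4, 5]
neg_heads, neg_tails = uniform_corrupt(heads, tails, n_ent=6, n_neg=2, seed=0)
```

Exactly one side of each triplet is resampled, so for every negative either the head or the tail still matches the original fact.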

Bernoulli negative sampler

class torchkge.sampling.BernoulliNegativeSampler(kg, kg_val=None, kg_test=None, n_neg=1)

Bernoulli negative sampler as presented in the 2014 paper by Wang et al. Either the head or the tail of a triplet is replaced by another entity chosen at random. The choice between head and tail uses probabilities that take the profile of each relation into account. See the paper for more details. This class inherits from the torchkge.sampling.NegativeSampler interface and therefore has its attributes as well.

References

Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. Knowledge Graph Embedding by Translating on Hyperplanes. In Twenty-Eighth AAAI Conference on Artificial Intelligence, June 2014. https://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/view/8531

Parameters

kg (torchkge.data_structures.KnowledgeGraph) – Main knowledge graph (usually training one).
kg_val (torchkge.data_structures.KnowledgeGraph (optional)) – Validation knowledge graph.
kg_test (torchkge.data_structures.KnowledgeGraph (optional)) – Test knowledge graph.
n_neg (int) – Number of negative samples to create from each fact.

bern_probs

Bernoulli sampling probabilities. See paper for more details.

Type: torch.Tensor, dtype: torch.float, shape: (kg.n_rel)

corrupt_batch(heads, tails, relations, n_neg=None)

For each true triplet, produce a corrupted one assumed to be different from any other true triplet. If heads and tails are cuda objects, then the returned tensors are on the GPU.

Parameters

heads (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of heads of the relations in the current batch.
tails (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of tails of the relations in the current batch.
relations (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of relations in the current batch.
n_neg (int (opt)) – Number of negative samples to create from each fact. It overrides the value set at the construction of the sampler.

Returns

neg_heads (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of negatively sampled heads of the relations in the current batch.
neg_tails (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of negatively sampled tails of the relations in the current batch.
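As a rough illustration of relation-dependent corruption, the sketch below replaces the head with a per-relation probability and the tail otherwise. It is plain Python, not torchkge code: the name bernoulli_corrupt, the n_ent and seed parameters, and the convention that bern_probs[r] is the probability of corrupting the head are assumptions made for this example.

```python
import random


def bernoulli_corrupt(heads, tails, relations, bern_probs, n_ent, seed=None):
    """Corrupt one side of each triplet (hypothetical helper).

    bern_probs[r] is assumed to be the probability of replacing the head
    of a triplet involving relation r; otherwise the tail is replaced.
    """
    rng = random.Random(seed)
    neg_heads, neg_tails = list(heads), list(tails)
    for i, r in enumerate(relations):
        new_entity = rng.randrange(n_ent)  # uniform over all entities
        if rng.random() < bern_probs[r]:
            neg_heads[i] = new_entity
        else:
            neg_tails[i] = new_entity
    return neg_heads, neg_tails


heads, tails, relations = [0, 1, 2, 3], [4, 5, 6, 7], [0, 0, 1, 1]
bern_probs = [1.0, 0.0]  # relation 0: always corrupt head; relation 1: always corrupt tail
neg_heads, neg_tails = bernoulli_corrupt(
    heads, tails, relations, bern_probs, n_ent=8, seed=0
)
```

With the extreme probabilities used here, the tails of relation 0 and the heads of relation 1 are left untouched, which makes the effect of the per-relation probability easy to see.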

evaluate_probabilities()

Evaluate the Bernoulli probabilities for negative sampling as in the original TransH paper by Wang et al. (2014).
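In the TransH paper, the probability of corrupting the head for a relation is tph / (tph + hpt), where tph is the average number of tails per head and hpt the average number of heads per tail. The following plain-Python sketch computes that quantity from a list of (head, relation, tail) triples. The function name bernoulli_probs and the triple layout are assumptions for illustration, and whether torchkge stores exactly this quantity in bern_probs is not confirmed here.

```python
from collections import defaultdict


def bernoulli_probs(triples):
    """Return {relation: tph / (tph + hpt)} for (head, relation, tail) triples."""
    tails_of = defaultdict(set)  # (relation, head) -> distinct tails
    heads_of = defaultdict(set)  # (relation, tail) -> distinct heads
    for h, r, t in triples:
        tails_of[(r, h)].add(t)
        heads_of[(r, t)].add(h)

    tail_counts = defaultdict(list)  # relation -> number of tails of each head
    head_counts = defaultdict(list)  # relation -> number of heads of each tail
    for (r, _), ts in tails_of.items():
        tail_counts[r].append(len(ts))
    for (r, _), hs in heads_of.items():
        head_counts[r].append(len(hs))

    probs = {}
    for r in tail_counts:
        tph = sum(tail_counts[r]) / len(tail_counts[r])  # tails per head
        hpt = sum(head_counts[r]) / len(head_counts[r])  # heads per tail
        probs[r] = tph / (tph + hpt)
    return probs


# Relation 0 is one-to-many (one head, three tails);
# relation 1 is many-to-one (two heads, one tail).
triples = [(0, 0, 1), (0, 0, 2), (0, 0, 3), (1, 1, 0), (2, 1, 0)]
probs = bernoulli_probs(triples)
```

For the one-to-many relation the head is corrupted more often (3 / (3 + 1)), since replacing the single head is less likely to produce another true triplet than replacing one of its many tails.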

Positional negative sampler

class torchkge.sampling.PositionalNegativeSampler(kg, kg_val=None, kg_test=None)

Positional negative sampler as presented in the 2013 paper by Socher et al. Either the head or the tail of a triplet is replaced by another entity chosen among entities that have already appeared at the same place in a triplet (involving the same relation). It is not clear in the paper how the choice between head and tail is made. We chose to use Bernoulli sampling as in the 2014 paper by Wang et al., as we believe it serves the same purpose as the original paper. This class inherits from the torchkge.sampling.BernoulliNegativeSampler class, seen as an interface, and therefore has its attributes as well.

References

Richard Socher, Danqi Chen, Christopher D Manning, and Andrew Ng. Reasoning With Neural Tensor Networks for Knowledge Base Completion. In Advances in Neural Information Processing Systems 26, pages 926–934, 2013. https://nlp.stanford.edu/pubs/SocherChenManningNg_NIPS2013.pdf
Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. Knowledge Graph Embedding by Translating on Hyperplanes. In Twenty-Eighth AAAI Conference on Artificial Intelligence, June 2014. https://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/view/8531

Parameters

kg (torchkge.data_structures.KnowledgeGraph) – Main knowledge graph (usually training one).
kg_val (torchkge.data_structures.KnowledgeGraph (optional)) – Validation knowledge graph.
kg_test (torchkge.data_structures.KnowledgeGraph (optional)) – Test knowledge graph.

possible_heads

Keys: relations; values: list of possible heads for each relation.

Type: dict

possible_tails

Keys: relations; values: list of possible tails for each relation.

Type: dict

n_poss_heads

List of number of possible heads for each relation.

Type: list

n_poss_tails

List of number of possible tails for each relation.

Type: list

corrupt_batch(heads, tails, relations, n_neg=None)

For each true triplet, produce a corrupted one, assumed to be different from any other true triplet. If heads and tails are cuda objects, then the returned tensors are on the GPU.

Parameters

heads (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of heads of the relations in the current batch.
tails (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of tails of the relations in the current batch.
relations (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of relations in the current batch.

Returns

neg_heads (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of negatively sampled heads of the relations in the current batch.
neg_tails (torch.Tensor, dtype: torch.long, shape: (batch_size)) – Tensor containing the integer key of negatively sampled tails of the relations in the current batch.
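The positional scheme can be sketched in plain Python as follows: the side to corrupt is chosen per relation with a Bernoulli draw, and the replacement is picked only among entities already seen at that position for that relation. The function name positional_corrupt, the seed parameter, and the meaning of bern_probs[r] as the probability of corrupting the head are assumptions for illustration, not torchkge's actual code.

```python
import random


def positional_corrupt(heads, tails, relations, possible_heads,
                       possible_tails, bern_probs, seed=None):
    """Corrupt one side of each triplet with a position-compatible entity.

    Hypothetical helper: possible_heads[r] and possible_tails[r] list the
    entities observed as head and tail of relation r.
    """
    rng = random.Random(seed)
    neg_heads, neg_tails = list(heads), list(tails)
    for i, r in enumerate(relations):
        if rng.random() < bern_probs[r]:
            # Replace the head by an entity already seen as a head of r.
            neg_heads[i] = rng.choice(possible_heads[r])
        else:
            # Replace the tail by an entity already seen as a tail of r.
            neg_tails[i] = rng.choice(possible_tails[r])
    return neg_heads, neg_tails


possible_heads = {0: [0, 1], 1: [2]}
possible_tails = {0: [3], 1: [4, 5]}
bern_probs = [1.0, 0.0]  # relation 0: corrupt head; relation 1: corrupt tail
heads, tails, relations = [0, 2], [3, 4], [0, 1]
neg_heads, neg_tails = positional_corrupt(
    heads, tails, relations, possible_heads, possible_tails, bern_probs, seed=0
)
```

Restricting candidates to position-compatible entities tends to yield harder negatives than sampling from the full entity set, at the cost of precomputing the candidate lists.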

find_possibilities()

For each relation of the knowledge graph (and possibly the validation graph, but not the test graph), find all the possible heads and tails in the sense of Wang et al., i.e. all entities that occupy this position at least once in another triplet involving the same relation.

Returns

possible_heads (dict) – keys : relation index, values : list of possible heads
possible_tails (dict) – keys : relation index, values : list of possible tails
n_poss_heads (torch.Tensor, dtype: torch.long, shape: (n_relations)) – Number of possible heads for each relation.
n_poss_tails (torch.Tensor, dtype: torch.long, shape: (n_relations)) – Number of possible tails for each relation.
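A minimal plain-Python sketch of this bookkeeping is given below. It assumes triples are available as (head, relation, tail) tuples and returns plain lists instead of tensors; the standalone function signature, including the n_rel parameter, is hypothetical and only mirrors the four documented return values.

```python
from collections import defaultdict


def find_possibilities(triples, n_rel):
    """Collect, per relation, the entities seen as head and as tail."""
    heads_seen = defaultdict(set)
    tails_seen = defaultdict(set)
    for h, r, t in triples:
        heads_seen[r].add(h)
        tails_seen[r].add(t)

    # Sorted lists give a deterministic order; relations with no triple get [].
    possible_heads = {r: sorted(heads_seen[r]) for r in range(n_rel)}
    possible_tails = {r: sorted(tails_seen[r]) for r in range(n_rel)}
    n_poss_heads = [len(possible_heads[r]) for r in range(n_rel)]
    n_poss_tails = [len(possible_tails[r]) for r in range(n_rel)]
    return possible_heads, possible_tails, n_poss_heads, n_poss_tails


triples = [(0, 0, 1), (0, 0, 2), (0, 0, 3), (1, 1, 0), (2, 1, 0)]
possible_heads, possible_tails, n_poss_heads, n_poss_tails = find_possibilities(
    triples, n_rel=2
)
```

To include a validation graph as described above, its triples would simply be concatenated with the training triples before the call.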