Variational Inference
DependencyMFVI
- class supar.structs.vi.DependencyMFVI(max_iter: int = 3)
Mean Field Variational Inference for approximately calculating marginals of dependency trees (Wang & Tu, 2020).
- forward(scores: List[torch.Tensor], mask: torch.BoolTensor, target: Optional[torch.LongTensor] = None) → Tuple[torch.Tensor, torch.Tensor]
- Parameters
  - scores (Tensor, Tensor) – Tuple of two tensors s_arc and s_sib. s_arc ([batch_size, seq_len, seq_len]) holds the scores of all possible dependent-head pairs. s_sib ([batch_size, seq_len, seq_len, seq_len]) holds the scores of dependent-head-sibling triples.
  - mask (BoolTensor) – [batch_size, seq_len]. The mask to avoid aggregation on padding tokens.
  - target (LongTensor) – [batch_size, seq_len]. A Tensor of gold-standard dependent-head pairs. Default: None.
- Returns
  A tuple of two tensors. The first is the training loss averaged over the number of tokens; it is not returned if target=None. The second holds the marginals, with shape [batch_size, seq_len, seq_len].
- Return type
  Tuple[torch.Tensor, torch.Tensor]
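As an illustration of the mean-field idea, the sketch below runs the iteration for a single sentence using plain Python lists rather than tensors. It is not supar's implementation: the names `mean_field_heads` and `softmax`, and the index convention (`s_arc[i][j]` for dependent i with head j, `s_sib[i][j][k]` for i and k sharing head j), are assumptions made for this example. Masking, batching, and any special handling of the root token are omitted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]


def mean_field_heads(s_arc, s_sib, max_iter=3):
    """Approximate head marginals for one sentence (illustrative only).

    s_arc[i][j]    : score of token i taking token j as its head (assumed layout)
    s_sib[i][j][k] : score of tokens i and k both taking j as head (assumed layout)
    """
    n = len(s_arc)
    logits = [row[:] for row in s_arc]
    for _ in range(max_iter):
        # Current beliefs: one categorical distribution over heads per dependent.
        q = [softmax(row) for row in logits]
        # Each arc score is adjusted by the expected sibling scores under q.
        logits = [
            [
                s_arc[i][j]
                + sum(q[k][j] * s_sib[i][j][k] for k in range(n) if k not in (i, j))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return [softmax(row) for row in logits]
```

With all sibling scores set to zero, the result reduces to a row-wise softmax over `s_arc`, which is a quick sanity check on the update.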
DependencyLBP
- class supar.structs.vi.DependencyLBP(max_iter: int = 3)
Loopy Belief Propagation for approximately calculating marginals of dependency trees (Smith & Eisner, 2008).
- forward(scores: List[torch.Tensor], mask: torch.BoolTensor, target: Optional[torch.LongTensor] = None) → Tuple[torch.Tensor, torch.Tensor]
- Parameters
  - scores (Tensor, Tensor) – Tuple of two tensors s_arc and s_sib. s_arc ([batch_size, seq_len, seq_len]) holds the scores of all possible dependent-head pairs. s_sib ([batch_size, seq_len, seq_len, seq_len]) holds the scores of dependent-head-sibling triples.
  - mask (BoolTensor) – [batch_size, seq_len]. The mask to avoid aggregation on padding tokens.
  - target (LongTensor) – [batch_size, seq_len]. A Tensor of gold-standard dependent-head pairs. Default: None.
- Returns
  A tuple of two tensors. The first is the training loss averaged over the number of tokens; it is not returned if target=None. The second holds the marginals, with shape [batch_size, seq_len, seq_len].
- Return type
  Tuple[torch.Tensor, torch.Tensor]
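The return contract described above can be sketched in a few lines of plain Python. This assumes the loss is a cross-entropy between the marginals and the gold heads, which the text does not state, and that only the marginals come back when no target is given. `loss_and_marginals` is a hypothetical helper for illustration, not part of supar.

```python
import math


def loss_and_marginals(marginals, mask, target=None):
    """Mimic the documented return contract (illustrative only).

    marginals[b][i][j] : probability that token i of sentence b has head j
    mask[b][i]         : False for padding tokens
    target[b][i]       : gold head index of token i
    """
    if target is None:
        # Assumption: without gold heads there is no loss to report.
        return marginals
    total, count = 0.0, 0
    for b, sentence in enumerate(marginals):
        for i, row in enumerate(sentence):
            if mask[b][i]:  # padding tokens contribute nothing
                total -= math.log(row[target[b][i]])
                count += 1
    # Average over real tokens, not over sentences (cross-entropy is assumed).
    return total / max(count, 1), marginals
```

Averaging over unmasked tokens rather than over sentences keeps long and short sentences on the same per-token scale.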
ConstituencyMFVI
- class supar.structs.vi.ConstituencyMFVI(max_iter: int = 3)
Mean Field Variational Inference for approximately calculating marginals of constituent trees.
- forward(scores: List[torch.Tensor], mask: torch.BoolTensor, target: Optional[torch.LongTensor] = None) → Tuple[torch.Tensor, torch.Tensor]
- Parameters
  - scores (Tensor, Tensor) – Tuple of two tensors s_span and s_pair. s_span ([batch_size, seq_len, seq_len]) holds the scores of all possible spans. s_pair ([batch_size, seq_len, seq_len, seq_len]) holds the scores of second-order triples.
  - mask (BoolTensor) – [batch_size, seq_len, seq_len]. The mask to avoid aggregation on padding tokens.
  - target (BoolTensor) – [batch_size, seq_len, seq_len]. A Tensor of gold-standard spans. Default: None.
- Returns
  A tuple of two tensors. The first is the training loss averaged over the number of tokens; it is not returned if target=None. The second holds the marginals, with shape [batch_size, seq_len, seq_len].
- Return type
  Tuple[torch.Tensor, torch.Tensor]
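Unlike the dependency classes, the mask here is three-dimensional. One plausible way to build such a mask from sentence lengths is sketched below with nested lists. The convention that a span (i, j) is valid when i < j and both positions are non-padding is an assumption, and `span_mask` is a hypothetical helper rather than a supar function.

```python
def span_mask(lengths, seq_len):
    """Build mask[b][i][j] for spans over non-padding positions (illustrative only).

    lengths : number of non-padding positions in each sentence
    seq_len : padded length shared by the whole batch
    """
    return [
        # Assumption: only the strict upper triangle within the sentence is valid.
        [[i < j < n for j in range(seq_len)] for i in range(seq_len)]
        for n in lengths
    ]
```

For a batch with lengths 3 and 2 padded to 3, this marks three valid spans in the first sentence and one in the second.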
ConstituencyLBP
- class supar.structs.vi.ConstituencyLBP(max_iter: int = 3)
Loopy Belief Propagation for approximately calculating marginals of constituent trees.
- forward(scores: List[torch.Tensor], mask: torch.BoolTensor, target: Optional[torch.LongTensor] = None) → Tuple[torch.Tensor, torch.Tensor]
- Parameters
  - scores (Tensor, Tensor) – Tuple of two tensors s_span and s_pair. s_span ([batch_size, seq_len, seq_len]) holds the scores of all possible spans. s_pair ([batch_size, seq_len, seq_len, seq_len]) holds the scores of second-order triples.
  - mask (BoolTensor) – [batch_size, seq_len, seq_len]. The mask to avoid aggregation on padding tokens.
  - target (BoolTensor) – [batch_size, seq_len, seq_len]. A Tensor of gold-standard spans. Default: None.
- Returns
  A tuple of two tensors. The first is the training loss averaged over the number of tokens; it is not returned if target=None. The second holds the marginals, with shape [batch_size, seq_len, seq_len].
- Return type
  Tuple[torch.Tensor, torch.Tensor]
SemanticDependencyMFVI
- class supar.structs.vi.SemanticDependencyMFVI(max_iter: int = 3)
Mean Field Variational Inference for approximately calculating marginals of semantic dependency trees (Wang et al., 2019).
- forward(scores: List[torch.Tensor], mask: torch.BoolTensor, target: Optional[torch.LongTensor] = None) → Tuple[torch.Tensor, torch.Tensor]
- Parameters
  - scores (Tensor, Tensor) – Tuple of four tensors s_edge, s_sib, s_cop and s_grd. s_edge ([batch_size, seq_len, seq_len]) holds the scores of all possible dependent-head pairs. s_sib ([batch_size, seq_len, seq_len, seq_len]) holds the scores of dependent-head-sibling triples. s_cop ([batch_size, seq_len, seq_len, seq_len]) holds the scores of dependent-head-coparent triples. s_grd ([batch_size, seq_len, seq_len, seq_len]) holds the scores of dependent-head-grandparent triples.
  - mask (BoolTensor) – [batch_size, seq_len, seq_len]. The mask to avoid aggregation on padding tokens.
  - target (LongTensor) – [batch_size, seq_len, seq_len]. A Tensor of gold-standard dependent-head pairs. Default: None.
- Returns
  A tuple of two tensors. The first is the training loss averaged over the number of tokens; it is not returned if target=None. The second holds the marginals, with shape [batch_size, seq_len, seq_len].
- Return type
  Tuple[torch.Tensor, torch.Tensor]
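To show how the four score tensors could interact, here is a single-sentence mean-field sketch in plain Python. It treats every edge as an independent on/off variable whose marginal comes from a sigmoid, which is an assumption not stated in the text. The names `mean_field_edges` and `sigmoid`, and the reading of the third index as the sibling, coparent, or grandparent token, are likewise chosen for this example and are not taken from supar.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mean_field_edges(s_edge, s_sib, s_cop, s_grd, max_iter=3):
    """Approximate edge marginals for one sentence (illustrative only).

    s_edge[i][j]   : score of an edge from head j to dependent i (assumed layout)
    s_sib[i][j][k] : i and k are both dependents of j
    s_cop[i][j][k] : j and k are both heads of i
    s_grd[i][j][k] : k is a head of j, which is a head of i
    """
    n = len(s_edge)
    logits = [row[:] for row in s_edge]
    for _ in range(max_iter):
        # Assumption: each edge is a Bernoulli variable, so beliefs use a sigmoid.
        q = [[sigmoid(x) for x in row] for row in logits]
        logits = [
            [
                s_edge[i][j]
                + sum(
                    q[k][j] * s_sib[i][j][k]    # expected sibling support
                    + q[i][k] * s_cop[i][j][k]  # expected coparent support
                    + q[j][k] * s_grd[i][j][k]  # expected grandparent support
                    for k in range(n)
                    if k not in (i, j)
                )
                for j in range(n)
            ]
            for i in range(n)
        ]
    return [[sigmoid(x) for x in row] for row in logits]
```

Because edges are scored independently here, a token can end up with several likely heads or none, which is the main difference from the per-token softmax used in the dependency sketch.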
SemanticDependencyLBP
- class supar.structs.vi.SemanticDependencyLBP(max_iter: int = 3)
Loopy Belief Propagation for approximately calculating marginals of semantic dependency trees (Wang et al., 2019).
- forward(scores: List[torch.Tensor], mask: torch.BoolTensor, target: Optional[torch.LongTensor] = None) → Tuple[torch.Tensor, torch.Tensor]
- Parameters
  - scores (Tensor, Tensor) – Tuple of four tensors s_edge, s_sib, s_cop and s_grd. s_edge ([batch_size, seq_len, seq_len]) holds the scores of all possible dependent-head pairs. s_sib ([batch_size, seq_len, seq_len, seq_len]) holds the scores of dependent-head-sibling triples. s_cop ([batch_size, seq_len, seq_len, seq_len]) holds the scores of dependent-head-coparent triples. s_grd ([batch_size, seq_len, seq_len, seq_len]) holds the scores of dependent-head-grandparent triples.
  - mask (BoolTensor) – [batch_size, seq_len, seq_len]. The mask to avoid aggregation on padding tokens.
  - target (LongTensor) – [batch_size, seq_len, seq_len]. A Tensor of gold-standard dependent-head pairs. Default: None.
- Returns
  A tuple of two tensors. The first is the training loss averaged over the number of tokens; it is not returned if target=None. The second holds the marginals, with shape [batch_size, seq_len, seq_len].
- Return type
  Tuple[torch.Tensor, torch.Tensor]