Affine Layers
Biaffine
- class supar.modules.affine.Biaffine(n_in: int, n_out: int = 1, n_proj: int | None = None, dropout: float | None = 0, scale: int = 0, bias_x: bool = True, bias_y: bool = True, decompose: bool = False, init: Callable = nn.init.zeros_)
Biaffine layer for first-order scoring (Dozat & Manning, 2017).
This layer holds a weight tensor \(W\) and, if enabled, bias terms. The score \(s(x, y)\) of the vector pair \((x, y)\) is computed as \(x^T W y / d^s\), where \(d\) is the vector dimension and \(s\) is the scaling factor. \(x\) and \(y\) can be concatenated with bias terms.
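As a worked illustration of the bias terms, suppose a constant 1 is appended to each vector and \(W\) is partitioned into blocks to match. This layout and the block names \(W_{xy}\), \(w_x\), \(w_y\), and \(b\) are assumptions of this sketch and are not specified above. Before scaling by \(d^s\), the score then expands into a bilinear term, two linear terms, and a constant:

```latex
\[
\begin{bmatrix} x \\ 1 \end{bmatrix}^T
\begin{bmatrix} W_{xy} & w_x \\ w_y^T & b \end{bmatrix}
\begin{bmatrix} y \\ 1 \end{bmatrix}
= x^T W_{xy}\, y + x^T w_x + w_y^T y + b
\]
```

With bias_x or bias_y set to False, the corresponding linear term and the constant drop out.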
- Parameters:
n_in (int) – The size of the input feature.
n_out (int) – The number of output channels.
n_proj (Optional[int]) – If specified, applies MLP layers to reduce vector dimensions. Default: None.
dropout (Optional[float]) – If specified, applies a SharedDropout layer with the given ratio to the MLP outputs. Default: 0.
scale (float) – Factor to scale the scores. Default: 0.
bias_x (bool) – If True, adds a bias term for tensor \(x\). Default: True.
bias_y (bool) – If True, adds a bias term for tensor \(y\). Default: True.
decompose (bool) – If True, represents the weight as the product of 2 independent matrices. Default: False.
init (Callable) – Callable initialization method. Default: nn.init.zeros_.
- forward(x: Tensor, y: Tensor) → Tensor
- Parameters:
x (torch.Tensor) – [batch_size, seq_len, n_in].
y (torch.Tensor) – [batch_size, seq_len, n_in].
- Returns:
A scoring tensor of shape [batch_size, n_out, seq_len, seq_len]. If n_out=1, the dimension for n_out will be squeezed automatically.
- Return type:
torch.Tensor
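The scoring rule and output shape described above can be sketched in plain PyTorch. This is a minimal illustration, not the SuPar implementation: the function name biaffine_scores, the weight layout [n_out, n_in + 1, n_in + 1], appending the bias as a trailing 1, and taking \(d\) to be n_in are all assumptions made for the example. The optional MLP projection and dropout are left out.

```python
import torch


def biaffine_scores(x, y, weight, scale=0, bias_x=True, bias_y=True):
    """Hypothetical helper: x^T W y / d^s for every pair of positions."""
    n_in = x.size(-1)  # assumed to play the role of d in the formula
    if bias_x:
        # Append a constant 1 so the last row of W acts as a bias for x.
        x = torch.cat((x, torch.ones_like(x[..., :1])), -1)
    if bias_y:
        # Same for y: the last column of W acts as a bias for y.
        y = torch.cat((y, torch.ones_like(y[..., :1])), -1)
    # weight: [n_out, n_in + bias_x, n_in + bias_y]
    s = torch.einsum('bxi,oij,byj->boxy', x, weight, y)
    s = s / n_in ** scale
    # squeeze(1) only removes the n_out axis when n_out == 1.
    return s.squeeze(1)


torch.manual_seed(0)
batch_size, seq_len, n_in, n_out = 2, 5, 8, 1
x = torch.randn(batch_size, seq_len, n_in)
y = torch.randn(batch_size, seq_len, n_in)
weight = torch.randn(n_out, n_in + 1, n_in + 1)

scores = biaffine_scores(x, y, weight)
print(scores.shape)  # torch.Size([2, 5, 5])
```

Entry scores[b, i, j] is the score of position i in x against position j in y for batch element b. With n_out greater than 1, the n_out axis is kept and the result has shape [batch_size, n_out, seq_len, seq_len].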
Triaffine
- class supar.modules.affine.Triaffine(n_in: int, n_out: int = 1, n_proj: int | None = None, dropout: float | None = 0, scale: int = 0, bias_x: bool = False, bias_y: bool = False, decompose: bool = False, init: Callable = nn.init.zeros_)
Triaffine layer for second-order scoring (Zhang et al., 2020a; Wang et al., 2019).
This layer holds a weight tensor \(W\) and, if enabled, bias terms. The score \(s(x, y, z)\) of the vector triple \((x, y, z)\) is computed as \(x^T z^T W y / d^s\), where \(d\) is the vector dimension and \(s\) is the scaling factor. \(x\) and \(y\) can be concatenated with bias terms.
- Parameters:
n_in (int) – The size of the input feature.
n_out (int) – The number of output channels.
n_proj (Optional[int]) – If specified, applies MLP layers to reduce vector dimensions. Default: None.
dropout (Optional[float]) – If specified, applies a SharedDropout layer with the given ratio to the MLP outputs. Default: 0.
scale (float) – Factor to scale the scores. Default: 0.
bias_x (bool) – If True, adds a bias term for tensor \(x\). Default: False.
bias_y (bool) – If True, adds a bias term for tensor \(y\). Default: False.
decompose (bool) – If True, represents the weight as the product of 3 independent matrices. Default: False.
init (Callable) – Callable initialization method. Default: nn.init.zeros_.
- forward(x: Tensor, y: Tensor, z: Tensor) → Tensor
- Parameters:
x (torch.Tensor) – [batch_size, seq_len, n_in].
y (torch.Tensor) – [batch_size, seq_len, n_in].
z (torch.Tensor) – [batch_size, seq_len, n_in].
- Returns:
A scoring tensor of shape [batch_size, n_out, seq_len, seq_len, seq_len]. If n_out=1, the dimension for n_out will be squeezed automatically.
- Return type:
torch.Tensor
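The three-way score can be sketched in plain PyTorch along the same lines. This is an illustration under stated assumptions rather than the library's code: the helper name triaffine_scores, the weight layout [n_out, n_in, n_in, n_in], the ordering of the three seq_len axes in the output (z, then x, then y), and the use of n_in as \(d\) are all choices made for this example. Bias terms are omitted, matching the defaults bias_x=False and bias_y=False.

```python
import torch


def triaffine_scores(x, y, z, weight, scale=0):
    """Hypothetical helper: contract x, z and y against a 3-way weight."""
    n_in = x.size(-1)  # assumed to play the role of d in the formula
    # weight axes: (o, i, k, j), paired with x_i, z_k and y_j respectively.
    s = torch.einsum('bxi,bzk,oikj,byj->bozxy', x, z, weight, y)
    s = s / n_in ** scale
    # squeeze(1) only removes the n_out axis when n_out == 1.
    return s.squeeze(1)


torch.manual_seed(0)
x3 = torch.randn(2, 4, 6)  # [batch_size, seq_len, n_in]
y3 = torch.randn(2, 4, 6)
z3 = torch.randn(2, 4, 6)
weight3 = torch.randn(1, 6, 6, 6)  # n_out = 1

scores3 = triaffine_scores(x3, y3, z3, weight3)
print(scores3.shape)  # torch.Size([2, 4, 4, 4])
```

Because the output grows with the cube of seq_len, memory use rises quickly for long sequences, which is one reason a reduced dimension via n_proj can be useful in practice.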