Dropout Layers#

IndependentDropout#

class supar.modules.dropout.IndependentDropout(p: float = 0.5)[source]#

Applies an independent dropout mask to each of \(N\) input tensors. When \(N-M\) of them are dropped at a given position, the remaining \(M\) are scaled by a factor of \(N/M\) to compensate, and when all of them are dropped together, zeros are returned.

Parameters:

p (float) – The probability of an element to be zeroed. Default: 0.5.

Examples

>>> batch_size, seq_len, hidden_size = 1, 3, 5
>>> x, y = torch.ones(batch_size, seq_len, hidden_size), torch.ones(batch_size, seq_len, hidden_size)
>>> x, y = IndependentDropout()(x, y)
>>> x
tensor([[[1., 1., 1., 1., 1.],
         [0., 0., 0., 0., 0.],
         [2., 2., 2., 2., 2.]]])
>>> y
tensor([[[1., 1., 1., 1., 1.],
         [2., 2., 2., 2., 2.],
         [0., 0., 0., 0., 0.]]])
forward(*items: List[Tensor]) → List[Tensor][source]#
Parameters:

items (List[Tensor]) – A list of tensors that have the same shape except the last dimension.

Returns:

Tensors of the same shape as items.
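
The scaling behaviour described above can be sketched roughly as follows. This is a minimal, hypothetical independent_dropout helper for illustration only, not the class's exact implementation; it assumes inputs of shape [batch_size, seq_len, hidden_size] and training mode.

import torch

def independent_dropout(items, p=0.5):
    # Draw an independent keep/drop mask per tensor over (batch, seq_len);
    # the whole hidden vector at a position is either kept or zeroed.
    masks = [x.new_empty(x.shape[:2]).bernoulli_(1 - p) for x in items]
    # M = number of the N tensors that survived at each position.
    total = sum(masks)
    # Scale survivors by N / M; clamping avoids division by zero
    # when all tensors are dropped at a position (zeros are returned there).
    scale = len(items) / total.clamp(min=1)
    return [x * (m * scale).unsqueeze(-1) for x, m in zip(items, masks)]

x, y = torch.ones(1, 3, 5), torch.ones(1, 3, 5)
x, y = independent_dropout([x, y])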

SharedDropout#

class supar.modules.dropout.SharedDropout(p: float = 0.5, batch_first: bool = True)[source]#

SharedDropout differs from the vanilla dropout strategy in that the dropout mask is shared across one dimension, so every position along that dimension (here, the sequence dimension) drops the same units.

Parameters:
  • p (float) – The probability of an element to be zeroed. Default: 0.5.

  • batch_first (bool) – If True, the input and output tensors are provided as [batch_size, seq_len, *]. Default: True.

Examples

>>> batch_size, seq_len, hidden_size = 1, 3, 5
>>> x = torch.ones(batch_size, seq_len, hidden_size)
>>> nn.Dropout()(x)
tensor([[[0., 2., 2., 0., 0.],
         [2., 2., 0., 2., 2.],
         [2., 2., 2., 2., 0.]]])
>>> SharedDropout()(x)
tensor([[[2., 0., 2., 0., 2.],
         [2., 0., 2., 0., 2.],
         [2., 0., 2., 0., 2.]]])
forward(x: Tensor) → Tensor[source]#
Parameters:

x (Tensor) – A tensor of any shape.

Returns:

A tensor with the same shape as x.
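
The mask sharing shown in the example can be sketched as follows. This is a minimal, hypothetical shared_dropout helper illustrating the idea during training, not the class's exact implementation.

import torch

def shared_dropout(x, p=0.5, batch_first=True):
    # Sample one Bernoulli mask from a single timestep, rescale by 1/(1-p),
    # then broadcast it over the sequence dimension so all timesteps
    # share the same dropped units.
    step = x[:, 0] if batch_first else x[0]
    mask = step.new_empty(step.shape).bernoulli_(1 - p) / (1 - p)
    return x * (mask.unsqueeze(1) if batch_first else mask)

x = torch.ones(1, 3, 5)
shared_dropout(x)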