# Dropout Layers

## IndependentDropout

class supar.modules.dropout.IndependentDropout(p: float = 0.5)[source]

For $$N$$ input tensors, each one is given its own independent dropout mask. When $$N-M$$ of them are dropped, the remaining $$M$$ are scaled by a factor of $$N/M$$ to compensate, and when all of them are dropped, zeros are returned.

Parameters

p (float) – The probability of an element to be zeroed. Default: 0.5.

Examples

>>> import torch
>>> from supar.modules.dropout import IndependentDropout
>>> batch_size, seq_len, hidden_size = 1, 3, 5
>>> x, y = torch.ones(batch_size, seq_len, hidden_size), torch.ones(batch_size, seq_len, hidden_size)
>>> x, y = IndependentDropout()(x, y)
>>> x
tensor([[[1., 1., 1., 1., 1.],
         [0., 0., 0., 0., 0.],
         [2., 2., 2., 2., 2.]]])
>>> y
tensor([[[1., 1., 1., 1., 1.],
         [2., 2., 2., 2., 2.],
         [0., 0., 0., 0., 0.]]])

forward(*items: List[Tensor]) [source]
Parameters

items (List[Tensor]) – A list of tensors that have the same shape except the last dimension.

Returns

Tensors of the same shape as items.
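A minimal sketch of this strategy is given below. It is an illustration of the behavior described above, not the library's verbatim implementation; the function name independent_dropout and its training argument are assumptions for the example.

```python
import torch

def independent_dropout(items, p=0.5, training=True):
    # Illustrative sketch of the documented behavior; not the verbatim supar code.
    if not training:
        return list(items)
    # One independent Bernoulli mask per tensor, over [batch_size, seq_len].
    masks = [x.new_empty(x.shape[:2]).bernoulli_(1 - p) for x in items]
    # M = number of tensors surviving at each position; survivors are rescaled by N / M.
    # Positions where every tensor is dropped remain zero because their masks are zero.
    total = sum(masks)
    scale = len(items) / total.clamp(min=1)
    return [x * (m * scale).unsqueeze(-1) for x, m in zip(items, masks)]

x, y = torch.ones(1, 3, 5), torch.ones(1, 3, 5)
x, y = independent_dropout([x, y])  # mirrors the doctest above
```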

## SharedDropout

class supar.modules.dropout.SharedDropout(p: float = 0.5, batch_first: bool = True)[source]

SharedDropout differs from the vanilla dropout strategy in that the dropout mask is shared across one dimension: with batch_first inputs, every position in a sequence drops the same units, as the example below shows.

Parameters
• p (float) – The probability of an element to be zeroed. Default: 0.5.

• batch_first (bool) – If True, the input and output tensors are provided as [batch_size, seq_len, *]. Default: True.

Examples

>>> import torch
>>> from torch import nn
>>> from supar.modules.dropout import SharedDropout
>>> batch_size, seq_len, hidden_size = 1, 3, 5
>>> x = torch.ones(batch_size, seq_len, hidden_size)
>>> nn.Dropout()(x)
tensor([[[0., 2., 2., 0., 0.],
         [2., 2., 0., 2., 2.],
         [2., 2., 2., 2., 0.]]])
>>> SharedDropout()(x)
tensor([[[2., 0., 2., 0., 2.],
         [2., 0., 2., 0., 2.],
         [2., 0., 2., 0., 2.]]])

forward(x: torch.Tensor) [source]
Parameters

x (Tensor) – A tensor of any shape.

Returns

A tensor with the same shape as x.
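A minimal sketch of the shared-mask idea follows, assuming inputs of shape [batch_size, seq_len, *] when batch_first is true. It is an illustration rather than the library's exact implementation; the function name shared_dropout and its training argument are assumptions for the example.

```python
import torch

def shared_dropout(x, p=0.5, batch_first=True, training=True):
    # Illustrative sketch of the documented behavior; not the verbatim supar code.
    if not training:
        return x
    if batch_first:
        # Sample one mask per sequence from a single time step, rescale by 1 / (1 - p),
        # then broadcast the same mask over the seq_len dimension.
        mask = x.new_empty(x[:, 0].shape).bernoulli_(1 - p) / (1 - p)
        return x * mask.unsqueeze(1)
    # [seq_len, batch_size, *] layout: share the mask over the leading (time) dimension.
    mask = x.new_empty(x[0].shape).bernoulli_(1 - p) / (1 - p)
    return x * mask

x = torch.ones(1, 3, 5)
print(shared_dropout(x))  # every row shares the same mask, as in the doctest above
```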