Vocab
- class supar.utils.vocab.Vocab(counter: Counter, min_freq: int = 1, specials: Tuple = (), unk_index: int = 0)
Defines a vocabulary object that will be used to numericalize a field.
- Parameters:
counter (Counter) – Counter object holding the frequencies of each value found in the data.
min_freq (int) – The minimum frequency needed to include a token in the vocabulary. Default: 1.
specials (Tuple[str]) – The special tokens (e.g., pad, unk, bos and eos) that will be prepended to the vocabulary. Default: ().
unk_index (int) – The index of the unk token. Default: 0.
- itos
A list of token strings indexed by their numerical identifiers.
- stoi
A defaultdict object mapping token strings to numerical identifiers.
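
The following is a minimal, self-contained sketch of how the parameters and the two attributes described above could fit together. MiniVocab is a hypothetical stand-in written for illustration and is not supar's implementation. The token ordering and the fallback of unknown strings to unk_index are assumptions, not behavior confirmed by this page.

```python
from collections import Counter, defaultdict
from typing import Tuple


class MiniVocab:
    """Hypothetical stand-in for illustration; not supar's actual code."""

    def __init__(self, counter: Counter, min_freq: int = 1,
                 specials: Tuple[str, ...] = (), unk_index: int = 0):
        # Special tokens are prepended, so they receive the lowest indices.
        self.itos = list(specials)
        # Keep only tokens seen at least min_freq times. Sorting is an
        # assumption made here to keep the example deterministic.
        self.itos += sorted(tok for tok, freq in counter.items()
                            if freq >= min_freq and tok not in specials)
        self.unk_index = unk_index
        # Assumption: strings missing from the vocabulary map to unk_index.
        self.stoi = defaultdict(lambda: unk_index,
                                {tok: i for i, tok in enumerate(self.itos)})


counter = Counter("the cat sat on the mat the end".split())
vocab = MiniVocab(counter, min_freq=2, specials=("<pad>", "<unk>"), unk_index=1)

print(vocab.itos)                              # ['<pad>', '<unk>', 'the']
print(vocab.stoi["the"])                       # 2
# .get() avoids inserting the missing key into the defaultdict.
print(vocab.stoi.get("cat", vocab.unk_index))  # 1
```

With min_freq=2, only "the" (seen three times) survives the frequency cut, so it lands directly after the two special tokens. Indexing itos with a value from stoi round-trips back to the original token string.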