Encode Inputs
These types represent all the different kinds of input that a Tokenizer accepts when using encode_batch().
tokenizers.TextEncodeInput
Represents a textual input for encoding. Can be either:
A single sequence: TextInputSequence
A pair of sequences:
A Tuple of TextInputSequence
Or a List of TextInputSequence of size 2
alias of Union[str, Tuple[str, str], List[str]].
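As a rough illustration of the three shapes listed above, the sketch below builds a small batch of textual inputs. The alias is re-declared locally so the snippet runs without the library installed, and `is_text_pair` is a hypothetical helper written for this example, not part of tokenizers.

```python
from typing import List, Tuple, Union

# Local mirror of the alias documented above; the library exposes the
# real one as tokenizers.TextEncodeInput.
TextEncodeInput = Union[str, Tuple[str, str], List[str]]

# Each element of the batch is one TextEncodeInput.
batch: List[TextEncodeInput] = [
    "Hello, world!",                                          # single sequence
    ("What is a tokenizer?", "It splits text into tokens."),  # pair as a tuple
    ["First sentence.", "Second sentence."],                  # pair as a list of size 2
]


def is_text_pair(item: TextEncodeInput) -> bool:
    """Hypothetical helper: True when the item is a pair of text sequences."""
    return isinstance(item, (tuple, list)) and len(item) == 2


pair_flags = [is_text_pair(item) for item in batch]
print(pair_flags)
```

With the library installed, a batch shaped like this should be usable as `tokenizer.encode_batch(batch)`, assuming `tokenizer` is an already configured `tokenizers.Tokenizer`.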
tokenizers.PreTokenizedEncodeInput
Represents a pre-tokenized input for encoding. Can be either:
A single sequence: PreTokenizedInputSequence
A pair of sequences:
A Tuple of PreTokenizedInputSequence
Or a List of PreTokenizedInputSequence of size 2
alias of Union[List[str], Tuple[str], Tuple[Union[List[str], Tuple[str]], Union[List[str], Tuple[str]]], List[Union[List[str], Tuple[str]]]].
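The pre-tokenized shapes can be sketched the same way. Here every sequence is already split into words, so a pair is a tuple or list holding two word sequences. The aliases are re-declared locally for a self-contained example, and `is_pretokenized_pair` is a hypothetical helper, not a library function.

```python
from typing import List, Tuple, Union

# Local mirrors of the documented aliases, declared here so the snippet
# does not need the library installed.
PreTokenizedInputSequence = Union[List[str], Tuple[str]]
PreTokenizedEncodeInput = Union[
    PreTokenizedInputSequence,
    Tuple[PreTokenizedInputSequence, PreTokenizedInputSequence],
    List[PreTokenizedInputSequence],
]

batch: List[PreTokenizedEncodeInput] = [
    ["Hello", "world", "!"],                          # single sequence of words
    (["What", "is", "it", "?"], ["A", "tokenizer"]),  # pair as a tuple
    [["First", "one"], ["Second", "one"]],            # pair as a list of size 2
]


def is_pretokenized_pair(item: PreTokenizedEncodeInput) -> bool:
    """Hypothetical helper: a pair holds exactly two word sequences."""
    return len(item) == 2 and all(isinstance(part, (list, tuple)) for part in item)


pair_flags = [is_pretokenized_pair(item) for item in batch]
print(pair_flags)
```

Note that a two-word single sequence such as `["Hello", "world"]` is not treated as a pair here, because its elements are strings rather than word sequences. With the library installed, such a batch would be passed as `tokenizer.encode_batch(batch, is_pretokenized=True)`, again assuming a configured `tokenizer`.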
tokenizers.EncodeInput
Represents all the possible types of input for encoding. Can be:
When is_pretokenized=False: TextEncodeInput
When is_pretokenized=True: PreTokenizedEncodeInput
alias of Union[str, Tuple[str, str], List[str], Tuple[str], Tuple[Union[List[str], Tuple[str]], Union[List[str], Tuple[str]]], List[Union[List[str], Tuple[str]]]].
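Because the same Python value can match both families of types, the is_pretokenized flag decides how it is read. The sketch below illustrates this with a hypothetical helper, `count_sequences`, which follows the definitions on this page rather than the library's actual implementation: a list of two strings counts as a pair of sequences in text mode but as one sequence of two words in pre-tokenized mode.

```python
from typing import Sequence, Union


def count_sequences(item: Union[str, Sequence], is_pretokenized: bool = False) -> int:
    """Hypothetical helper: number of sequences (1 or 2) held by an input.

    Assumes the input is already a valid EncodeInput for the chosen mode.
    """
    if not is_pretokenized:
        # Text mode: a bare string is one sequence; a tuple or list of
        # two strings is a pair.
        return 1 if isinstance(item, str) else 2
    # Pre-tokenized mode: a flat collection of words is one sequence;
    # a collection of word sequences is a pair.
    if all(isinstance(part, str) for part in item):
        return 1
    return 2


ambiguous = ["Hello", "world"]
as_text = count_sequences(ambiguous, is_pretokenized=False)
as_words = count_sequences(ambiguous, is_pretokenized=True)
print(as_text, as_words)
```

The practical consequence is that the flag passed to encode_batch() should match how the batch was built, since the value alone does not say which interpretation was intended.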