Funnel Transformer

Overview

The Funnel Transformer model was proposed in the paper Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing. It is a bidirectional transformer model, like BERT, but with a pooling operation after each block of layers, similar to the pooling used in traditional convolutional neural networks (CNNs) in computer vision.
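To make the pooling idea concrete, here is a minimal sketch in plain Python. It assumes mean pooling with a window and stride of 2 over a list of token vectors; the function name `mean_pool` is hypothetical, and the sketch ignores details of the actual library implementation (for example, any special handling of the classification token).

```python
def mean_pool(hidden_states, stride=2):
    """Average each consecutive group of `stride` token vectors.

    Simplified illustration only (assumption: mean pooling, stride 2).
    A trailing group shorter than `stride` is averaged over what is left.
    """
    pooled = []
    for start in range(0, len(hidden_states), stride):
        window = hidden_states[start:start + stride]
        dim = len(window[0])
        pooled.append([sum(vec[i] for vec in window) / len(window) for i in range(dim)])
    return pooled


# Four token vectors of size 2 become two pooled vectors of size 2.
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
pooled = mean_pool(tokens)
print(pooled)  # [[2.0, 1.0], [6.0, 5.0]]
```

The hidden size is unchanged; only the number of positions shrinks, which is what makes the later layers cheaper.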

The abstract from the paper is the following:

With the success of language pretraining, it is highly desirable to develop more efficient architectures of good scalability that can exploit the abundant unlabeled data at a lower cost. To improve the efficiency, we examine the much-overlooked redundancy in maintaining a full-length token-level presentation, especially for tasks that only require a single-vector presentation of the sequence. With this intuition, we propose Funnel-Transformer which gradually compresses the sequence of hidden states to a shorter one and hence reduces the computation cost. More importantly, by re-investing the saved FLOPs from length reduction in constructing a deeper or wider model, we further improve the model capacity. In addition, to perform token-level predictions as required by common pretraining objectives, Funnel-Transformer is able to recover a deep representation for each token from the reduced hidden sequence via a decoder. Empirically, with comparable or fewer FLOPs, Funnel-Transformer outperforms the standard Transformer on a wide variety of sequence-level prediction tasks, including text classification, language understanding, and reading comprehension.

Tips:

Since Funnel Transformer uses pooling, the sequence length of the hidden states changes after each block of layers: each pooling step divides the length by 2, which speeds up the computation of the next hidden states. With three blocks there are two pooling steps, so the base model has a final sequence length that is a quarter of the original one. This model can be used directly for tasks that only require a sentence summary (like sequence classification or multiple choice). For other tasks, the full model is used; it has a decoder that upsamples the final hidden states to the same sequence length as the input.
For tasks such as classification, the reduced length is not a problem, but tasks like masked language modeling or token classification need a hidden state for every input token. In those cases, the final hidden states are upsampled to the input sequence length and go through two additional layers. That is why there are two versions of each checkpoint: the version suffixed with “-base” contains only the three blocks, while the version without that suffix contains the three blocks and the upsampling head with its additional layers.
All Funnel Transformer checkpoints are available in a full version and a base version. The full versions should be used for FunnelModel, FunnelForPreTraining, FunnelForMaskedLM, FunnelForTokenClassification and FunnelForQuestionAnswering. The base versions should be used for FunnelBaseModel, FunnelForSequenceClassification and FunnelForMultipleChoice.
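The length bookkeeping in these tips can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not library code: the helper names `block_lengths` and `output_length` are hypothetical, three blocks with a pooling factor of 2 are taken from the description above, and ceiling division for lengths that are not a multiple of 4 is an assumption that may not match the library's exact rounding.

```python
import math

# Class names as listed in the tips above.
BASE_CLASSES = {
    "FunnelBaseModel",
    "FunnelForSequenceClassification",
    "FunnelForMultipleChoice",
}
FULL_CLASSES = {
    "FunnelModel",
    "FunnelForPreTraining",
    "FunnelForMaskedLM",
    "FunnelForTokenClassification",
    "FunnelForQuestionAnswering",
}


def block_lengths(seq_len, num_blocks=3, stride=2):
    """Sequence length seen by each block; pooling happens between blocks."""
    lengths = [seq_len]
    for _ in range(num_blocks - 1):
        # Assumption: round up when the length is not divisible by the stride.
        lengths.append(math.ceil(lengths[-1] / stride))
    return lengths


def output_length(class_name, seq_len):
    """Length of the final hidden states for a given model class."""
    if class_name in BASE_CLASSES:
        # Base ("-base") checkpoints stop after the last block.
        return block_lengths(seq_len)[-1]
    if class_name in FULL_CLASSES:
        # Full checkpoints upsample back to the input length.
        return seq_len
    raise ValueError(f"Unknown Funnel class: {class_name}")


print(block_lengths(128))                                     # [128, 64, 32]
print(output_length("FunnelForSequenceClassification", 128))  # 32
print(output_length("FunnelForMaskedLM", 128))                # 128
```

For a 128-token input, a base checkpoint ends with 32 positions, while a full checkpoint returns one hidden state per input token.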

This model was contributed by sgugger. The original code can be found here.

Documentation resources

FunnelConfig

class transformers.FunnelConfig

FunnelTokenizer

class transformers.FunnelTokenizer

FunnelTokenizerFast

class transformers.FunnelTokenizerFast

Funnel specific outputs

class transformers.models.funnel.modeling_funnel.FunnelForPreTrainingOutput

class transformers.models.funnel.modeling_tf_funnel.TFFunnelForPreTrainingOutput

FunnelBaseModel

class transformers.FunnelBaseModel

FunnelModel

class transformers.FunnelModel

FunnelModelForPreTraining

class transformers.FunnelForPreTraining

FunnelForMaskedLM

class transformers.FunnelForMaskedLM

FunnelForSequenceClassification

class transformers.FunnelForSequenceClassification

FunnelForMultipleChoice

class transformers.FunnelForMultipleChoice

FunnelForTokenClassification

class transformers.FunnelForTokenClassification

FunnelForQuestionAnswering

class transformers.FunnelForQuestionAnswering

TFFunnelBaseModel

class transformers.TFFunnelBaseModel

TFFunnelModel

class transformers.TFFunnelModel

TFFunnelModelForPreTraining

class transformers.TFFunnelForPreTraining

TFFunnelForMaskedLM

class transformers.TFFunnelForMaskedLM

TFFunnelForSequenceClassification

class transformers.TFFunnelForSequenceClassification

TFFunnelForMultipleChoice

class transformers.TFFunnelForMultipleChoice

TFFunnelForTokenClassification

class transformers.TFFunnelForTokenClassification

TFFunnelForQuestionAnswering

class transformers.TFFunnelForQuestionAnswering
