Model Summaries
The included model architectures come from a wide variety of sources. These sources are listed below: papers, original implementations (“reference code”) that I rewrote or adapted, and PyTorch implementations that I used directly (“code”).
Most included models have pretrained weights. The weights are either:
from their original sources
ported by me from their original implementation in a different framework (e.g. TensorFlow models)
trained from scratch using the included training script
The validation results for the pretrained weights are here
A more exciting view (with pretty pictures) of the models within timm can be found at paperswithcode.
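As a rough illustration of how the pretrained weights mentioned above are typically used, the sketch below builds a model by name and lists the names that ship with weights. It assumes the `timm` package and PyTorch are installed. `build_classifier` and `names_with_weights` are hypothetical helper names chosen for this example; `timm.create_model` and `timm.list_models` are the library calls they wrap.

```python
# Sketch only: assumes the `timm` package and PyTorch are installed.
# `build_classifier` and `names_with_weights` are illustrative helpers,
# not part of timm itself.


def build_classifier(name: str = "resnet50", pretrained: bool = True):
    """Create a timm model by name and switch it to inference mode.

    With pretrained=True, the weights are downloaded on first use.
    """
    import timm  # imported lazily so this sketch can be read without timm present

    model = timm.create_model(name, pretrained=pretrained)
    return model.eval()


def names_with_weights(pattern: str = "*"):
    """Return model names matching a wildcard pattern that have pretrained weights."""
    import timm

    return timm.list_models(pattern, pretrained=True)
```

For example, `names_with_weights("densenet*")` should return the DenseNet variants with weights available, and `build_classifier("resnet50")` returns a ResNet-50 ready for inference. The exact set of names depends on the installed timm version.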
Big Transfer ResNetV2 (BiT)
Implementation: resnetv2.py
Paper:
Big Transfer (BiT): General Visual Representation Learning
- https://arxiv.org/abs/1912.11370
Reference code: https://github.com/google-research/big_transfer
Cross-Stage Partial Networks
Implementation: cspnet.py
Paper:
CSPNet: A New Backbone that can Enhance Learning Capability of CNN
- https://arxiv.org/abs/1911.11929
Reference impl: https://github.com/WongKinYiu/CrossStagePartialNetworks
DenseNet
Implementation: densenet.py
Paper:
Densely Connected Convolutional Networks
- https://arxiv.org/abs/1608.06993
DLA
Implementation: dla.py
Dual-Path Networks
Implementation: dpn.py
Paper:
Dual Path Networks
- https://arxiv.org/abs/1707.01629
My PyTorch code: https://github.com/rwightman/pytorch-dpn-pretrained
Reference code: https://github.com/cypw/DPNs
GPU-Efficient Networks
Implementation: byobnet.py
Paper:
Neural Architecture Design for GPU-Efficient Networks
- https://arxiv.org/abs/2006.14090
Reference code: https://github.com/idstcv/GPU-Efficient-Networks
HRNet
Implementation: hrnet.py
Paper:
Deep High-Resolution Representation Learning for Visual Recognition
- https://arxiv.org/abs/1908.07919
Inception-V3
Implementation: inception_v3.py
Paper:
Rethinking the Inception Architecture for Computer Vision
- https://arxiv.org/abs/1512.00567
Inception-V4
Implementation: inception_v4.py
Paper:
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
- https://arxiv.org/abs/1602.07261
Inception-ResNet-V2
Implementation: inception_resnet_v2.py
Paper:
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
- https://arxiv.org/abs/1602.07261
NASNet-A
Implementation: nasnet.py
Paper:
Learning Transferable Architectures for Scalable Image Recognition
- https://arxiv.org/abs/1707.07012
PNasNet-5
Implementation: pnasnet.py
Paper:
Progressive Neural Architecture Search
- https://arxiv.org/abs/1712.00559
EfficientNet
Implementation: efficientnet.py
Papers:
EfficientNet NoisyStudent (B0-B7, L2) - https://arxiv.org/abs/1911.04252
EfficientNet AdvProp (B0-B8) - https://arxiv.org/abs/1911.09665
EfficientNet (B0-B7) - https://arxiv.org/abs/1905.11946
EfficientNet-EdgeTPU (S, M, L) - https://ai.googleblog.com/2019/08/efficientnet-edgetpu-creating.html
MixNet - https://arxiv.org/abs/1907.09595
MNASNet B1, A1 (Squeeze-Excite), and Small - https://arxiv.org/abs/1807.11626
MobileNet-V2 - https://arxiv.org/abs/1801.04381
FBNet-C - https://arxiv.org/abs/1812.03443
Single-Path NAS - https://arxiv.org/abs/1904.02877
My PyTorch code: https://github.com/rwightman/gen-efficientnet-pytorch
MobileNet-V3
Implementation: mobilenetv3.py
Paper:
Searching for MobileNetV3
- https://arxiv.org/abs/1905.02244
RegNet
Implementation: regnet.py
Paper:
Designing Network Design Spaces
- https://arxiv.org/abs/2003.13678
RepVGG
Implementation: byobnet.py
Paper:
Making VGG-style ConvNets Great Again
- https://arxiv.org/abs/2101.03697
Reference code: https://github.com/DingXiaoH/RepVGG
ResNet, ResNeXt
Implementation: resnet.py
ResNet (V1B)
Paper:
Deep Residual Learning for Image Recognition
- https://arxiv.org/abs/1512.03385
ResNeXt
Paper:
Aggregated Residual Transformations for Deep Neural Networks
- https://arxiv.org/abs/1611.05431
‘Bag of Tricks’ / Gluon C, D, E, S ResNet variants
Paper:
Bag of Tricks for Image Classification with CNNs
- https://arxiv.org/abs/1812.01187
Instagram pretrained / ImageNet tuned ResNeXt101
Paper:
Exploring the Limits of Weakly Supervised Pretraining
- https://arxiv.org/abs/1805.00932
Weights: https://pytorch.org/hub/facebookresearch_WSL-Images_resnext (NOTE: CC BY-NC 4.0 License, NOT commercial friendly)
Semi-supervised (SSL) / Semi-weakly Supervised (SWSL) ResNet and ResNeXts
Paper:
Billion-scale semi-supervised learning for image classification
- https://arxiv.org/abs/1905.00546
Weights: https://github.com/facebookresearch/semi-supervised-ImageNet1K-models (NOTE: CC BY-NC 4.0 License, NOT commercial friendly)
Squeeze-and-Excitation Networks
Paper:
Squeeze-and-Excitation Networks
- https://arxiv.org/abs/1709.01507
Code: Added to the ResNet base; this is the current version going forward. The old senet.py is being deprecated.
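Because the Squeeze-and-Excitation variants now live in the ResNet base, they can be created like any other ResNet. The sketch below assumes the `timm` package and PyTorch are installed and that a model named `seresnet50` is registered in the installed version. `build_se_resnet` is a hypothetical helper name used only for this example.

```python
# Sketch only: assumes `timm` and PyTorch are installed and that the
# model name "seresnet50" is registered in the installed timm version.
# `build_se_resnet` is an illustrative helper, not part of timm.


def build_se_resnet(name: str = "seresnet50"):
    """Create an SE-ResNet without weights and report which module defines it."""
    import timm  # imported lazily so this sketch can be read without timm present

    model = timm.create_model(name, pretrained=False)
    # The defining module of the model class shows whether the network comes
    # from the ResNet base or from the older senet.py implementation.
    return model, type(model).__module__
```

Inspecting the returned module name is one way to check that a given name resolves to the ResNet-based implementation and not to the version in senet.py that is being deprecated.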
ECAResNet (ECA-Net)
Paper:
ECA-Net: Efficient Channel Attention for Deep CNN
- https://arxiv.org/abs/1910.03151v4
Code: Added to the ResNet base. ECA module contributed by @VRandme, reference: https://github.com/BangguWu/ECANet
Res2Net
Implementation: res2net.py
Paper:
Res2Net: A New Multi-scale Backbone Architecture
- https://arxiv.org/abs/1904.01169
ResNeSt
Implementation: resnest.py
Paper:
ResNeSt: Split-Attention Networks
- https://arxiv.org/abs/2004.08955
ReXNet
Implementation: rexnet.py
Paper:
ReXNet: Diminishing Representational Bottleneck on CNN
- https://arxiv.org/abs/2007.00992
Selective-Kernel Networks
Implementation: sknet.py
Paper:
Selective-Kernel Networks
- https://arxiv.org/abs/1903.06586
SelecSLS
Implementation: selecsls.py
Paper:
XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera
- https://arxiv.org/abs/1907.00837
Squeeze-and-Excitation Networks
Implementation: senet.py
NOTE: I am deprecating this version of the networks; the new ones are part of resnet.py.
Paper:
Squeeze-and-Excitation Networks
- https://arxiv.org/abs/1709.01507
TResNet
Implementation: tresnet.py
Paper:
TResNet: High Performance GPU-Dedicated Architecture
- https://arxiv.org/abs/2003.13630
VGG
Implementation: vgg.py
Paper:
Very Deep Convolutional Networks For Large-Scale Image Recognition
- https://arxiv.org/pdf/1409.1556.pdf
Vision Transformer
Implementation: vision_transformer.py
Paper:
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
- https://arxiv.org/abs/2010.11929
Reference code and pretrained weights: https://github.com/google-research/vision_transformer
VovNet V2 and V1
Implementation: vovnet.py
Paper:
CenterMask: Real-Time Anchor-Free Instance Segmentation
- https://arxiv.org/abs/1911.06667
Reference code: https://github.com/youngwanLEE/vovnet-detectron2
Xception
Implementation: xception.py
Paper:
Xception: Deep Learning with Depthwise Separable Convolutions
- https://arxiv.org/abs/1610.02357
Xception (Modified Aligned, Gluon)
Implementation: gluon_xception.py
Paper:
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
- https://arxiv.org/abs/1802.02611
Xception (Modified Aligned, TF)
Implementation: aligned_xception.py
Paper:
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
- https://arxiv.org/abs/1802.02611