All TGI CLI options
Text-generation-launcher arguments
Text Generation Launcher
Usage: text-generation-launcher [OPTIONS]
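As a minimal sketch of the usage line above, the following Python snippet assembles a launcher command line from the `--model-id` and `--revision` options documented under Options, along with the equivalent `MODEL_ID` and `REVISION` environment variables. The helper names `build_launcher_command` and `build_launcher_env` are hypothetical and exist only for this illustration. The model `gpt2` and the branch `refs/pr/2` are the placeholder values used in the option descriptions, and whether that branch exists for that model is an assumption. The snippet only builds and prints the command; it does not start the launcher.

```python
import shlex


def build_launcher_command(model_id, revision=None):
    """Return an argv list for text-generation-launcher (hypothetical helper)."""
    argv = ["text-generation-launcher", "--model-id", model_id]
    if revision is not None:
        # --revision accepts a commit id or a branch such as refs/pr/2.
        argv += ["--revision", revision]
    return argv


def build_launcher_env(model_id, revision=None):
    """Return the environment variables that mirror the same two options."""
    env = {"MODEL_ID": model_id}
    if revision is not None:
        env["REVISION"] = revision
    return env


# Illustrative values taken from the option descriptions.
argv = build_launcher_command("gpt2", revision="refs/pr/2")
env = build_launcher_env("gpt2", revision="refs/pr/2")

# Print a shell-safe rendering of the command for copy and paste.
print(shlex.join(argv))
```

To actually start the server, the `argv` list could be passed to `subprocess.run`, assuming `text-generation-launcher` is installed and on the `PATH`.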
Options:
MODEL_ID
--model-id <MODEL_ID>
The name of the model to load. This can be a MODEL_ID as listed on <https://hf.co/models>, such as `gpt2` or `OpenAssistant/oasst-sft-1-pythia-12b`, or a local directory containing the necessary files as saved by the `save_pretrained(...)` methods of transformers.
[env: MODEL_ID=]
[default: bigscience/bloom-560m]
REVISION
--revision <REVISION>
The actual revision of the model if you're referring to a model on the hub. You can use a specific commit id or a branch like `refs/pr/2`.
[env: REVISION=]
VALIDATION_WORKERS
SHARDED
NUM_SHARD
QUANTIZE
DTYPE
TRUST_REMOTE_CODE
MAX_CONCURRENT_REQUESTS
MAX_BEST_OF
MAX_STOP_SEQUENCES
MAX_TOP_N_TOKENS
MAX_INPUT_LENGTH
MAX_TOTAL_TOKENS
WAITING_SERVED_RATIO
MAX_BATCH_PREFILL_TOKENS
MAX_BATCH_TOTAL_TOKENS
MAX_WAITING_TOKENS
HOSTNAME
PORT
SHARD_UDS_PATH
MASTER_ADDR
MASTER_PORT
HUGGINGFACE_HUB_CACHE
WEIGHTS_CACHE_OVERRIDE
DISABLE_CUSTOM_KERNELS
CUDA_MEMORY_FRACTION
ROPE_SCALING
ROPE_FACTOR
JSON_OUTPUT
OTLP_ENDPOINT
CORS_ALLOW_ORIGIN
WATERMARK_GAMMA
WATERMARK_DELTA
NGROK
NGROK_AUTHTOKEN
NGROK_EDGE
ENV
HELP
VERSION