All TGI CLI options
Text-generation-launcher arguments
Text Generation Launcher
Usage: text-generation-launcher [OPTIONS]
Options:

MODEL_ID
--model-id <MODEL_ID>
The name of the model to load. It can be a MODEL_ID as listed on <https://hf.co/models>, like `gpt2` or `OpenAssistant/oasst-sft-1-pythia-12b`, or a local directory containing the files saved by the `save_pretrained(...)` method of transformers
[env: MODEL_ID=]
[default: bigscience/bloom-560m]
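For example, to launch the server with a model from the Hub instead of the default (a minimal sketch; every other option keeps its default value):

```shell
text-generation-launcher --model-id OpenAssistant/oasst-sft-1-pythia-12b
```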
REVISION

--revision <REVISION>
The actual revision of the model if you're referring to a model on the hub. You can use a specific commit id or a branch like `refs/pr/2`
[env: REVISION=]
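For reproducible deployments, the revision can be pinned alongside the model id. A sketch (the branch shown is only a placeholder; use a commit id or branch that actually exists for your model):

```shell
text-generation-launcher \
    --model-id bigscience/bloom-560m \
    --revision refs/pr/2
```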
The remaining options are listed below by name; following the pattern above, each corresponds to a `--kebab-case` command-line flag (for example, `MAX_INPUT_LENGTH` is set with `--max-input-length`):

VALIDATION_WORKERS
SHARDED
NUM_SHARD
QUANTIZE
DTYPE
TRUST_REMOTE_CODE
MAX_CONCURRENT_REQUESTS
MAX_BEST_OF
MAX_STOP_SEQUENCES
MAX_TOP_N_TOKENS
MAX_INPUT_LENGTH
MAX_TOTAL_TOKENS
WAITING_SERVED_RATIO
MAX_BATCH_PREFILL_TOKENS
MAX_BATCH_TOTAL_TOKENS
MAX_WAITING_TOKENS
HOSTNAME
PORT
SHARD_UDS_PATH
MASTER_ADDR
MASTER_PORT
HUGGINGFACE_HUB_CACHE
WEIGHTS_CACHE_OVERRIDE
DISABLE_CUSTOM_KERNELS
CUDA_MEMORY_FRACTION
ROPE_SCALING
ROPE_FACTOR
JSON_OUTPUT
OTLP_ENDPOINT
CORS_ALLOW_ORIGIN
WATERMARK_GAMMA
WATERMARK_DELTA
NGROK
NGROK_AUTHTOKEN
NGROK_EDGE
ENV
HELP
VERSION
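A fuller invocation combines several of these options. The sketch below derives the flag names from the option names above; the values are illustrative only and should be tuned to your hardware and model:

```shell
text-generation-launcher \
    --model-id bigscience/bloom-560m \
    --num-shard 2 \
    --max-input-length 1024 \
    --max-total-tokens 2048 \
    --port 8080
```

Running `text-generation-launcher --help` prints the full description and, where applicable, the environment variable and default value for every option listed here.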