All TGI CLI options

Text-generation-launcher arguments

Text Generation Launcher

Usage: text-generation-launcher [OPTIONS]

Options:

MODEL_ID

      --model-id <MODEL_ID>
          The name of the model to load. Can be a MODEL_ID as listed on <https://hf.co/models> like `gpt2` or `OpenAssistant/oasst-sft-1-pythia-12b`. Or it can be a local directory containing the necessary files as saved by `save_pretrained(...)` methods of transformers
          
          [env: MODEL_ID=]
          [default: bigscience/bloom-560m]

REVISION

      --revision <REVISION>
          The actual revision of the model if you're referring to a model on the hub. You can use a specific commit id or a branch like `refs/pr/2`
          
          [env: REVISION=]
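
As a minimal sketch of how the two options above might be combined, the snippet below builds the same launch command in two forms: once with the `--model-id` and `--revision` flags, and once with the `MODEL_ID` and `REVISION` environment variables shown in the `[env: ...]` annotations. The values `gpt2` and `refs/pr/2` are simply the examples from the help text used as placeholders; that particular revision may not exist for that model, so substitute your own. The commands are only printed here, not executed.

```shell
# Placeholder values borrowed from the help text; replace with your own model and revision.
model_id="gpt2"
revision="refs/pr/2"

# Form 1: pass the options as command-line flags.
flag_cmd="text-generation-launcher --model-id $model_id --revision $revision"

# Form 2: pass the same options through the environment variables
# named in the [env: ...] annotations.
env_cmd="MODEL_ID=$model_id REVISION=$revision text-generation-launcher"

# Print both commands so they can be inspected before being run.
echo "$flag_cmd"
echo "$env_cmd"
```

Either printed line can then be pasted into a shell where `text-generation-launcher` is installed. Omitting `--model-id` would fall back to the documented default, `bigscience/bloom-560m`.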

VALIDATION_WORKERS

SHARDED

NUM_SHARD

QUANTIZE

DTYPE

TRUST_REMOTE_CODE

MAX_CONCURRENT_REQUESTS

MAX_BEST_OF

MAX_STOP_SEQUENCES

MAX_TOP_N_TOKENS

MAX_INPUT_LENGTH

MAX_TOTAL_TOKENS

WAITING_SERVED_RATIO

MAX_BATCH_PREFILL_TOKENS

MAX_BATCH_TOTAL_TOKENS

MAX_WAITING_TOKENS

HOSTNAME

PORT

SHARD_UDS_PATH

MASTER_ADDR

MASTER_PORT

HUGGINGFACE_HUB_CACHE

WEIGHTS_CACHE_OVERRIDE

DISABLE_CUSTOM_KERNELS

CUDA_MEMORY_FRACTION

ROPE_SCALING

ROPE_FACTOR

JSON_OUTPUT

OTLP_ENDPOINT

CORS_ALLOW_ORIGIN

WATERMARK_GAMMA

WATERMARK_DELTA

NGROK

NGROK_AUTHTOKEN

NGROK_EDGE

ENV

HELP

VERSION
