NVIDIA
NVIDIA provides an OpenAI-compatible API athttps://integrate.api.nvidia.com/v1 for Nemotron and NeMo models. Authenticate with an API key from NVIDIA NGC.
CLI setup
Export the key once, then run onboarding and set an NVIDIA model:--token, remember it lands in shell history and ps output; prefer the env var when possible.
Config snippet
Model IDs
| Model ref | Name | Context | Max output |
|---|---|---|---|
nvidia/nvidia/llama-3.1-nemotron-70b-instruct | NVIDIA Llama 3.1 Nemotron 70B Instruct | 131,072 | 4,096 |
nvidia/meta/llama-3.3-70b-instruct | Meta Llama 3.3 70B Instruct | 131,072 | 4,096 |
nvidia/nvidia/mistral-nemo-minitron-8b-8k-instruct | NVIDIA Mistral NeMo Minitron 8B Instruct | 8,192 | 2,048 |
Notes
- OpenAI-compatible
/v1endpoint; use an API key from NVIDIA NGC. - Provider auto-enables when
NVIDIA_API_KEYis set. - The bundled catalog is static; costs default to
0in source.