OpenClaw ships a bundled xai provider plugin for Grok models. For most
users, the recommended path is Grok OAuth with an eligible SuperGrok or X Premium
subscription. OpenClaw stays local-first: the Gateway, config, routing, and
tools run on your machine, while Grok model requests authenticate through xAI
and are sent to xAI’s API.
OAuth does not require an xAI API key, and it does not require the Grok Build
app. xAI may still show Grok Build on the consent screen because OpenClaw uses
xAI’s shared OAuth client.
Choose your setup path
Use the path that matches your OpenClaw install state.
New OpenClaw install
Run onboarding with daemon install when you are setting up a new local
Gateway, then choose the xAI/Grok OAuth option in the model/auth step.
On a VPS or over SSH, use device-code during onboarding.
OAuth does not require an xAI API key. OpenClaw does not require the Grok
Build app. xAI may still label the consent app as Grok Build because
OpenClaw uses xAI’s shared OAuth client.
Existing OpenClaw install
If OpenClaw is already configured, sign in to xAI only. Do not rerun full
onboarding or reinstall the daemon just to connect Grok:
openclaw models auth login --provider xai --method oauth
Use the device-code flow instead when the Gateway runs over SSH, Docker, or
a VPS and a localhost browser callback is awkward:
openclaw models auth login --provider xai --device-code
To make Grok the default model after signing in, apply it separately:
openclaw models set xai/grok-4.3
Rerun full onboarding only if you intentionally want to change Gateway,
daemon, channel, workspace, or other setup choices.
API-key path
API-key setup still works for xAI Console keys and for media surfaces that
require key-backed provider config:
OpenClaw uses the xAI Responses API as the bundled xAI transport. The same
credential from
openclaw models auth login --provider xai --method oauth,
openclaw models auth login --provider xai --device-code, or
openclaw models auth login --provider xai --method api-key can also power first-class
web_search, x_search, remote code_execution, and xAI image/video generation.
Speech and transcription currently require XAI_API_KEY or provider config.
Grok-backed web_search prefers xAI OAuth and falls back to XAI_API_KEY or
plugin web-search config.
If you store an xAI key under plugins.entries.xai.config.webSearch.apiKey,
the bundled xAI model provider reuses that key as a fallback too.
Set plugins.entries.xai.config.webSearch.baseUrl to route Grok web_search
and, by default, x_search through an operator xAI Responses proxy.
code_execution tuning lives under plugins.entries.xai.config.codeExecution.
OAuth troubleshooting
- If browser OAuth cannot reach 127.0.0.1:56121, use openclaw models auth login --provider xai --device-code.
- If sign-in succeeds but Grok is not the default model, run openclaw models set xai/grok-4.3.
- To inspect saved xAI auth profiles, run:
- xAI decides which accounts can receive OAuth API tokens. If an account is not eligible, try the API-key path or check the subscription on xAI’s side.
Built-in catalog
OpenClaw includes the current xAI chat models out of the box, ordered newest first in model pickers:

| Family | Model ids |
|---|---|
| Grok Build 0.1 | grok-build-0.1 |
| Grok 4.3 | grok-4.3 |
| Grok 4.20 Beta | grok-4.20-beta-latest-reasoning, grok-4.20-beta-latest-non-reasoning |
Legacy grok-code-fast aliases normalize to grok-build-0.1; OpenClaw no longer
shows the other retired upstream slugs in the selectable catalog.
OpenClaw feature coverage
The bundled plugin maps xAI’s current public API surface onto OpenClaw’s shared provider and tool contracts. Capabilities that don’t fit the shared contract (for example streaming TTS and realtime voice) are not exposed; see the table below.

| xAI capability | OpenClaw surface | Status |
|---|---|---|
| Chat / Responses | xai/<model> model provider | Yes |
| Server-side web search | web_search provider grok | Yes |
| Server-side X search | x_search tool | Yes |
| Server-side code execution | code_execution tool | Yes |
| Images | image_generate | Yes |
| Videos | video_generate | Yes |
| Batch text-to-speech | messages.tts.provider: "xai" / tts | Yes |
| Streaming TTS | - | Not exposed; OpenClaw’s TTS contract returns complete audio buffers |
| Batch speech-to-text | tools.media.audio / media understanding | Yes |
| Streaming speech-to-text | Voice Call streaming.provider: "xai" | Yes |
| Realtime voice | - | Not exposed yet; different session/WebSocket contract |
| Files / batches | Generic model API compatibility only | Not a first-class OpenClaw tool |
OpenClaw uses xAI’s REST image/video/TTS/STT APIs for media generation,
speech, and batch transcription, xAI’s streaming STT WebSocket for live
voice-call transcription, and the Responses API for model, search, and
code-execution tools. Features that need different OpenClaw contracts, such as
Realtime voice sessions, are documented here as upstream capabilities rather
than hidden plugin behavior.
Fast-mode mappings
/fast on or agents.defaults.models["xai/<model>"].params.fastMode: true
rewrites native xAI requests as follows:
| Source model | Fast-mode target |
|---|---|
| grok-3 | grok-3-fast |
| grok-3-mini | grok-3-mini-fast |
| grok-4 | grok-4-fast |
| grok-4-0709 | grok-4-fast |
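
The fast-mode table above amounts to a plain lookup. The following is a minimal Python sketch of that lookup, not OpenClaw's implementation. The names FAST_MODE_TARGETS and apply_fast_mode are hypothetical, and the pass-through behavior for models without a mapping is an assumption the source does not confirm.

```python
# Pairs copied from the fast-mode table; everything else here is illustrative.
FAST_MODE_TARGETS = {
    "grok-3": "grok-3-fast",
    "grok-3-mini": "grok-3-mini-fast",
    "grok-4": "grok-4-fast",
    "grok-4-0709": "grok-4-fast",
}


def apply_fast_mode(model_id: str, fast_mode: bool) -> str:
    """Return the fast-mode target for model_id when fast mode is on.

    Assumption: ids with no entry in the table are returned unchanged.
    """
    if not fast_mode:
        return model_id
    return FAST_MODE_TARGETS.get(model_id, model_id)


if __name__ == "__main__":
    print(apply_fast_mode("grok-4-0709", fast_mode=True))
    print(apply_fast_mode("grok-4.3", fast_mode=True))
```

Keeping the pairs in one table-shaped dict makes it easy to compare the sketch against the documented mappings row by row.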
Legacy compatibility aliases
Legacy aliases still normalize to the canonical bundled ids:

| Legacy alias | Canonical id |
|---|---|
| grok-code-fast-1 | grok-build-0.1 |
| grok-code-fast | grok-build-0.1 |
| grok-code-fast-1-0825 | grok-build-0.1 |
| grok-4-fast-reasoning | grok-4-fast |
| grok-4-1-fast-reasoning | grok-4-1-fast |
| grok-4.20-reasoning | grok-4.20-beta-latest-reasoning |
| grok-4.20-non-reasoning | grok-4.20-beta-latest-non-reasoning |
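
Alias normalization as described by the table can be sketched the same way. This Python example is illustrative only: LEGACY_ALIASES and normalize_model_id are hypothetical names, and it handles bare model ids (no xai/ prefix) for simplicity.

```python
# Pairs copied from the legacy alias table above.
LEGACY_ALIASES = {
    "grok-code-fast-1": "grok-build-0.1",
    "grok-code-fast": "grok-build-0.1",
    "grok-code-fast-1-0825": "grok-build-0.1",
    "grok-4-fast-reasoning": "grok-4-fast",
    "grok-4-1-fast-reasoning": "grok-4-1-fast",
    "grok-4.20-reasoning": "grok-4.20-beta-latest-reasoning",
    "grok-4.20-non-reasoning": "grok-4.20-beta-latest-non-reasoning",
}


def normalize_model_id(model_id: str) -> str:
    """Map a legacy alias to its canonical id; canonical ids pass through."""
    return LEGACY_ALIASES.get(model_id, model_id)


if __name__ == "__main__":
    for alias in ("grok-code-fast", "grok-4.20-reasoning", "grok-4.3"):
        print(alias, "->", normalize_model_id(alias))
```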
Features
Web search
The bundled grok web-search provider prefers xAI OAuth, then falls back
to XAI_API_KEY or a plugin web-search key.
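
The preference order for web-search credentials can be pictured as a short resolution chain. The Python sketch below is an assumption-laden illustration, not the plugin's code: resolve_web_search_credential is a hypothetical function, and the relative order of XAI_API_KEY and the plugin web-search key is assumed, since the text only says OAuth comes first.

```python
from typing import Mapping, Optional, Tuple


def resolve_web_search_credential(
    oauth_token: Optional[str],
    env: Mapping[str, str],
    plugin_web_search_key: Optional[str],
) -> Optional[Tuple[str, str]]:
    """Return (source, credential) for the first credential that is present.

    Order: xAI OAuth first (documented), then XAI_API_KEY, then the plugin
    web-search key (the order of the last two is an assumption).
    """
    if oauth_token:
        return ("oauth", oauth_token)
    env_key = env.get("XAI_API_KEY")
    if env_key:
        return ("env", env_key)
    if plugin_web_search_key:
        return ("plugin-config", plugin_web_search_key)
    return None


if __name__ == "__main__":
    # Placeholder values only; no real credentials are involved.
    print(resolve_web_search_credential(None, {"XAI_API_KEY": "example-key"}, None))
```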
Video generation
The bundled xai plugin registers video generation through the shared
video_generate tool.
- Default video model: xai/grok-imagine-video
- Modes: text-to-video, image-to-video, reference-image generation, remote video edit, and remote video extension
- Aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3
- Resolutions: 480P, 720P
- Duration: 1-15 seconds for generation/image-to-video, 1-10 seconds when using reference_image roles, 2-10 seconds for extension
- Reference-image generation: set imageRoles to reference_image for every supplied image; xAI accepts up to 7 such images
- Default operation timeout: 600 seconds unless video_generate.timeoutMs or agents.defaults.videoGenerationModel.timeoutMs is set
See Video Generation for shared tool parameters,
provider selection, and failover behavior.
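
The documented video limits can be checked before a request is sent. The Python sketch below encodes only the numbers listed above. The mode labels (generate, image_to_video, reference_image, extend) and the function name validate_video_request are hypothetical and do not reflect the video_generate tool's actual parameters.

```python
from typing import List, Optional

ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3"}
RESOLUTIONS = {"480P", "720P"}
# Hypothetical mode labels mapped to the documented (min, max) seconds.
DURATION_RANGES = {
    "generate": (1, 15),
    "image_to_video": (1, 15),
    "reference_image": (1, 10),
    "extend": (2, 10),
}
MAX_REFERENCE_IMAGES = 7


def validate_video_request(
    mode: str,
    duration_seconds: int,
    aspect_ratio: Optional[str] = None,
    resolution: Optional[str] = None,
    reference_image_count: int = 0,
) -> List[str]:
    """Return a list of problems; an empty list means the limits are met."""
    if mode not in DURATION_RANGES:
        return [f"unknown mode: {mode}"]
    problems: List[str] = []
    low, high = DURATION_RANGES[mode]
    if not low <= duration_seconds <= high:
        problems.append(f"duration must be {low}-{high} seconds for {mode}")
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {aspect_ratio}")
    if resolution is not None and resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    if reference_image_count > MAX_REFERENCE_IMAGES:
        problems.append(
            f"at most {MAX_REFERENCE_IMAGES} reference images are accepted"
        )
    return problems


if __name__ == "__main__":
    print(validate_video_request("extend", 1, aspect_ratio="16:9"))
```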
Image generation
The bundled xai plugin registers image generation through the shared
image_generate tool.
- Default image model: xai/grok-imagine-image
- Additional model: xai/grok-imagine-image-quality
- Modes: text-to-image and reference-image edit
- Reference inputs: one image or up to five images
- Aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2
- Resolutions: 1K, 2K
- Count: up to 4 images
- Default operation timeout: 600 seconds unless image_generate.timeoutMs or agents.defaults.imageGenerationModel.timeoutMs is set
OpenClaw requests b64_json image responses so generated media can be
stored and delivered through the normal channel attachment path. Local
reference images are converted to data URLs; remote http(s) references are
passed through.
To use xAI as the default image provider, point agents.defaults.imageGenerationModel at an xAI image model such as xai/grok-imagine-image.
xAI also documents quality, mask, user, and additional native ratios
such as 1:2, 2:1, 9:20, and 20:9. OpenClaw forwards only the
shared cross-provider image controls today; unsupported native-only knobs
are intentionally not exposed through image_generate.
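
The handling of reference images (local files become data URLs, remote http(s) references pass through) can be sketched with the Python standard library. This is an illustration of the idea under stated assumptions, not the plugin's code: to_image_reference is a hypothetical helper, and the MIME type is guessed from the file extension.

```python
import base64
import mimetypes
from pathlib import Path


def to_image_reference(ref: str) -> str:
    """Pass remote URLs through; encode a local file as a base64 data URL."""
    if ref.startswith(("http://", "https://")):
        return ref
    path = Path(ref)
    # Assumption: the extension is a good enough hint for the MIME type.
    mime, _ = mimetypes.guess_type(path.name)
    mime = mime or "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


if __name__ == "__main__":
    print(to_image_reference("https://example.com/reference.png"))
```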
Text-to-speech
The bundled xai plugin registers text-to-speech through the shared tts
provider surface.
- Voices: eve, ara, rex, sal, leo, una
- Default voice: eve
- Formats: mp3, wav, pcm, mulaw, alaw
- Language: BCP-47 code or auto
- Speed: provider-native speed override
- Native Opus voice-note format is not supported
OpenClaw uses xAI’s batch /v1/tts endpoint. xAI also offers streaming TTS
over WebSocket, but the OpenClaw speech provider contract currently expects
a complete audio buffer before reply delivery.
Speech-to-text
The bundled xai plugin registers batch speech-to-text through OpenClaw’s
media-understanding transcription surface.
- Default model: grok-stt
- Endpoint: xAI REST /v1/stt
- Input path: multipart audio file upload
- Supported by OpenClaw wherever inbound audio transcription uses tools.media.audio, including Discord voice-channel segments and channel audio attachments
Language can be supplied through the shared audio media config or per-call
transcription request. Prompt hints are accepted by the shared OpenClaw
surface, but the xAI REST STT integration only forwards file, model, and
language because those map cleanly to the current public xAI endpoint.
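
The field filtering described for batch speech-to-text (only file, model, and language are forwarded, with grok-stt as the default model) can be illustrated in a few lines of Python. The request shape and the name build_stt_form_fields are hypothetical; this does not show the real multipart upload.

```python
from typing import Any, Dict

FORWARDED_FIELDS = ("file", "model", "language")
DEFAULT_STT_MODEL = "grok-stt"


def build_stt_form_fields(request: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields the text says are forwarded to the xAI endpoint.

    Other keys, such as a prompt hint, are accepted upstream but dropped here.
    """
    fields = {
        key: request[key]
        for key in FORWARDED_FIELDS
        if request.get(key) is not None
    }
    fields.setdefault("model", DEFAULT_STT_MODEL)
    return fields


if __name__ == "__main__":
    print(build_stt_form_fields(
        {"file": "clip.ogg", "language": "en", "prompt": "names: Ada, Grace"}
    ))
```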
Streaming speech-to-text
The bundled xai plugin also registers a realtime transcription provider
for live voice-call audio.
- Endpoint: xAI WebSocket wss://api.x.ai/v1/stt
- Default encoding: mulaw
- Default sample rate: 8000
- Default endpointing: 800ms
- Interim transcripts: enabled by default
Provider-owned config lives under
plugins.entries.voice-call.config.streaming.providers.xai. Supported
keys are apiKey, baseUrl, sampleRate, encoding (pcm, mulaw, or
alaw), interimResults, endpointingMs, and language.
This streaming provider is for Voice Call’s realtime transcription path.
Discord voice currently records short segments and uses the batch
tools.media.audio transcription path instead.
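
The supported keys and defaults for the streaming provider can be collected into one structure. The Python dataclass below is a sketch for illustration: the field names and default values come from the text above, while the class name XaiStreamingSttConfig and the validation step are assumptions, not part of OpenClaw.

```python
from dataclasses import dataclass
from typing import Optional

SUPPORTED_ENCODINGS = ("pcm", "mulaw", "alaw")


@dataclass
class XaiStreamingSttConfig:
    """Documented keys with their documented defaults, where one is stated."""

    apiKey: Optional[str] = None
    baseUrl: Optional[str] = None
    sampleRate: int = 8000          # default sample rate
    encoding: str = "mulaw"         # default encoding
    interimResults: bool = True     # interim transcripts enabled by default
    endpointingMs: int = 800        # default endpointing
    language: Optional[str] = None

    def __post_init__(self) -> None:
        # Validation is an assumption added for the sketch.
        if self.encoding not in SUPPORTED_ENCODINGS:
            raise ValueError(f"unsupported encoding: {self.encoding}")


if __name__ == "__main__":
    print(XaiStreamingSttConfig(encoding="pcm", sampleRate=16000))
```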
x_search configuration
The bundled xAI plugin exposes x_search as an OpenClaw tool for searching
X (formerly Twitter) content via Grok.
Config path: plugins.entries.xai.config.xSearch

| Key | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | - | Enable or disable x_search |
| model | string | grok-4-1-fast | Model used for x_search requests |
| baseUrl | string | - | xAI Responses base URL override |
| inlineCitations | boolean | - | Include inline citations in results |
| maxTurns | number | - | Maximum conversation turns |
| timeoutSeconds | number | - | Request timeout in seconds |
| cacheTtlMinutes | number | - | Cache time-to-live in minutes |
Code execution configuration
The bundled xAI plugin exposes code_execution as an OpenClaw tool for
remote code execution in xAI’s sandbox environment.
Config path: plugins.entries.xai.config.codeExecution

| Key | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true (if key available) | Enable or disable code execution |
| model | string | grok-4-1-fast | Model used for code execution requests |
| maxTurns | number | - | Maximum conversation turns |
| timeoutSeconds | number | - | Request timeout in seconds |

This is remote xAI sandbox execution, not local exec.
Known limits
- xAI auth can use an API key, environment variable, plugin config fallback, browser OAuth, or device-code OAuth with an eligible xAI account. Browser OAuth uses a local callback on 127.0.0.1:56121; for remote hosts, use xai-device-code unless you want to forward that port before opening the sign-in URL. xAI decides which accounts can receive OAuth API tokens, and the consent page may show Grok Build even though OpenClaw does not require the Grok Build app.
- grok-4.20-multi-agent-experimental-beta-0304 is not supported on the normal xAI provider path because it requires a different upstream API surface than the standard OpenClaw xAI transport.
- xAI Realtime voice is not registered as an OpenClaw provider yet. It needs a different bidirectional voice session contract than batch STT or streaming transcription.
- xAI image quality, image mask, and extra native-only aspect ratios are not exposed until the shared image_generate tool has corresponding cross-provider controls.
Advanced notes
- OpenClaw applies xAI-specific tool-schema and tool-call compatibility fixes automatically on the shared runner path.
- Native xAI requests default tool_stream: true. Set agents.defaults.models["xai/<model>"].params.tool_stream to false to disable it.
- The bundled xAI wrapper strips unsupported strict tool-schema flags and reasoning payload keys before sending native xAI requests.
- web_search, x_search, and code_execution are exposed as OpenClaw tools. OpenClaw enables the specific xAI built-in it needs inside each tool request instead of attaching all native tools to every chat turn.
- Grok web_search reads plugins.entries.xai.config.webSearch.baseUrl. x_search reads plugins.entries.xai.config.xSearch.baseUrl, then falls back to the Grok web-search base URL.
- x_search and code_execution are owned by the bundled xAI plugin rather than hardcoded into the core model runtime.
- code_execution is remote xAI sandbox execution, not local exec.
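
The base URL fallback for x_search described in the notes above can be sketched as a nested lookup. The Python example below assumes the config is available as a nested mapping that mirrors the documented paths; the helper names are hypothetical and the proxy URL is a placeholder.

```python
from typing import Any, Mapping, Optional

XAI_CONFIG_PATH = ("plugins", "entries", "xai", "config")


def _lookup(config: Mapping[str, Any], *path: str) -> Optional[Any]:
    """Walk nested mappings, returning None when any key is missing."""
    node: Any = config
    for key in path:
        if not isinstance(node, Mapping) or key not in node:
            return None
        node = node[key]
    return node


def web_search_base_url(config: Mapping[str, Any]) -> Optional[str]:
    return _lookup(config, *XAI_CONFIG_PATH, "webSearch", "baseUrl")


def x_search_base_url(config: Mapping[str, Any]) -> Optional[str]:
    """Prefer xSearch.baseUrl, then fall back to the web-search base URL."""
    own = _lookup(config, *XAI_CONFIG_PATH, "xSearch", "baseUrl")
    return own or web_search_base_url(config)


if __name__ == "__main__":
    config = {"plugins": {"entries": {"xai": {"config": {
        "webSearch": {"baseUrl": "https://proxy.example/v1"},
    }}}}}
    print(x_search_base_url(config))
```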
Live testing
The xAI media paths are covered by unit tests and opt-in live suites. Export XAI_API_KEY in the process environment before running live probes.
Related
Model selection
Choosing providers, model refs, and failover behavior.
Video generation
Shared video tool parameters and provider selection.
All providers
The broader provider overview.
Troubleshooting
Common issues and fixes.