Skip to content

Inference

ARIS calls an LLM provider for chat:

  • Chat — the core container runs the chat agent, which calls the model for completions and tool selection.

Chat reads ARIS_INFERENCE_PROVIDER, ARIS_INFERENCE_BASE_URL, ARIS_INFERENCE_API_KEY, and ARIS_INFERENCE_CHAT_MODEL from .env.

Shared variables

Variable Required Default What it does
ARIS_INFERENCE_PROVIDER yes for chat unset, treated as mock Provider type — see Supported providers. When unset or mock, Chat is disabled.
ARIS_INFERENCE_BASE_URL provider-dependent https://api.openai.com/v1 for openai, https://api.anthropic.com/v1 for anthropic, none otherwise Base URL of the provider's API. For openai-compatible and litellm, set this to your gateway URL.
ARIS_INFERENCE_API_KEY yes for chat API key passed to the provider. Server-side only; never exposed to the browser.

Supported providers

Value Chat
openai yes
openai-compatible yes
litellm yes
anthropic via proxy
google via proxy
bedrock via proxy
vertex via proxy
mock disabled

The OpenAI-compatible values (openai, openai-compatible, litellm) work for chat. ARIS speaks the OpenAI Chat Completions wire shape and the gateway behind ARIS_INFERENCE_BASE_URL handles routing to the underlying model.

Using Google Gemini, Bedrock, or Vertex

ARIS does not call Google Gemini, Amazon Bedrock, or Google Vertex AI directly. To use models hosted on those services, run a LiteLLM proxy (or any OpenAI-compatible gateway), set ARIS_INFERENCE_PROVIDER=litellm, point ARIS_INFERENCE_BASE_URL at the proxy, and use the gateway's model identifier in ARIS_INFERENCE_CHAT_MODEL.

Where these variables run

The same .env is loaded into both the web and core containers, so the three variables above only need to be set once. Each container reads them at startup:

  • Core reads them when the chat handler initialises.

If a value changes, restart both containers so each one sees it.

See also