AI Chat Models

MachinaOs supports six AI providers for chat completions, with models fetched dynamically from each provider’s API.

Available Providers

| Provider | Models | Best For |
| --- | --- | --- |
| OpenAI | GPT-4o, GPT-4 Turbo, o1, o3, o4-mini | General purpose, reasoning |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku | Coding, analysis, extended thinking |
| Google | Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 Flash Thinking | Multimodal, long context |
| OpenRouter | 200+ models | Access to multiple providers via a single API |
| Groq | Llama, Mixtral, Qwen | Ultra-fast inference |
| Cerebras | Llama, Qwen | Ultra-fast inference on custom hardware |

Adding API Keys

  1. Click the key icon in the toolbar
  2. Select the provider
  3. Enter your API key
  4. Click Validate to test
API keys are encrypted and stored locally. They’re never sent to MachinaOs servers.

OpenAI Chat Model

Models

| Model | Best For |
| --- | --- |
| gpt-4o | Most capable, multimodal |
| gpt-4-turbo | Fast, cost-effective GPT-4 |
| o1 | Complex reasoning tasks |
| o3 | Advanced reasoning |
| o4-mini | Fast, efficient reasoning |

Parameters

| Parameter | Type | Required / Default | Description |
| --- | --- | --- | --- |
| model | select | required | The model to use |
| prompt | string | required | The message to send. Supports template variables. |
| temperature | slider | 0.7 | Randomness (0 = deterministic, 1 = creative) |
| maxTokens | number | 1000 | Maximum response length |
| responseFormat | select | text | Output format: text or json_object |
| reasoningEffort | select | medium | For o-series models: minimal, low, medium, or high reasoning effort |

Output

```json
{
  "response": "The AI's response text",
  "model": "gpt-4o",
  "thinking": "Reasoning process (o-series only)",
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 100,
    "total_tokens": 150
  }
}
```
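Downstream logic can read these fields directly. As a minimal sketch in plain Python (the helper name is hypothetical, not part of MachinaOs), consuming and sanity-checking the output object might look like:

```python
# Sketch: consume a chat-model node's output dict (field names from the schema above).
def summarize_output(output: dict) -> str:
    usage = output.get("usage", {})
    # total_tokens should equal prompt_tokens + completion_tokens
    assert usage.get("total_tokens") == usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0)
    thinking = output.get("thinking")  # present for o-series models only
    parts = [f"{output['model']}: {usage.get('total_tokens', 0)} tokens"]
    if thinking:
        parts.append("(includes reasoning trace)")
    return " ".join(parts)

example = {
    "response": "The AI's response text",
    "model": "gpt-4o",
    "usage": {"prompt_tokens": 50, "completion_tokens": 100, "total_tokens": 150},
}
print(summarize_output(example))  # gpt-4o: 150 tokens
```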

Anthropic Claude Model

Models

| Model | Best For |
| --- | --- |
| claude-3-5-sonnet-20241022 | Best for coding and complex tasks |
| claude-3-opus-20240229 | Most capable, detailed analysis |
| claude-3-haiku-20240307 | Fast responses, simple tasks |

Parameters

| Parameter | Type | Required / Default | Description |
| --- | --- | --- | --- |
| model | select | required | Claude model to use |
| prompt | string | required | The message to send |
| systemPrompt | string | — | System instructions for the model |
| temperature | slider | 0.7 | Randomness (0-1) |
| maxTokens | number | 1000 | Maximum response length |
| thinkingEnabled | boolean | false | Enable extended thinking mode (Claude 3.5 Sonnet, Claude 3 Opus) |
| thinkingBudget | number | 2048 | Token budget for thinking (1024-16000). Shown when thinkingEnabled is true. |

Extended Thinking

Claude’s extended thinking mode shows the model’s reasoning process:
```json
{
  "response": "Claude's final response",
  "thinking": "Let me analyze this step by step...",
  "model": "claude-3-5-sonnet-20241022",
  "stop_reason": "end_turn"
}
```
When thinking is enabled, maxTokens must be greater than thinkingBudget, and temperature is automatically set to 1.
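These constraints are easy to pre-check before dispatching a request. A minimal sketch (a hypothetical helper, not MachinaOs internals), using the budget range and rules stated above:

```python
# Sketch: validate extended-thinking settings per the rules above.
def validate_thinking_config(max_tokens: int, thinking_budget: int) -> dict:
    if not 1024 <= thinking_budget <= 16000:
        raise ValueError("thinkingBudget must be between 1024 and 16000")
    if max_tokens <= thinking_budget:
        raise ValueError("maxTokens must exceed thinkingBudget when thinking is enabled")
    # Temperature is forced to 1 whenever thinking is enabled.
    return {"max_tokens": max_tokens, "thinking_budget": thinking_budget, "temperature": 1}

config = validate_thinking_config(max_tokens=4096, thinking_budget=2048)
print(config["temperature"])  # 1
```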

Google Gemini Model

Models

| Model | Best For |
| --- | --- |
| gemini-2.5-pro | Most intelligent, complex tasks |
| gemini-2.5-flash | Fast, frontier performance |
| gemini-2.0-flash-thinking | Reasoning with thinking output |

Parameters

| Parameter | Type | Required / Default | Description |
| --- | --- | --- | --- |
| model | select | required | Gemini model to use |
| prompt | string | required | The message to send |
| temperature | slider | 0.7 | Randomness (0-1) |
| maxTokens | number | 1000 | Maximum response length |
| safetySettings | select | default | Content safety level |
| thinkingEnabled | boolean | false | Enable thinking mode (Gemini 2.5 models, Flash Thinking) |

Output

```json
{
  "response": "Gemini's response",
  "thinking": "Reasoning process (when enabled)",
  "model": "gemini-2.5-pro"
}
```

OpenRouter Model

OpenRouter provides access to 200+ models from multiple providers through a single API.

Features

  • Unified API: One API key for OpenAI, Anthropic, Google, Meta, Mistral, and more
  • Free Models: Some models available at no cost (marked with [FREE] prefix)
  • Fallback: Automatic model fallback if primary is unavailable

Models

Models are grouped by cost in the dropdown:
  • Free models: [FREE] prefix, no cost
  • Paid models: Standard pricing per provider
Popular models include:
  • openai/gpt-4o
  • anthropic/claude-3.5-sonnet
  • google/gemini-2.5-pro
  • meta-llama/llama-3.1-405b-instruct
  • mistralai/mixtral-8x22b-instruct

Parameters

| Parameter | Type | Required / Default | Description |
| --- | --- | --- | --- |
| model | select | required | Model in provider/model-name format |
| prompt | string | required | The message to send |
| temperature | slider | 0.7 | Randomness (0-1) |
| maxTokens | number | 1000 | Maximum response length |

Output

```json
{
  "response": "Model's response",
  "model": "openai/gpt-4o",
  "provider": "openrouter"
}
```

Groq Model

Groq provides ultra-fast inference on custom LPU (Language Processing Unit) hardware.

Models

| Model | Best For |
| --- | --- |
| llama-3.1-70b-versatile | General purpose, fast |
| llama-3.1-8b-instant | Ultra-fast, simple tasks |
| mixtral-8x7b-32768 | Long context, reasoning |
| qwen3-32b | Reasoning with parsed output |
| qwq-32b | Advanced reasoning |

Parameters

| Parameter | Type | Required / Default | Description |
| --- | --- | --- | --- |
| model | select | required | Groq model to use |
| prompt | string | required | The message to send |
| temperature | slider | 0.7 | Randomness (0-1) |
| maxTokens | number | 1000 | Maximum response length |
| reasoningFormat | select | parsed | For Qwen3/QwQ models: "parsed" returns reasoning alongside the answer, "hidden" returns only the final answer |

Reasoning Output

Qwen3 and QwQ models support reasoning output:
```json
{
  "response": "The final answer",
  "thinking": "Step-by-step reasoning process",
  "model": "qwen3-32b"
}
```
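If you ever need to separate these fields yourself, for example from a raw completion where the reasoning arrives inline in `<think>` tags (a common convention for Qwen-style models; this is an assumption about the raw format, not documented MachinaOs behavior), the split can be sketched as:

```python
import re

# Sketch: split an inline "<think>...</think>" reasoning block from the final answer.
# Assumes Qwen/QwQ-style inline tags; the node's "parsed" format does this for you.
def split_reasoning(text: str) -> dict:
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    thinking = match.group(1).strip() if match else None
    response = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return {"response": response, "thinking": thinking}

raw = "<think>2 + 2 is basic arithmetic.</think>The answer is 4."
print(split_reasoning(raw))
```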

Cerebras Model

Cerebras provides ultra-fast inference on custom wafer-scale AI hardware.

Models

| Model | Best For |
| --- | --- |
| llama3.1-8b | Fast, efficient |
| llama3.1-70b | Capable, balanced |
| qwen-2.5-32b | Reasoning tasks |

Parameters

| Parameter | Type | Required / Default | Description |
| --- | --- | --- | --- |
| model | select | required | Cerebras model to use |
| prompt | string | required | The message to send |
| temperature | slider | 0.7 | Randomness (0-1) |
| maxTokens | number | 1000 | Maximum response length |

Output

```json
{
  "response": "Cerebras model response",
  "model": "llama3.1-70b"
}
```

Thinking/Reasoning Modes

Several providers support extended thinking or reasoning modes that show the model’s internal reasoning process.
| Provider | Models | Parameter |
| --- | --- | --- |
| Claude | 3.5 Sonnet, 3 Opus | thinkingBudget (tokens) |
| Gemini | 2.5 Pro/Flash, Flash Thinking | thinkingBudget (tokens) |
| OpenAI | o1, o3, o4 series | reasoningEffort (level) |
| Groq | Qwen3, QwQ | reasoningFormat (parsed/hidden) |

Using Thinking Output

The thinking field is available in the node output for downstream nodes:
```
{{openaiChatModel.thinking}}
{{anthropicChatModel.thinking}}
```
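Resolving these expressions amounts to simple placeholder substitution against a map of node outputs. A simplified sketch (the real template engine may support richer paths; the function name is hypothetical):

```python
import re

# Sketch: resolve {{node.field}} placeholders against a dict of node outputs.
# Unknown placeholders are left intact rather than replaced with empty strings.
def render_template(template: str, outputs: dict) -> str:
    def resolve(match: re.Match) -> str:
        node, field = match.group(1), match.group(2)
        return str(outputs.get(node, {}).get(field, match.group(0)))
    return re.sub(r"\{\{(\w+)\.(\w+)\}\}", resolve, template)

outputs = {"openaiChatModel": {"thinking": "step-by-step trace", "response": "42"}}
print(render_template("Reasoning: {{openaiChatModel.thinking}}", outputs))
# Reasoning: step-by-step trace
```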

Comparing Providers

| Feature | OpenAI | Claude | Gemini | OpenRouter | Groq | Cerebras |
| --- | --- | --- | --- | --- | --- | --- |
| Speed | Fast | Medium | Fast | Varies | Ultra-fast | Ultra-fast |
| Reasoning | o-series | Extended thinking | Thinking mode | Model-dependent | Qwen3/QwQ | — |
| Context Window | 128K | 200K | 1M+ | Varies | 32K-128K | 128K |
| Multimodal | Yes | Yes | Yes | Model-dependent | No | No |
| JSON Mode | Yes | No | No | Model-dependent | No | No |

Common Use Cases

Text Generation

Prompt: Write a product description for: {{input.product_name}}
Temperature: 0.8

Data Extraction

Prompt: Extract the email and phone from: {{input.text}}
Response Format: json_object
Temperature: 0
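With responseFormat set to json_object, the model is constrained to emit valid JSON that downstream nodes can parse directly. A sketch of defensive parsing (the helper and required keys are illustrative, not MachinaOs APIs):

```python
import json

# Sketch: parse a json_object-mode response, failing loudly on malformed output.
def parse_extraction(response_text: str, required_keys: tuple = ("email", "phone")) -> dict:
    data = json.loads(response_text)  # raises a ValueError subclass on invalid JSON
    missing = [k for k in required_keys if k not in data]
    if missing:
        raise KeyError(f"model omitted expected keys: {missing}")
    return data

sample = '{"email": "ada@example.com", "phone": "+1-555-0100"}'
print(parse_extraction(sample))
```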

Complex Reasoning (with thinking)

Model: claude-3-5-sonnet
Thinking Enabled: true
Thinking Budget: 4096
Prompt: Analyze this code and explain the bug: {{input.code}}

Tips

  • Use temperature 0 for deterministic outputs like data extraction.
  • Use temperature 0.7-0.9 for creative writing tasks.
  • Enable thinking mode for complex reasoning tasks that benefit from step-by-step analysis.
  • Use OpenRouter to experiment with different models without managing multiple API keys.
  • API calls cost money. Monitor your usage in your provider’s dashboard.

Error Handling

| Error | Cause | Solution |
| --- | --- | --- |
| 401 Unauthorized | Invalid API key | Check or update your API key |
| 429 Rate Limited | Too many requests | Add a delay, reduce request frequency |
| 500 Server Error | Provider-side issue | Retry later |
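The 429 and 500 rows are transient and lend themselves to retry with exponential backoff, while a 401 should fail immediately (retrying a bad key never helps). A generic sketch, not tied to any provider SDK:

```python
import time

class TransientError(Exception):
    """Stand-in for a 429 or 500 response; a 401 (bad key) should not be retried."""

# Sketch: retry a callable on transient errors with exponential backoff.
def call_with_retry(fn, retries: int = 3, base_delay: float = 0.01):
    for attempt in range(retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == retries:
                raise
            time.sleep(base_delay * (2 ** attempt))  # back off: d, 2d, 4d, ...

attempts = {"n": 0}
def flaky():
    # Simulated provider call that rate-limits twice, then succeeds.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("429 Rate Limited")
    return "ok"

print(call_with_retry(flaky))  # ok
```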