AI Chat Models
MachinaOs supports six AI providers for chat completions, with models fetched dynamically from each provider’s API.
Available Providers
| Provider | Models | Best For |
|---|---|---|
| OpenAI | GPT-4o, GPT-4 Turbo, o1, o3, o4-mini | General purpose, reasoning |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku | Coding, analysis, extended thinking |
| Gemini | Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 Flash Thinking | Multimodal, long context |
| OpenRouter | 200+ models | Access multiple providers via single API |
| Groq | Llama, Mixtral, Qwen | Ultra-fast inference |
| Cerebras | Llama, Qwen | Ultra-fast on custom hardware |
Adding API Keys
- Click the key icon in the toolbar
- Select the provider
- Enter your API key
- Click Validate to test
API keys are encrypted and stored locally. They’re never sent to MachinaOs servers.
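The Validate step boils down to a lightweight authenticated request against the provider. How MachinaOs performs this internally is not documented here; the sketch below only shows the header shapes each provider’s public HTTP API expects.

```python
# Sketch: authentication headers per provider. Header names follow each
# provider's public HTTP API docs; the validation endpoint MachinaOs
# actually calls is not specified on this page.

def auth_headers(provider: str, api_key: str) -> dict:
    """Return the HTTP headers a provider expects for authentication."""
    if provider == "anthropic":
        # Anthropic uses x-api-key plus a mandatory version header
        return {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
    if provider == "gemini":
        # Google's Gemini REST API accepts the key in a dedicated header
        return {"x-goog-api-key": api_key}
    # OpenAI, OpenRouter, Groq, and Cerebras all use Bearer auth
    return {"Authorization": f"Bearer {api_key}"}
```

A 200 response to a cheap call (such as listing models) confirms the key; a 401 means it is invalid.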
OpenAI Chat Model
Models
| Model | Best For |
|---|---|
| gpt-4o | Most capable, multimodal |
| gpt-4-turbo | Fast, cost-effective GPT-4 |
| o1 | Complex reasoning tasks |
| o3 | Advanced reasoning |
| o4-mini | Fast, efficient reasoning |
Parameters
| Parameter | Description |
|---|---|
| Model | The model to use |
| Message | The message to send. Supports template variables. |
| Temperature | Randomness (0 = deterministic, 1 = creative) |
| Max tokens | Maximum response length |
| Response format | Output format: text or json_object |
| Reasoning effort | For o-series models: minimal, low, medium, or high reasoning effort |
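These parameters map onto a standard OpenAI Chat Completions request. A minimal sketch of the request body, assuming the public Chat Completions API (the field names below are that API’s, not necessarily MachinaOs internals):

```python
# Sketch: OpenAI Chat Completions request body, per the public API.

def build_openai_request(model, message, temperature=0.7, max_tokens=1024,
                         response_format="text", reasoning_effort=None):
    body = {
        "model": model,
        "messages": [{"role": "user", "content": message}],
    }
    if reasoning_effort is not None:
        # o-series models take reasoning_effort and max_completion_tokens,
        # and ignore sampling controls like temperature
        body["reasoning_effort"] = reasoning_effort
        body["max_completion_tokens"] = max_tokens
    else:
        body["temperature"] = temperature
        body["max_tokens"] = max_tokens
    if response_format == "json_object":
        # JSON mode: the prompt itself must also ask for JSON output
        body["response_format"] = {"type": "json_object"}
    return body
```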
Output
Anthropic Claude Model
Models
| Model | Best For |
|---|---|
| claude-3-5-sonnet-20241022 | Best for coding and complex tasks |
| claude-3-opus-20240229 | Most capable, detailed analysis |
| claude-3-haiku-20240307 | Fast responses, simple tasks |
Parameters
| Parameter | Description |
|---|---|
| Model | Claude model to use |
| Message | The message to send |
| System | System instructions for the model |
| Temperature | Randomness (0-1) |
| Max tokens | Maximum response length |
| thinkingEnabled | Enable extended thinking mode (Claude 3.5 Sonnet, Claude 3 Opus) |
| thinkingBudget | Token budget for thinking (1024-16000). Shown when thinkingEnabled is true. |
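The thinking parameters map onto Anthropic’s Messages API. A sketch of the request body with extended thinking enabled, assuming the public API shape (how MachinaOs names these fields internally is not specified here):

```python
# Sketch: Anthropic Messages API body with extended thinking, per the
# public API. budget_tokens must stay below max_tokens, and sampling
# parameters like temperature cannot be customized while thinking is on.

def build_claude_request(model, message, system=None, max_tokens=4096,
                         thinking_enabled=False, thinking_budget=1024):
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": message}],
    }
    if system:
        body["system"] = system
    if thinking_enabled:
        # Minimum budget is 1024 tokens, matching the range above
        body["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget}
    return body
```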
Extended Thinking
Claude’s extended thinking mode shows the model’s reasoning process alongside the final answer.
Google Gemini Model
Models
| Model | Best For |
|---|---|
| gemini-2.5-pro | Most intelligent, complex tasks |
| gemini-2.5-flash | Fast, frontier performance |
| gemini-2.0-flash-thinking | Reasoning with thinking output |
Parameters
| Parameter | Description |
|---|---|
| Model | Gemini model to use |
| Message | The message to send |
| Temperature | Randomness (0-1) |
| Max tokens | Maximum response length |
| Safety settings | Content safety level |
| Thinking | Enable thinking mode (Gemini 2.5 models, Flash Thinking) |
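These parameters correspond to Gemini’s generateContent REST body. A sketch assuming the v1beta API shapes (the single safety setting shown is illustrative; a real request would usually set a threshold per harm category):

```python
# Sketch: Gemini generateContent request body, per the v1beta REST API.

def build_gemini_request(message, temperature=0.7, max_tokens=2048,
                         thinking_budget=None):
    config = {"temperature": temperature, "maxOutputTokens": max_tokens}
    if thinking_budget is not None:
        # Gemini 2.5: 0 disables thinking (Flash only), -1 lets the
        # model pick its own budget dynamically
        config["thinkingConfig"] = {"thinkingBudget": thinking_budget}
    return {
        "contents": [{"role": "user", "parts": [{"text": message}]}],
        "generationConfig": config,
        "safetySettings": [
            {"category": "HARM_CATEGORY_HARASSMENT",
             "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
        ],
    }
```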
Output
OpenRouter Model
OpenRouter provides access to 200+ models from multiple providers through a single API.
Features
- Unified API: One API key for OpenAI, Anthropic, Google, Meta, Mistral, and more
- Free Models: Some models available at no cost (marked with [FREE] prefix)
- Fallback: Automatic model fallback if primary is unavailable
Models
Models are grouped by cost in the dropdown:
- Free models: [FREE] prefix, no cost
- Paid models: Standard pricing per provider
Example model IDs:
- openai/gpt-4o
- anthropic/claude-3.5-sonnet
- google/gemini-2.5-pro
- meta-llama/llama-3.1-405b-instruct
- mistralai/mixtral-8x22b-instruct
Parameters
| Parameter | Description |
|---|---|
| Model | Model in format: provider/model-name |
| Message | The message to send |
| Temperature | Randomness (0-1) |
| Max tokens | Maximum response length |
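OpenRouter speaks the OpenAI chat-completions dialect, and its documented fallback mechanism is a `models` list tried in order. A sketch of a request body using it (model IDs are illustrative):

```python
# Sketch: OpenRouter request body (OpenAI-compatible dialect) with
# fallback routing via the documented `models` list.

def build_openrouter_request(model, message, fallbacks=(),
                             temperature=0.7, max_tokens=1024):
    body = {
        "model": model,  # format: provider/model-name
        "messages": [{"role": "user", "content": message}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    if fallbacks:
        # Primary model first, then fallbacks, tried in order
        body["models"] = [model, *fallbacks]
    return body
```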
Output
Groq Model
Groq provides ultra-fast inference on custom LPU (Language Processing Unit) hardware.
Models
| Model | Best For |
|---|---|
| llama-3.1-70b-versatile | General purpose, fast |
| llama-3.1-8b-instant | Ultra-fast, simple tasks |
| mixtral-8x7b-32768 | Long context, reasoning |
| qwen3-32b | Reasoning with parsed output |
| qwq-32b | Advanced reasoning |
Parameters
| Parameter | Description |
|---|---|
| Model | Groq model to use |
| Message | The message to send |
| Temperature | Randomness (0-1) |
| Max tokens | Maximum response length |
| reasoningFormat | For Qwen3/QwQ models: “parsed” returns reasoning, “hidden” returns only the final answer |
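Groq’s API is also OpenAI-compatible; `reasoning_format` is its documented parameter for the Qwen/QwQ reasoning models. A sketch of the request body:

```python
# Sketch: Groq chat-completions body with reasoning_format, per Groq's
# public API docs for Qwen3/QwQ models.

def build_groq_request(model, message, temperature=0.7, max_tokens=1024,
                       reasoning_format=None):
    body = {
        "model": model,
        "messages": [{"role": "user", "content": message}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    if reasoning_format is not None:
        # "parsed" returns reasoning in a separate field;
        # "hidden" strips it and returns only the final answer
        body["reasoning_format"] = reasoning_format
    return body
```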
Reasoning Output
Qwen3 and QwQ models can return their step-by-step reasoning separately from the final answer.
Cerebras Model
Cerebras provides ultra-fast inference on custom wafer-scale AI hardware.
Models
| Model | Best For |
|---|---|
| llama3.1-8b | Fast, efficient |
| llama3.1-70b | Capable, balanced |
| qwen-2.5-32b | Reasoning tasks |
Parameters
| Parameter | Description |
|---|---|
| Model | Cerebras model to use |
| Message | The message to send |
| Temperature | Randomness (0-1) |
| Max tokens | Maximum response length |
Output
Thinking/Reasoning Modes
Several providers support extended thinking or reasoning modes that show the model’s internal reasoning process.
| Provider | Models | Parameter |
|---|---|---|
| Claude | 3.5 Sonnet, 3 Opus | thinkingBudget (tokens) |
| Gemini | 2.5 Pro/Flash, Flash Thinking | thinkingBudget (tokens) |
| OpenAI | o1, o3, o4 series | reasoningEffort (level) |
| Groq | Qwen3, QwQ | reasoningFormat (parsed/hidden) |
Using Thinking Output
The thinking field is available in the node output for downstream nodes.
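A hypothetical sketch of how a downstream node might separate the reasoning trace from the answer (the output field names here are assumed from this page, not a documented schema):

```python
# Hypothetical node output shape: field names assumed for illustration.
node_output = {
    "text": "The answer is 42.",
    "thinking": "First, consider the constraints of the problem...",
}

# A downstream node could log the reasoning for auditing while passing
# only the final answer on to the next step:
answer = node_output["text"]
trace = node_output.get("thinking", "")  # empty if thinking was disabled
```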
Comparing Providers
| Feature | OpenAI | Claude | Gemini | OpenRouter | Groq | Cerebras |
|---|---|---|---|---|---|---|
| Speed | Fast | Medium | Fast | Varies | Ultra-fast | Ultra-fast |
| Reasoning | o-series | Extended thinking | Thinking mode | Model-dependent | Qwen3/QwQ | - |
| Context Window | 128K | 200K | 1M+ | Varies | 32K-128K | 128K |
| Multimodal | Yes | Yes | Yes | Model-dependent | No | No |
| JSON Mode | Yes | No | No | Model-dependent | No | No |
Common Use Cases
Text Generation
Data Extraction
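A sketch of data extraction using OpenAI’s JSON mode, which pairs a prompt that asks for JSON with response_format json_object; the schema and sample text are illustrative, not part of MachinaOs:

```python
import json

# Sketch: structured extraction via JSON mode. With json_object set,
# the model is constrained to emit valid JSON, so the reply parses
# directly; the prompt must still describe the desired keys.
prompt = (
    "Extract the person's name and email from this text. Reply in JSON "
    'with keys "name" and "email". Text: Jane Doe <jane@example.com> '
    "wrote in about the invoice."
)
request = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": prompt}],
    "response_format": {"type": "json_object"},
}

# A well-formed reply then parses without cleanup:
sample_reply = '{"name": "Jane Doe", "email": "jane@example.com"}'
record = json.loads(sample_reply)
```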
Complex Reasoning (with thinking)
Tips
Error Handling
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Check/update API key |
| 429 Rate Limited | Too many requests | Add delay, reduce frequency |
| 500 Server Error | Provider issue | Retry later |
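The 429 and 500 rows both call for retrying; a common pattern is exponential backoff with jitter. A minimal sketch (the transport function is a stand-in, not a MachinaOs API):

```python
import random
import time

# Sketch: retry transient failures (429, 500) with exponential backoff;
# fail fast on 401, which no amount of retrying will fix.

def call_with_retry(send, max_retries=3, base_delay=1.0):
    """send() returns (status, body); retries 429/500 up to max_retries."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status == 401:
            raise PermissionError("Invalid API key - check/update it")
        if status in (429, 500) and attempt < max_retries:
            # Exponential backoff (base, 2x, 4x, ...) plus small jitter
            time.sleep(base_delay * 2 ** attempt + random.random() * 0.1)
            continue
        return status, body

# Simulated transport: rate-limited once, then succeeds.
attempts = iter([(429, ""), (200, "ok")])
status, body = call_with_retry(lambda: next(attempts), base_delay=0.0)
```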