Architecture
A detailed look at how MachinaOs works under the hood. For contributors, system designers, and the curious. For a navigable tour with hyperlinks, see the DeepWiki page.

System Overview
MachinaOs is three loosely coupled tiers talking over a single persistent WebSocket connection:

96 Workflow Nodes
AI, agents, social, Android, documents, Google Workspace, code, proxies, utilities
10 LLM Providers
OpenAI, Anthropic, Gemini, OpenRouter, xAI, DeepSeek, Kimi, Mistral, Groq, Cerebras
15 Specialized Agents
Android, Coding, Web, Task, Social, Travel, Tool, Productivity, Payments, Consumer, Autonomous, Orchestrator, AI Employee, RLM, Claude Code
89 WebSocket Handlers
Push-based updates replace REST polling
49 Built-in Skills
Across 10 categories, DB-backed with SKILL.md defaults
Three Execution Modes
Temporal distributed, Redis parallel, sequential fallback
Execution Engine
Conductor’s Decide Pattern
Workflow orchestration is a single function with fork/join parallelism. Per-run state lives in an `ExecutionContext`; there is no shared global state between concurrent runs. Decide loops serialize per execution via Redis SETNX distributed locks.
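A minimal sketch of the SETNX serialization pattern. The key name, TTL, and `InMemoryRedis` stand-in are illustrative assumptions; a real deployment would issue `SET key value NX EX ttl` against Redis itself:

```python
import time

class InMemoryRedis:
    """Tiny stand-in for the two Redis commands the lock needs."""
    def __init__(self):
        self._data = {}

    def set(self, key, value, nx=False, ex=None):
        # SET key value NX EX ttl: returns success only if the key was absent
        if nx and key in self._data:
            return None
        self._data[key] = (value, time.monotonic() + ex if ex else None)
        return True

    def delete(self, key):
        self._data.pop(key, None)

def try_acquire_decide_lock(r, execution_id, ttl=30):
    """One decide loop per execution: SETNX-style lock keyed by execution id."""
    return bool(r.set(f"decide_lock:{execution_id}", "1", nx=True, ex=ttl))

def release_decide_lock(r, execution_id):
    r.delete(f"decide_lock:{execution_id}")
```

The TTL guards against a crashed worker holding the lock forever; a second decide loop for the same execution simply fails to acquire and backs off.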
Three Execution Modes
Layer Computation via Kahn’s Algorithm
Before execution, the DAG is sorted into layers. Layer 0 is the set of nodes with no dependencies; each subsequent layer depends only on earlier layers. Parallel execution runs each layer with `asyncio.gather()`.
Toolkit sub-tool nodes (such as `androidTool`) are detected and excluded from execution layers; they run only when the parent toolkit invokes them via tool calling.
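The layering step can be sketched as follows (function and variable names are illustrative, not taken from the codebase); the engine would then run `asyncio.gather()` over each layer in turn:

```python
def compute_layers(nodes, edges):
    """Group a DAG into layers via Kahn's algorithm: layer 0 has no
    dependencies, and layer N depends only on layers < N."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for src, dst in edges:
        indegree[dst] += 1
        children[src].append(dst)

    layer = [n for n in nodes if indegree[n] == 0]  # layer 0
    layers = []
    while layer:
        layers.append(layer)
        nxt = []
        for n in layer:
            for c in children[n]:
                indegree[c] -= 1
                if indegree[c] == 0:   # all of c's dependencies satisfied
                    nxt.append(c)
        layer = nxt

    if sum(len(l) for l in layers) != len(nodes):
        raise ValueError("cycle detected: not a DAG")
    return layers
```

For a diamond graph a -> {b, c} -> d this yields three layers, with b and c eligible to run concurrently.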
Result Caching, Recovery, and DLQ
- Prefect-style caching: every node result is stored in Redis/SQLite keyed by `hash_inputs(inputs)`. Re-running an identical node returns the cached result with status `TaskStatus.CACHED`.
- Heartbeat recovery: `RecoverySweeper` scans `executions:active` every 60s; nodes with stale heartbeats (> 5 min) are marked stuck and recovered on next startup.
- Dead Letter Queue: failed nodes (after retry exhaustion) are quarantined with a full input snapshot. Inspect, replay, or purge via the `get_dlq_*` / `replay_dlq_entry` / `purge_dlq` WebSocket handlers.
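A hedged sketch of the input-hash caching idea: the real `hash_inputs` implementation is not shown here, but a deterministic canonical-JSON digest like this is a common approach:

```python
import hashlib
import json

def hash_inputs(inputs: dict) -> str:
    """Deterministic cache key: canonical JSON, then SHA-256 (illustrative)."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode()).hexdigest()

_cache = {}  # stand-in for the Redis/SQLite result store

def run_node(node_fn, inputs):
    """Return a cached result for identical inputs instead of re-executing."""
    key = hash_inputs(inputs)
    if key in _cache:
        return _cache[key], "CACHED"
    result = node_fn(inputs)
    _cache[key] = result
    return result, "COMPLETED"
```

Sorting keys and fixing separators makes the hash independent of dict insertion order, so semantically identical inputs always hit the same cache entry.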
Edge Conditions
Edges carry optional conditions for runtime branching with 20+ operators (eq, neq, gt, lt, contains, exists, matches, in, …). Unmatched branches are marked TaskStatus.SKIPPED.
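A sketch of condition evaluation using a few of the listed operators; the dispatch-table shape and condition schema are assumptions, not the actual implementation:

```python
import re

# Subset of the 20+ operators, as a dispatch table
OPERATORS = {
    "eq":       lambda a, b: a == b,
    "neq":      lambda a, b: a != b,
    "gt":       lambda a, b: a > b,
    "lt":       lambda a, b: a < b,
    "contains": lambda a, b: b in a,
    "exists":   lambda a, _b: a is not None,
    "matches":  lambda a, b: re.search(b, str(a)) is not None,
    "in":       lambda a, b: a in b,
}

def edge_matches(condition, output):
    """Evaluate one edge's condition against the source node's output.
    Unmatched edges would leave the target branch SKIPPED."""
    if condition is None:
        return True  # unconditional edge always fires
    op = OPERATORS[condition["op"]]
    return op(output.get(condition["field"]), condition.get("value"))
```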
Event-Driven Deployment
Deployments are event-driven: each trigger event spawns an independent concurrent execution run. There is no iteration loop. Triggers that did not fire are `_pre_executed` with `{not_triggered: True}` so they never block as event waiters.
Push vs Polling Triggers
The `event_waiter` module powers push triggers: register a Waiter with a filter closure, suspend on `wait_for_event()`, and resume when an external service dispatches a matching event. The backend supports both in-memory (`asyncio.Future`) and Redis Streams modes for Temporal multi-worker deployments.
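The in-memory mode can be sketched with an `asyncio.Future` per waiter; the class and method names below mirror the description, but the details are assumptions:

```python
import asyncio

class EventWaiter:
    """In-memory waiter registry: suspend on a Future until a matching event arrives."""
    def __init__(self):
        self._waiters = []  # list of (filter_fn, future) pairs

    async def wait_for_event(self, filter_fn, timeout=None):
        fut = asyncio.get_running_loop().create_future()
        self._waiters.append((filter_fn, fut))
        return await asyncio.wait_for(fut, timeout)  # suspends the caller

    def dispatch(self, event):
        """Called when an external service delivers an event; wakes matching waiters."""
        for pair in list(self._waiters):
            filter_fn, fut = pair
            if not fut.done() and filter_fn(event):
                fut.set_result(event)
                self._waiters.remove(pair)

async def demo():
    w = EventWaiter()
    waiter = asyncio.create_task(w.wait_for_event(lambda e: e["chat_id"] == 42))
    await asyncio.sleep(0)  # let the waiter task register itself
    w.dispatch({"chat_id": 42, "text": "hi"})
    return await waiter
```

The Redis Streams mode would replace the in-process list with a stream consumer group so any worker can resume a waiter.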
AI Agent System
10 LLM Providers
| Provider | Native SDK Path | LangChain Path | Notes |
|---|---|---|---|
| OpenAI | Yes | Yes | GPT 4.x/5.x + o-series reasoning |
| Anthropic | Yes | Yes | Extended thinking via budget_tokens |
| Gemini | Yes | Yes (fallback) | Native bypasses LangChain Windows hang |
| OpenRouter | Yes | Yes | 200+ models through one API |
| xAI | Yes (shared OpenAIProvider) | Yes | OpenAI-compatible |
| DeepSeek | Yes (shared OpenAIProvider) | Yes | Chat + Reasoner (always-on CoT) |
| Kimi | Yes (shared OpenAIProvider) | Yes | Moonshot K2.5 / K2-thinking |
| Mistral | Yes (shared OpenAIProvider) | Yes | Large / Small / Codestral |
| Groq | No (LangChain only) | Yes | Llama 4, Qwen3, GPT-OSS |
| Cerebras | No (LangChain only) | Yes | Llama, Qwen |
Dual-Path Architecture
Both paths normalize results into a shared `LLMResponse` dataclass across all providers. The LangChain path is used for agent tool-calling because LangGraph's checkpointer, state graph, and tool-execution callback layer have no native equivalent today.
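The document does not list the dataclass fields, so the following is a hypothetical shape, plus an adapter from an OpenAI-compatible payload for illustration:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LLMResponse:
    """Normalized result shape (illustrative fields; the real dataclass may differ)."""
    text: str
    model: str
    input_tokens: int = 0
    output_tokens: int = 0
    reasoning: Optional[str] = None
    tool_calls: list = field(default_factory=list)

def from_openai_style(raw: dict, model: str) -> LLMResponse:
    """Adapter from an OpenAI-compatible chat completion dict (hypothetical)."""
    usage = raw.get("usage", {})
    return LLMResponse(
        text=raw["choices"][0]["message"]["content"],
        model=model,
        input_tokens=usage.get("prompt_tokens", 0),
        output_tokens=usage.get("completion_tokens", 0),
    )
```

One adapter per provider family keeps the rest of the system provider-agnostic: handlers only ever see `LLMResponse`.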
15 Specialized Agents
All specialized agents share the same handle architecture (input-main, input-memory, input-skill, input-tools, input-task) and inherit AI_AGENT_PROPERTIES. They only differ in icon, title, theme color, and default prompt.
- 13 agents route to `handle_chat_agent` (LangGraph loop, shared code path)
- `rlm_agent` routes to `handle_rlm_agent` -> `RLMService` (REPL-based recursive LM)
- `claude_code_agent` routes to `handle_claude_code_agent` -> the Claude Code SDK
Agent Teams Topology
orchestrator_agent and ai_employee have an extra input-teammates handle. Connected agents become delegate_to_<agent_type> tools automatically:
Delegation is fire-and-forget: the child agent is scheduled with `asyncio.create_task()`, the parent continues, and the child broadcasts its own status updates independently. Results can be retrieved via the auto-injected `check_delegated_tasks` tool or consumed by `taskTrigger` nodes elsewhere in the workflow.
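The fire-and-forget shape can be sketched as below; the registry, `_run_child_agent`, and return shapes are illustrative assumptions:

```python
import asyncio

DELEGATED = {}  # task registry the parent (or a polling tool) can inspect

async def _run_child_agent(task_id, prompt):
    # stand-in for the child agent's full LangGraph loop
    await asyncio.sleep(0.01)
    DELEGATED[task_id] = {"status": "done", "result": f"handled: {prompt}"}

def delegate_to(task_id, prompt):
    """Fire-and-forget: record the task, schedule the child, return immediately."""
    DELEGATED[task_id] = {"status": "running", "result": None}
    asyncio.create_task(_run_child_agent(task_id, prompt))
    return {"delegated": task_id}

def check_delegated_tasks():
    """Shape of an auto-injected polling tool (illustrative)."""
    return {tid: entry["status"] for tid, entry in DELEGATED.items()}

async def demo():
    delegate_to("t1", "book a flight")
    running = check_delegated_tasks()["t1"]  # parent continues immediately
    await asyncio.sleep(0.05)                # ...and does other work meanwhile
    return running, DELEGATED["t1"]["status"]
```

The parent never awaits the child directly; it observes progress only through the registry, which is what lets delegation scale without blocking the orchestrator.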
LangGraph StateGraph Flow
Connected tool nodes are wrapped as `StructuredTool` instances with the node's parameter schema.
Memory, Skills, Tokens, and Cost
Markdown-Based Memory
The `simpleMemory` node stores conversation history as editable Markdown in the parameter panel. The AI Agent handler reads it, parses it into LangChain messages, executes, appends the new exchange, trims to a window, and archives removed messages to an optional `InMemoryVectorStore` using HuggingFace `BAAI/bge-small-en-v1.5` embeddings.
Skill System
49 built-in skills across 10 folders under `server/skills/`:
Each skill is defined by a `SKILL.md` file containing YAML frontmatter (name, description, allowed-tools, metadata) and Markdown instructions. The first load seeds the database; after that the database is the source of truth, so users can edit skill instructions in the UI. "Reset to Default" reloads the original `SKILL.md`.
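The frontmatter-plus-instructions layout might be parsed roughly like this (a minimal flat parser for illustration; a real loader would use a YAML library):

```python
def parse_skill_md(text: str):
    """Split a SKILL.md into a frontmatter dict and the Markdown body.
    Handles only flat `key: value` frontmatter lines (illustrative)."""
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()

# Hypothetical skill file for demonstration
sample = """---
name: web-search
description: Search the web and summarize results
allowed-tools: browser, http
---
# Instructions
Always cite sources.
"""
```

On first load, `meta` would seed the database row and `body` would become the editable instruction text.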
The masterSkill node aggregates multiple skills with enable/disable toggles. A split-panel editor shows the skill list on the left and the selected skill’s Markdown on the right. When connected to an agent, the backend expands the skillsConfig parameter into individual skill entries injected into the agent’s system message.
Token Tracking and Compaction
Every AI execution stores a `TokenUsageMetric` row with input/output/cache/reasoning token counts and calculated costs (USD) based on `server/config/pricing.json`. Cumulative state per session lives in `SessionTokenState`.
Compaction threshold priority:
- Per-session `custom_threshold` (user-set)
- Model-aware: 50% of the model's context window (e.g., 500K for Claude Opus 4.6 with 1M context)
- Global `COMPACTION_THRESHOLD` fallback from `.env`
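The priority chain reduces to a small resolver; the session shape and fallback value here are assumptions for illustration:

```python
GLOBAL_COMPACTION_THRESHOLD = 100_000  # .env fallback (illustrative value)

def resolve_compaction_threshold(session: dict, context_window: int = None) -> int:
    """Priority: per-session override > 50% of model context window > global fallback."""
    if session.get("custom_threshold"):          # 1. user-set, per session
        return session["custom_threshold"]
    if context_window:                           # 2. model-aware: half the window
        return context_window // 2
    return GLOBAL_COMPACTION_THRESHOLD           # 3. global .env fallback
```

With a 1M-token context window and no override, this resolves to 500K, matching the example above.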
CompactionService.compact_context() generates a 5-section summary (Task Overview, Current State, Important Discoveries, Next Steps, Context to Preserve) following the Claude Code pattern and replaces the memory content. Anthropic and OpenAI also have native compaction APIs (context_management edits, compact_threshold) that are configured transparently.
Communication Layer
A single persistent WebSocket at `/ws/status` handles all frontend-backend communication. There are 89 WebSocket handlers registered with the `@ws_handler` decorator in `server/routers/websocket.py`, covering: node parameters, tool schemas, node execution, triggers, dead letter queue, deployment, AI operations, API keys, OAuth flows (Claude, Twitter, Google), Android, WhatsApp, Telegram, workflow storage, chat messages, console logs, skills, memory, user settings, pricing, agent teams, and model registry.
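Decorator-based registration plus a dispatch step might look like this sketch; the handler name and payload shapes are invented for illustration:

```python
import asyncio
import json

WS_HANDLERS = {}

def ws_handler(msg_type):
    """Register a coroutine under one message `type` (decorator-based dispatch)."""
    def register(fn):
        WS_HANDLERS[msg_type] = fn
        return fn
    return register

@ws_handler("get_node_parameters")
async def get_node_parameters(payload):
    # a real handler would load the node's parameter schema here
    return {"type": "node_parameters", "node_id": payload["node_id"], "params": {}}

async def dispatch(raw: str):
    """Route one incoming WebSocket frame to its registered handler."""
    msg = json.loads(raw)
    handler = WS_HANDLERS.get(msg["type"])
    if handler is None:
        return {"type": "error", "detail": "unknown type: " + msg["type"]}
    return await handler(msg.get("payload", {}))
```

A single registry keyed by message type is what makes one socket sufficient for all 89 operations: adding a capability means adding one decorated coroutine, not a new endpoint.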
Request/Response Pattern
Broadcast Message Types
Auto-Reconnect and Keepalive
The frontend sends `{"type": "ping"}` every 30 seconds. On disconnect, the WebSocket context schedules a reconnect after 3 seconds, with a 100ms mount delay to avoid the React Strict Mode double-connect in development. The connection is gated on `isAuthenticated`, so logged-out users never connect.
Cache, Persistence, and Security
Cache Fallback Hierarchy
MachinaOs follows the n8n cache pattern with automatic environment-based fallback: `CacheService` in `server/core/cache.py` checks each backend in order. TTL expiration is supported across all three backends. A background `cleanup_expired_cache()` task removes expired SQLite rows.
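A sketch of the fallback pattern under the stated assumptions (three backends checked in order, lazy TTL expiry); only the in-memory backend is shown, and the API shape is hypothetical:

```python
import time

class MemoryCache:
    """Last-resort in-process backend; Redis and SQLite backends would mirror this API."""
    def __init__(self):
        self._store = {}

    def available(self):
        return True  # in-memory is always available

    def set(self, key, value, ttl=None):
        self._store[key] = (value, time.monotonic() + ttl if ttl else None)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and time.monotonic() > expires:
            del self._store[key]  # lazy TTL expiration on read
            return None
        return value

class CacheService:
    """Pick the first available backend at startup (environment-based fallback)."""
    def __init__(self, backends):
        self.backend = next(b for b in backends if b.available())

    def set(self, key, value, ttl=None):
        self.backend.set(key, value, ttl)

    def get(self, key):
        return self.backend.get(key)
```

In practice the list passed in would be ordered Redis first, then SQLite, then memory, so a dev machine without Redis still gets a working cache.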
Encrypted Credentials
API keys and OAuth tokens live in a separate SQLite database (credentials.db) isolated from the main machina.db. Encryption uses Fernet (AES-128-CBC + HMAC-SHA256) with keys derived from a server-scoped config key via PBKDF2HMAC (600K iterations, OWASP 2024 recommendation).
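The derivation step can be sketched with the standard library's `hashlib.pbkdf2_hmac`; the salt handling and config-key value here are placeholders, and the urlsafe-base64 output is the key format that Fernet (from the `cryptography` package) consumes:

```python
import base64
import hashlib

PBKDF2_ITERATIONS = 600_000  # OWASP 2024 recommendation for PBKDF2-HMAC-SHA256

def derive_fernet_key(config_key: str, salt: bytes) -> bytes:
    """Derive a 32-byte key via PBKDF2-HMAC-SHA256 and encode it urlsafe-base64,
    the format Fernet expects. Illustrative sketch only."""
    raw = hashlib.pbkdf2_hmac("sha256", config_key.encode(), salt, PBKDF2_ITERATIONS)
    return base64.urlsafe_b64encode(raw)
```

The derived key would then be passed to `cryptography.fernet.Fernet(key)` to encrypt and decrypt credential rows; the high iteration count makes brute-forcing the server config key expensive.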
Two credential systems live in `credentials.db`:
- API key system (`EncryptedAPIKey` table): provider keys the user enters manually
- OAuth token system (`EncryptedOAuthToken` table): tokens from OAuth flows (Google, Twitter, Claude.ai)
All access goes through `AuthService` only; direct database access is forbidden. Decrypted values are cached in `AuthService` memory dicts and never written to disk or Redis.
For cloud deployments, CREDENTIAL_BACKEND can be switched to keyring (OS-native) or aws (Secrets Manager) via the CredentialBackend abstraction.
Authentication
JWT in HttpOnly cookies following the n8n pattern. Two modes:
- `AUTH_MODE=single`: first user becomes owner, registration disabled after
- `AUTH_MODE=multi`: open registration for cloud deployments
`VITE_AUTH_ENABLED=false` bypasses login entirely for local development.
Related
Node Catalog
Browse all 96 workflow nodes by category
AI Models
10 LLM providers with native SDK and LangChain paths
AI Agents
15 specialized agents with teams and delegation
GitHub
Source code and in-repo `docs-internal/` deep dives